Install LightGBM and CatBoost on Ubuntu 22.04

Introduction

When someone starts with machine learning, they usually begin by building simple models such as logistic regression, naive Bayes, or linear regression. Those alone are already enough for most use cases, as their simplicity keeps you productive and their accuracy is adequate. At the enterprise level, however, higher accuracy can matter for many reasons. Gradient Boosting Machines (GBMs) are a family of algorithms that often outperform the aforementioned methods without being too complex to use. Of course, before we build a model with them (e.g. through tidymodels), we have to install them.

This article gathers that installation information in one place.

Installation Guides	Source
LightGBM	Link
CatBoost	Link
XGBoost	Link


library(highcharter)
library(gtrendsR)
library(dplyr)

# Fetch Google Trends web-search interest for the three libraries
googleTrendsData <- gtrendsR::gtrends(keyword = c("LightGBM", "CatBoost", "XGBoost"), gprop = "web", onlyInterest = TRUE)

# Average the search interest per keyword and year
interestOverTime <- googleTrendsData[["interest_over_time"]] %>%
  dplyr::mutate(date = lubridate::ymd(date)) %>%
  dplyr::mutate(Year = lubridate::year(date)) %>%
  select(Year, keyword, hits) %>%
  group_by(Year, keyword) %>%
  summarise(Average = round(mean(hits), digits = 1), .groups = "drop")

highchart() %>%
    hc_chart(type = "line") %>%
    hc_title(text = "Search Interest of various GBMs") %>%
    hc_xAxis(categories = unique(interestOverTime$Year)) %>%
    hc_yAxis(title = list(text = "Average")) %>%
    hc_add_series(
        name = "XGBoost",
        data = interestOverTime %>% filter(keyword == "XGBoost") %>% pull(Average)
    ) %>%
    hc_add_series(
        name = "CatBoost",
        data = interestOverTime %>% filter(keyword == "CatBoost") %>% pull(Average)
    ) %>%
    hc_add_series(
        name = "LightGBM",
        data = interestOverTime %>% filter(keyword == "LightGBM") %>% pull(Average)
    )

LightGBM

Option 1. Install R Package

If you are reading this blog, you are most likely using R too. The easiest way to install LightGBM is through the corresponding R package:

R code

start_time_lightgbm <- Sys.time()
install.packages("lightgbm", repos = "https://cran.r-project.org")
end_time_lightgbm <- Sys.time()

Option 2. CMake

The LightGBM documentation describes this method of installation.

Terminal

sudo apt install cmake
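
Besides cmake, the build commands in this option call git and make, and CMake needs a C++ compiler. The following is a small pre-flight sketch, not taken from the LightGBM documentation, that reports which of these tools are on the PATH. Checking for g++ specifically is an assumption; any C++ compiler that CMake can find should do.

```shell
# Pre-flight check: are the tools used by the build steps available?
# Assumption: g++ is the C++ compiler; substitute your own if it differs.
checked=0
missing_tools=""
for tool in git cmake make g++; do
    checked=$((checked + 1))
    if command -v "$tool" >/dev/null 2>&1; then
        echo "ok: $tool"
    else
        echo "missing: $tool"
        missing_tools="$missing_tools $tool"
    fi
done

if [ -n "$missing_tools" ]; then
    echo "Install these before building:$missing_tools"
fi
```

On Ubuntu, the build-essential package provides both make and g++.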

Terminal

git clone --recursive https://github.com/microsoft/LightGBM
cd LightGBM
mkdir build
cd build
cmake ..
make -j4
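
Once make finishes, it can help to confirm that the build actually produced something. This is a minimal sketch, assuming the commands above were run as written (so the current directory is LightGBM/build) and that the build places the lightgbm executable and lib_lightgbm.so in the repository root, as the LightGBM installation guide describes for Linux. Verify the location for your version. LIGHTGBM_ROOT is a hypothetical variable introduced here only to override the path.

```shell
# Assumption: we are inside LightGBM/build, so the repository root is "..".
# LIGHTGBM_ROOT is a hypothetical override, not something LightGBM defines.
repo_root="${LIGHTGBM_ROOT:-..}"
found=0
for artifact in lightgbm lib_lightgbm.so; do
    if [ -e "$repo_root/$artifact" ]; then
        echo "found: $repo_root/$artifact"
        found=$((found + 1))
    else
        echo "not found: $repo_root/$artifact"
    fi
done
echo "$found of 2 expected artifacts present"
```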

CatBoost

The R package is installed from their releases.

R code

install.packages("devtools")

In my case, the devtools installation failed with an error status. According to the error message, I had to install the system packages libharfbuzz-dev and libfribidi-dev first. After that, the devtools installation completed without errors.
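
The fix described above can be sketched in the terminal as follows. The script only checks whether the two libraries named in the error message are present and prints the install command for any that are missing. It does not install anything itself. Your own devtools error may name different packages.

```shell
# Libraries named in the devtools error message described in the text.
required="libharfbuzz-dev libfribidi-dev"
missing=""
for pkg in $required; do
    # dpkg -s exits non-zero when the package is not installed
    if ! dpkg -s "$pkg" >/dev/null 2>&1; then
        missing="$missing $pkg"
    fi
done

if [ -n "$missing" ]; then
    echo "Missing:$missing"
    echo "Install with: sudo apt install$missing"
else
    echo "All required libraries are already installed."
fi
```

After installing the missing packages, rerun install.packages("devtools") in R.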

R code

start_time_catboost <- Sys.time()
devtools::install_url("https://github.com/catboost/catboost/releases/download/v1.1.1/catboost-R-Linux-1.1.1.tgz", INSTALL_opts = c("--no-multiarch", "--no-test-load"))
end_time_catboost <- Sys.time()

XGBoost

R code

start_time_xgboost <- Sys.time()
install.packages("xgboost")
end_time_xgboost <- Sys.time()

Summary

ML Model	Method	Installation time
LightGBM	R package	7.79 min.
CatBoost	R package (w/o devtools)	2.1 min.
XGBoost	R package	6.16 min.

Citation

BibTeX citation:

@online{2022,
  author = {stesiam},
  title = {Install {LightGBM} and {CatBoost} on {Ubuntu} 22.04},
  date = {2022-11-13},
  url = {https://stesiam.com/posts/install-gbm-in-ubuntu/},
  langid = {en}
}

For attribution, please cite this work as:

stesiam. (2022, November 13). Install LightGBM and CatBoost on Ubuntu 22.04. Retrieved from https://stesiam.com/posts/install-gbm-in-ubuntu/