Application of Gradient Boosting Algorithm In Statistical Modelling

Hamid A Adamu; Murtala Muhammad; Abdullahi Mohammed Jingi

Research Article Open Access

Abstract

Gradient boosting (GB) is a machine learning technique for regression that produces more accurate prediction models in the form of an ensemble of weak prediction models. It is an iterative algorithm that combines simple parameterized functions with weak performance (high prediction error) to produce a highly accurate prediction by minimizing the errors [1]. This paper investigates the application of the gradient boosting algorithm to Generalised Linear Models (GLM) and Generalised Additive Models (GAM) to produce better predictions using the Munich rental data. In particular, it compares the predictive performance of the classical GLM and GAM with that of their boosted counterparts. In boosting, choosing the optimum number of boosting iterations is highly recommended to avoid overfitting, and it plays an important role when fitting the model. We therefore apply the Akaike Information Criterion (AIC) and cross-validation (CV) techniques to determine the number of boosting iterations that gives the optimum prediction. The results obtained are then compared to investigate which algorithm is more accurate. We note that, by default, gamboost (boosted GAM) fits models using smooth base-learners (bbs). We also note that the coefficients of the fitted model are in matrix form if smooth base-learners are used, whereas they are just linear if linear base-learners are used.
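The iterative scheme described in the abstract can be illustrated with a minimal sketch of component-wise L2 boosting using linear base-learners. This is not the authors' implementation and does not use the Munich rental data: the data are synthetic, the function names `componentwise_l2_boost` and `predict` and the step length `nu` are assumptions made for illustration, and the stopping iteration is chosen by a simple hold-out validation error as a stand-in for the AIC and CV procedures used in the paper.

```python
import numpy as np

def componentwise_l2_boost(X, y, n_iter=200, nu=0.1):
    """Component-wise L2 boosting with one-variable linear base-learners.

    Returns the offset, the column means used for centring, and the
    coefficient vector after every boosting iteration.
    """
    n, p = X.shape
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                    # centred columns, so base-learners need no intercept
    intercept = y.mean()               # offset: the best constant fit
    col_ss = (Xc ** 2).sum(axis=0)     # sum of squares of each column
    coef = np.zeros(p)
    path = np.zeros((n_iter, p))
    fit = np.full(n, intercept)
    for m in range(n_iter):
        resid = y - fit                # negative gradient of the squared-error loss
        beta = Xc.T @ resid / col_ss   # least-squares slope of each single-column learner
        sse = ((resid[:, None] - Xc * beta) ** 2).sum(axis=0)
        j = int(np.argmin(sse))        # base-learner that fits the residuals best
        coef[j] += nu * beta[j]        # take a small step towards that learner
        fit += nu * beta[j] * Xc[:, j]
        path[m] = coef
    return intercept, x_mean, path

def predict(intercept, x_mean, coef, X):
    """Prediction from the offset and the accumulated linear coefficients."""
    return intercept + (X - x_mean) @ coef

# Synthetic data (assumption): only columns 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + rng.normal(scale=1.0, size=300)
X_tr, y_tr, X_va, y_va = X[:200], y[:200], X[200:], y[200:]

intercept, x_mean, path = componentwise_l2_boost(X_tr, y_tr)

# Hold-out error after each iteration; the minimiser is the chosen stopping iteration.
val_mse = np.array([
    np.mean((y_va - predict(intercept, x_mean, c, X_va)) ** 2) for c in path
])
best_m = int(np.argmin(val_mse)) + 1
print("selected number of boosting iterations:", best_m)
```

Because each iteration updates only the best-fitting base-learner by a small step, stopping early both limits overfitting and leaves the coefficients of uninformative covariates at or near zero, which is why the choice of stopping iteration matters for prediction.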

