Prediction on Loan Defaults: Tree-based Approach

Authors

  • Balqis Adzman, Level 10, Tower RHB Centre, Jalan Tun Razak, 50400 Kuala Lumpur, Malaysia.
  • Sayang Mohd Deni, School of Mathematical Sciences, College of Computing, Informatics and Mathematics, Universiti Teknologi MARA (UiTM) Shah Alam, 40450 Selangor, Malaysia.
  • Mohamad Ismeth Khan Azhar Suhaimi, School of Mathematical Sciences, College of Computing, Informatics and Mathematics, Universiti Teknologi MARA (UiTM) Shah Alam, 40450 Selangor, Malaysia.

Abstract

Financial institutions have been exploring machine learning approaches because of their exceptional performance and widespread exposure, especially for predicting the repayment ability of their customers. The ability of machine learning methods to handle large and complicated data structures makes them favourable, as financial data is often very large and complex in nature. Thus, this study adopts two tree-based machine learning approaches to predict loan defaulters, namely random forest and extreme gradient boosting (XGBoost). Because these approaches are sensitive to imbalanced datasets, the class imbalance was addressed before the models were fitted. The performance of both approaches was assessed by computing the accuracy, precision, recall, F1 score, ROC curve and AUC. XGBoost outperformed the traditional machine learning model, random forest, with an accuracy of 61.77%, and it generally required less computation time. It also reported higher values for all the assessment metrics used. In addition, this study focuses on the customers' demographic information and found it useful in predicting their repayment ability, especially the customer's length of service, education level and age.
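
The assessment metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 0.5 decision threshold and the toy labels and scores below are assumptions made purely for illustration, and the label 1 is assumed to mark a defaulter. On an imbalanced dataset, accuracy alone can look good for a model that rarely flags defaulters, which is why precision, recall, F1 score and AUC are usually reported alongside it.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = defaulter)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true labels and predicted default probabilities.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.1, 0.3, 0.5]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics)
print("AUC:", auc)
```

Precision, recall and F1 depend on the chosen threshold, whereas AUC summarises the ranking quality of the scores across all thresholds.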

Keywords:

Loan Defaults Prediction, XGBoost, Random Forest, Imbalanced Dataset, Machine Learning Model

Published

2023-07-31

How to Cite

Balqis Adzman, Sayang Mohd Deni, & Mohamad Ismeth Khan Azhar Suhaimi. (2023). Prediction on Loan Defaults: Tree-based Approach. Applied Mathematics and Computational Intelligence (AMCI), 12(2), 36–47. Retrieved from https://ejournal.unimap.edu.my/index.php/amci/article/view/236