Credit Scoring: A Comparison of Machine Learning Models and Their Modifications

Jia Chong Ong; Lai Soon Lee

doi:10.58915/amci.v14i1.1362

Authors

Jia Chong Ong Universiti Putra Malaysia https://orcid.org/0009-0002-1882-2320
Lai Soon Lee Universiti Putra Malaysia

Keywords:

classification, comparative analysis, credit scoring, machine learning, modification techniques

Abstract

This study compares the performance of various machine learning models and their modifications across four benchmark credit scoring datasets to address the absence of comprehensive comparative analyses on multiple combinations of modifications in the credit scoring domain. Models studied include Logistic Regression (LR), Linear Discriminant Analysis (LDA), Support Vector Machine (SVM), Random Forest (RF), and Multilayer Perceptron (MLP). Starting from these base models, modifications encompassing feature scaling, resampling, feature selection, and hyperparameter tuning are added phase by phase to the models from the preceding phase. In each phase, the optimal method for that modification is determined based on accuracy, F1 score, precision, recall, area under the Receiver Operating Characteristic curve, fitting time, and prediction time. Findings reveal LR’s suitability for small datasets, while RF and MLP excel in larger ones. Standardization and Min‐Max Scaling are generally effective, with Max‐Abs Scaling enhancing RF. Synthetic Minority Oversampling Technique proves optimal for imbalanced datasets, but no resampling is necessary for small balanced datasets. Analysis of Variance and Mutual Information perform similarly without tuning, while Grid Search slightly outperforms Random Search when runtimes are disregarded. The study concludes by presenting optimal models and alternatives.
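
The three feature scaling methods named in the abstract have standard textbook definitions. The following is a minimal pure-Python sketch of those definitions for a single feature column, not the authors' implementation. The function names and the sample values are hypothetical, and the use of the population standard deviation is an assumption (it matches the convention of scikit-learn's `StandardScaler`).

```python
from statistics import fmean, pstdev


def standardize(column):
    """Standardization: subtract the mean, divide by the population std."""
    mu = fmean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def min_max_scale(column):
    """Min-Max Scaling: map the column linearly onto [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def max_abs_scale(column):
    """Max-Abs Scaling: divide by the largest absolute value, giving [-1, 1]."""
    peak = max(abs(x) for x in column)
    if peak == 0:
        return [0.0 for _ in column]
    return [x / peak for x in column]


# Made-up feature values, for illustration only.
feature = [-2.0, 0.0, 2.0, 4.0]
scaled_standard = standardize(feature)
scaled_min_max = min_max_scale(feature)
scaled_max_abs = max_abs_scale(feature)
```

Max-Abs Scaling preserves zeros and signs because it does not shift the data, whereas Standardization and Min-Max Scaling both re-center or re-anchor the column.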