Credit Risk Prediction Model Using Feature Engineering and Machine Learning Techniques

Chonlada Muangthanang
Surasak Mungsing
Nivet Chirawichitcha

Abstract

Credit scoring is a crucial step in the risk management process of the financial industry and commercial banks. The objective of this research is to develop a credit risk prediction model using feature engineering and machine learning techniques. The algorithms were tested on a peer-to-peer (P2P) lending dataset, and performance was measured by classification accuracy. In the experiments, the XGB algorithm achieved the highest classification accuracy, 88.94%, outperforming the other classifiers. The proposed framework, which combines feature engineering, feature selection, and machine learning techniques, is therefore suitable and effective for credit scoring problems.
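
The three stages named in the abstract (feature engineering, feature selection, classification scored by accuracy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the debt-to-income feature and the correlation-based selection are hypothetical choices, and a one-feature threshold rule stands in for the XGB classifier. It does not reproduce the authors' dataset, features, or models.

```python
import random

def make_loans(n, seed=0):
    """Synthetic loans (hypothetical data, not the P2P dataset from the paper)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        income = rng.uniform(20_000, 120_000)
        debt = rng.uniform(1_000, 60_000)
        noise = rng.random()  # feature unrelated to the label
        # Assumed ground truth: default when debt exceeds half of income,
        # with roughly 10% of the labels flipped at random.
        label = int(debt / income > 0.5)
        if rng.random() < 0.1:
            label = 1 - label
        rows.append(({"income": income, "debt": debt, "noise": noise}, label))
    return rows

def engineer(features):
    """Feature engineering: add a debt-to-income ratio (an assumed feature)."""
    out = dict(features)
    out["dti"] = features["debt"] / features["income"]
    return out

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def select_features(X, y, k):
    """Feature selection: keep the k features most correlated with the label."""
    scores = {name: abs(pearson([x[name] for x in X], y)) for name in X[0]}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def accuracy(y_true, y_pred):
    """Classification accuracy: fraction of predictions that match the labels."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)

def fit_stump(X, y, names):
    """Stand-in classifier: the one-feature threshold rule with the best training accuracy."""
    best = None
    for name in names:
        values = sorted(x[name] for x in X)
        step = max(1, len(values) // 50)  # coarse grid of candidate thresholds
        for threshold in values[::step]:
            for sign in (1, -1):
                preds = [int(sign * (x[name] - threshold) > 0) for x in X]
                acc = accuracy(y, preds)
                if best is None or acc > best[0]:
                    best = (acc, name, threshold, sign)
    _, name, threshold, sign = best
    return lambda x: int(sign * (x[name] - threshold) > 0)

data = make_loans(600)
X = [engineer(features) for features, _ in data]
y = [label for _, label in data]

# Hold out the last 200 rows for testing.
X_train, y_train = X[:400], y[:400]
X_test, y_test = X[400:], y[400:]

selected = select_features(X_train, y_train, k=2)
model = fit_stump(X_train, y_train, selected)
test_accuracy = accuracy(y_test, [model(x) for x in X_test])
print("selected features:", selected)
print("test accuracy:", round(test_accuracy, 3))
```

Features are selected and the classifier is fitted on the training rows only, so the reported accuracy comes from loans the model has not seen. In practice the threshold rule would be replaced by the classifiers under comparison.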

Article Details

How to Cite
Muangthanang, C., Mungsing, S., & Chirawichitcha, N. (2024). Credit Risk Prediction Model Using Feature Engineering and Machine Learning Techniques. INTERNATIONAL SCIENTIFIC JOURNAL OF ENGINEERING AND TECHNOLOGY (ISJET), 8(1), 19–26. Retrieved from https://ph02.tci-thaijo.org/index.php/isjet/article/view/249949
Section
Research Article
