TY - JOUR
T1 - Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem
AU - Huang, Yueh Min
AU - Hung, Chun Min
AU - Jiau, Hewijin Christine
N1 - Funding Information: The authors would like to thank the Tainan Business Bank at the Republic of China for financially supporting this study.
PY - 2006/9
Y1 - 2006/9
N2 - Most of the real-world data that are analyzed using nonlinear classification techniques are imbalanced in terms of the proportion of examples available for each class. This problem of imbalanced class distributions can lead the algorithms to learn overly complex models that overfit the data and have little relevance. Our study analyzes different classification algorithms that were employed to predict the creditworthiness of a bank's customers based on checking account information. A series of experiments were conducted to test the different techniques. The objective is to determine a range of credit scores that could be implemented by a manager for risk management. As a result, by realizing the concept of classification with equal quantities, the implicit knowledge can be discovered successfully. A data-cleaning strategy for handling such a real case with imbalanced data is then proposed.
AB - Most of the real-world data that are analyzed using nonlinear classification techniques are imbalanced in terms of the proportion of examples available for each class. This problem of imbalanced class distributions can lead the algorithms to learn overly complex models that overfit the data and have little relevance. Our study analyzes different classification algorithms that were employed to predict the creditworthiness of a bank's customers based on checking account information. A series of experiments were conducted to test the different techniques. The objective is to determine a range of credit scores that could be implemented by a manager for risk management. As a result, by realizing the concept of classification with equal quantities, the implicit knowledge can be discovered successfully. A data-cleaning strategy for handling such a real case with imbalanced data is then proposed.
UR - http://www.scopus.com/inward/record.url?scp=33646142788&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33646142788&partnerID=8YFLogxK
U2 - 10.1016/j.nonrwa.2005.04.006
DO - 10.1016/j.nonrwa.2005.04.006
M3 - Article
AN - SCOPUS:33646142788
SN - 1468-1218
VL - 7
SP - 720
EP - 747
JO - Nonlinear Analysis: Real World Applications
JF - Nonlinear Analysis: Real World Applications
IS - 4
ER -