TY - JOUR
T1 - Breast cancer–detection system using PCA, multilayer perceptron, transfer learning, and support vector machine
AU - Chiu, Huan Jung
AU - Li, Tzuu Hseng S.
AU - Kuo, Ping Huan
N1 - Funding Information:
This work was supported by the Ministry of Science and Technology, Taiwan, under Grant MOST 106-2218-E-153-001-MY3 and Grant MOST 106-2221-E-006-009-MY3.
Publisher Copyright:
© 2020 Institute of Electrical and Electronics Engineers Inc. All rights reserved.
PY - 2020
Y1 - 2020
N2 - This study proposed a new processing method to predict breast cancer from nine individual attributes, including age, body mass index, glucose, insulin, and a homeostasis model assessment. First, principal component analysis (PCA) was used to identify the most informative parts of the data and reduce its dimensionality; the cumulative proportion of the top five principal components was 99.89%. A multilayer perceptron (MLP) network was then used to extract features from the data, with a network structure designed to explore how the data behaved as the dimensions increased and then decreased: the model first expanded the data into a higher-dimensional space (explore) and then compressed it into a lower-dimensional one (develop). After training, the model could separate out the representative attributes and values, and the extracted features were then classified with support vector machines through transfer learning. To verify the proposed method, k-fold cross-validation was run 50 times and the results were averaged. With 10-fold cross-validation on the dataset of Manuel Gomes from the University Hospital Centre of Coimbra, the method achieved an accuracy of 86.97%. The results indicate that the proposed sequence of processes and methods can effectively detect breast cancer. Furthermore, the data processed using PCA alone and the features extracted by PCA combined with the trained MLP were analyzed, and the differences between them were compared using t-distributed stochastic neighbor embedding (t-SNE) visualizations.
UR - http://www.scopus.com/inward/record.url?scp=85102410330&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102410330&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2020.3036912
DO - 10.1109/ACCESS.2020.3036912
M3 - Article
AN - SCOPUS:85102410330
SN - 2169-3536
VL - 8
SP - 204309
EP - 204324
JO - IEEE Access
JF - IEEE Access
ER -