A fuzzy-based data transformation for feature extraction to increase classification performance with small medical data sets

Der Chiang Li, Chiao Wen Liu, Susan C. Hu

Research output: Article, peer-reviewed

128 citations (Scopus)

Abstract

Objective: Medical data sets are usually small and have very high dimensionality. Too many attributes make the analysis less efficient without necessarily increasing accuracy, while too little data reduces modeling stability. Consequently, the main objective of this study is to extract the optimal subset of features to increase analytical performance when the data set is small.

Methods: This paper proposes a fuzzy-based non-linear transformation method that extends the classification-related information in the original attribute values of a small data set. Principal component analysis (PCA) is then applied to the transformed data set to extract the optimal subset of features. Finally, the transformed data with these optimal features are used as the input to a learning tool, a support vector machine (SVM). Six medical data sets are employed to illustrate the approach: Pima Indians' diabetes, Wisconsin diagnostic breast cancer, Parkinson disease, echocardiogram, BUPA liver disorders, and bladder cancer cases in Taiwan.

Results: This research uses the t-test to evaluate classification accuracy on a single data set, and the Friedman test to show that the proposed method is better than other methods across the multiple data sets. The experimental results indicate that the proposed method has better classification performance than either PCA or kernel principal component analysis (KPCA) when the data set is small, and suggest creating new purpose-related information to improve analysis performance.

Conclusion: This paper has shown that feature extraction is important as a function of feature selection for efficient data analysis. When the data set is small, using the fuzzy-based transformation method presented in this work to increase the information available produces better results than the PCA and KPCA approaches.
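The transformation step described in the Methods can be illustrated with a minimal sketch. The abstract does not specify the membership function, so everything here is an assumption made for illustration: the triangular shape, the `margin` parameter that widens each class range, and the function names are hypothetical and are not taken from the paper. The sketch only shows the general idea of replacing each attribute value with one membership value per class, which enlarges the feature set before PCA and an SVM are applied.

```python
from statistics import mean

def triangular_membership(x, low, peak, high):
    """Degree in [0, 1] to which x belongs to the triangle (low, peak, high)."""
    if x <= low or x >= high:
        # Outside the support; a degenerate triangle still matches its peak.
        return 1.0 if x == peak else 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)

def fit_class_triangles(rows, labels, margin=0.5):
    """Build one (low, peak, high) triangle per class and attribute.

    Assumption: the peak is the class mean and the support is the class
    range widened by `margin` times its spread (a hypothetical choice).
    """
    triangles = {}
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        per_attr = []
        for column in zip(*members):
            spread = max(column) - min(column)
            per_attr.append((min(column) - margin * spread,
                             mean(column),
                             max(column) + margin * spread))
        triangles[cls] = per_attr
    return triangles

def fuzzy_transform(row, triangles):
    """Map each attribute value to one membership value per class."""
    return [triangular_membership(x, *triangles[cls][j])
            for j, x in enumerate(row)
            for cls in triangles]

# Toy data: two attributes, two classes (made up for illustration only).
rows = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0],
        [6.0, 20.0], [7.0, 22.0], [8.0, 21.0]]
labels = [0, 0, 0, 1, 1, 1]

triangles = fit_class_triangles(rows, labels)
transformed = [fuzzy_transform(r, triangles) for r in rows]
print(fuzzy_transform([2.0, 12.0], triangles))  # [1.0, 0.0, 0.5, 0.0]
```

Each sample with two attributes becomes a vector of four membership values (attributes times classes). In the pipeline the abstract describes, a transformed matrix like `transformed` would then be passed to PCA for feature extraction and to an SVM for classification; those two steps are omitted here.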

Original language: English
Pages (from-to): 45-52
Number of pages: 8
Journal: Artificial Intelligence in Medicine
Volume: 52
Issue number: 1
Publication status: Published - May 2011

All Science Journal Classification (ASJC) codes

  • Medicine (miscellaneous)
  • Artificial Intelligence
