TY - GEN
T1 - Data classification with a generalized Gaussian components based density estimation algorithm
AU - Hsieh, Chih Hung
AU - Chang, Darby Tien Hao
AU - Oyang, Yen Jen
PY - 2009/11/18
Y1 - 2009/11/18
AB - Data classification is an intensively studied machine learning problem, and data classification algorithms fall into two major categories: the logic-based and the kernel-based. Logic-based classifiers, such as the decision tree and the rule-based classifier, offer the advantage of presenting a good summary of the distinctive characteristics of different classes of data. On the other hand, kernel-based classifiers, such as the neural network and the support vector machine (SVM), typically deliver higher prediction accuracy than logic-based classifiers. However, the user of a kernel-based classifier normally cannot obtain an overall picture of the distribution of the data set. For some applications, such an overall picture provides valuable insights into the distinctive characteristics of different classes of data and is therefore highly desirable. In this article, aiming to close the gap between the logic-based and the kernel-based classifiers, we propose a novel approach to density estimation based on a mixture model composed of a limited number of generalized Gaussian components. One favorable feature of the classifier constructed with the proposed approach is that the user can easily obtain an overall picture of the distributions of the data set by examining the eigenvectors and eigenvalues of the covariance matrices associated with the generalized Gaussian components. Experimental results show that the classifier constructed with the proposed approach delivers superior prediction accuracy in comparison with conventional logic-based classifiers and the EM (Expectation-Maximization) based classifier. Although it cannot match the prediction accuracy delivered by the SVM, the proposed classifier enjoys one major advantage in providing the user with an overall picture of the underlying distributions.
UR - http://www.scopus.com/inward/record.url?scp=70449337166&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70449337166&partnerID=8YFLogxK
DO - 10.1109/IJCNN.2009.5179000
M3 - Conference contribution
AN - SCOPUS:70449337166
SN - 9781424435531
T3 - Proceedings of the International Joint Conference on Neural Networks
SP - 1259
EP - 1266
BT - 2009 International Joint Conference on Neural Networks, IJCNN 2009
T2 - 2009 International Joint Conference on Neural Networks, IJCNN 2009
Y2 - 14 June 2009 through 19 June 2009
ER -