TY - JOUR
T1 - Kernel mixture model for probability density estimation in Bayesian classifiers
AU - Zhang, Wenyu
AU - Zhang, Zhenjiang
AU - Chao, Han-Chieh
AU - Tseng, Fan-Hsun
N1 - Funding Information:
Acknowledgements: This work is supported by the National Natural Science Foundation of China under Grant 61772064, the Academic Discipline and Post-Graduate Education Project of the Beijing Municipal Commission of Education, and the Fundamental Research Funds for the Central Universities under Grant 2017YJS026. The authors also thank the anonymous reviewers for their valuable comments and suggestions for improving the quality of this paper.
Publisher Copyright:
© 2018, The Author(s).
PY - 2018/5/1
Y1 - 2018/5/1
N2 - Estimating reliable class-conditional probabilities is a prerequisite for implementing Bayesian classifiers, and how to estimate probability density functions (PDFs) is also a fundamental problem for other probabilistic induction algorithms. The finite mixture model (FMM) can represent arbitrarily complex PDFs by using a mixture of multimodal distributions, but it assumes that the mixture components follow a given distribution, which may not hold for real-world data. This paper presents a non-parametric kernel mixture model (KMM) based probability density estimation approach, in which the data sample of a class is assumed to be drawn from several unknown, independent hidden subclasses. Unlike traditional FMM schemes, we simply use the k-means clustering algorithm to partition the data sample into several independent components, and the regional density diversities of the components are combined using Bayes' theorem. On the basis of the proposed kernel mixture model, we present a three-step Bayesian classifier comprising partitioning, structure learning, and PDF estimation. Experimental results show that KMM improves the quality of estimated PDFs over the conventional kernel density estimation (KDE) method, and that KMM-based Bayesian classifiers outperform existing Gaussian, GMM, and KDE-based Bayesian classifiers.
AB - Estimating reliable class-conditional probabilities is a prerequisite for implementing Bayesian classifiers, and how to estimate probability density functions (PDFs) is also a fundamental problem for other probabilistic induction algorithms. The finite mixture model (FMM) can represent arbitrarily complex PDFs by using a mixture of multimodal distributions, but it assumes that the mixture components follow a given distribution, which may not hold for real-world data. This paper presents a non-parametric kernel mixture model (KMM) based probability density estimation approach, in which the data sample of a class is assumed to be drawn from several unknown, independent hidden subclasses. Unlike traditional FMM schemes, we simply use the k-means clustering algorithm to partition the data sample into several independent components, and the regional density diversities of the components are combined using Bayes' theorem. On the basis of the proposed kernel mixture model, we present a three-step Bayesian classifier comprising partitioning, structure learning, and PDF estimation. Experimental results show that KMM improves the quality of estimated PDFs over the conventional kernel density estimation (KDE) method, and that KMM-based Bayesian classifiers outperform existing Gaussian, GMM, and KDE-based Bayesian classifiers.
UR - http://www.scopus.com/inward/record.url?scp=85042069486&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85042069486&partnerID=8YFLogxK
U2 - 10.1007/s10618-018-0550-5
DO - 10.1007/s10618-018-0550-5
M3 - Article
AN - SCOPUS:85042069486
SN - 1384-5810
VL - 32
SP - 675
EP - 707
JO - Data Mining and Knowledge Discovery
JF - Data Mining and Knowledge Discovery
IS - 3
ER -