The generalized Dirichlet distribution has been shown to be a more appropriate prior for naïve Bayesian classifiers than the Dirichlet distribution, because it relaxes both the negative-correlation and the equal-confidence requirements of the latter. Previous research did not take into account the impact of individual attributes on classification accuracy, and therefore assumed that all attributes follow the same generalized Dirichlet prior. In this study, the selective naïve Bayes mechanism is employed to choose and rank attributes, and two methods are then proposed to search for the best prior of each individual attribute according to the attribute ranks. The experimental results on 18 data sets show that the best approach is to use selective naïve Bayes to filter and rank attributes while all of them have Dirichlet priors with Laplace's estimate. After the ranks of the chosen attributes are determined, individual setting is performed to search for the best noninformative generalized Dirichlet prior for each attribute. Selective naïve Bayes is also compared with two representative filters for feature selection, and the experimental results show that it has the best performance.
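The first stage described above (filtering and ranking attributes with selective naïve Bayes under Laplace's estimate) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`train_nb`, `predict_nb`, `accuracy`, `selective_nb`), the greedy forward search scored by accuracy on a validation set, and the toy data are all assumptions made for illustration. The second stage, the per-attribute search for a generalized Dirichlet prior, is not covered here.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels, attrs, alpha=1.0):
    """Count class and attribute-value frequencies for the chosen attributes.

    alpha=1.0 corresponds to Laplace's estimate (a symmetric Dirichlet prior).
    """
    classes = sorted(set(labels))
    class_counts = Counter(labels)
    counts = defaultdict(Counter)   # counts[(attr, cls)][value] -> frequency
    values = defaultdict(set)       # values[attr] -> distinct values seen
    for row, y in zip(rows, labels):
        for a in attrs:
            counts[(a, y)][row[a]] += 1
            values[a].add(row[a])
    return classes, class_counts, counts, values, alpha


def predict_nb(model, row, attrs):
    """Return the class with the highest smoothed log-posterior."""
    classes, class_counts, counts, values, alpha = model
    n = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c in classes:
        score = math.log((class_counts[c] + alpha) / (n + alpha * len(classes)))
        for a in attrs:
            k = len(values[a])
            num = counts[(a, c)][row[a]] + alpha
            score += math.log(num / (class_counts[c] + alpha * k))
        if score > best_score:
            best, best_score = c, score
    return best


def accuracy(train_rows, train_labels, val_rows, val_labels, attrs):
    """Validation accuracy of naive Bayes restricted to `attrs`."""
    model = train_nb(train_rows, train_labels, attrs)
    hits = sum(predict_nb(model, r, attrs) == y
               for r, y in zip(val_rows, val_labels))
    return hits / len(val_labels)


def selective_nb(train_rows, train_labels, val_rows, val_labels, n_attrs):
    """Greedy forward selection (assumed variant).

    Attributes are added one at a time while validation accuracy strictly
    improves; the order of insertion serves as the attribute ranking.
    """
    selected, remaining = [], list(range(n_attrs))
    best_acc = accuracy(train_rows, train_labels, val_rows, val_labels, [])
    while remaining:
        scored = [(accuracy(train_rows, train_labels, val_rows, val_labels,
                            selected + [a]), a) for a in remaining]
        # Ties are broken in favour of the lowest attribute index.
        acc, a = max(scored, key=lambda t: (t[0], -t[1]))
        if acc <= best_acc:
            break
        selected.append(a)
        remaining.remove(a)
        best_acc = acc
    return selected, best_acc


# Toy data (invented): attribute 0 determines the class, attribute 1 is noise.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)] * 3
labels = ["a" if r[0] == 0 else "b" for r in rows]
selected, best_acc = selective_nb(rows, labels, rows, labels, n_attrs=2)
print(selected, best_acc)
```

On this toy data only attribute 0 is retained, since adding the noise attribute does not improve accuracy. In practice the validation rows would be held out from training (or replaced by cross-validation), and the resulting ranking would then drive the per-attribute prior search that the abstract describes.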
All Science Journal Classification (ASJC) codes
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence