Solving multi-label text categorization problem using support vector machine approach with membership function

Tai Yue Wang, Huei Min Chiang

Research output: Article

18 Citations (Scopus)

Abstract

The pervasiveness of information available on the internet means that increasing numbers of documents must be classified. Text categorization is undertaken not only by domain experts but also by automatic text categorization systems. A text categorization system with a multi-label classifier is therefore necessary to process the large number of documents. In this study, a multi-label text categorization system is developed to classify multi-label documents. Data mapping is performed to transform data from a high-dimensional space to a lower-dimensional space of paired SVM output values, thus lowering the computational complexity. A pairwise comparison approach is applied to set the membership function in each predicted class and to judge all possible classes. To better explain the proposed model, a comparative study using the Reuters data sets is performed against several multi-label approaches: Naive Bayes, Multi-Label Mixture, Jaccard Kernel and BP-MLL. Although the comparative results of the empirical experiment indicate that the proposed system performs better than the other methods in terms of overall performance indices, these comparisons were made without knowledge of the original parameter settings of those methods. The comparative studies show that the probabilities of documents appearing in correctly predicted classes and in wrongly predicted classes are important properties. From the results of the empirical experiment, we conclude that a membership function value of 0.5 is a good criterion for distinguishing correctly from incorrectly classified documents.
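
The role of the pairwise comparison and the 0.5 membership criterion can be illustrated with a minimal sketch. It is not the authors' implementation: the sigmoid used to turn a pairwise SVM output into a probability, the averaging rule in `class_memberships`, the class names and the score values are all assumptions made for illustration only.

```python
import math
from itertools import combinations


def sigmoid(x):
    """Squash a raw pairwise decision value into (0, 1). Assumed mapping."""
    return 1.0 / (1.0 + math.exp(-x))


def class_memberships(pairwise_scores, classes):
    """Average pairwise win probabilities into one membership per class.

    pairwise_scores[(a, b)] > 0 means the (a vs. b) classifier favours a.
    The averaging rule is a hypothetical stand-in for the membership function.
    """
    totals = {c: 0.0 for c in classes}
    for a, b in combinations(classes, 2):
        p = sigmoid(pairwise_scores[(a, b)])
        totals[a] += p          # evidence that the document belongs to a
        totals[b] += 1.0 - p    # complementary evidence for b
    n_opponents = len(classes) - 1
    return {c: totals[c] / n_opponents for c in classes}


def predict_labels(memberships, threshold=0.5):
    """Keep every class whose membership reaches the threshold."""
    return sorted(c for c, m in memberships.items() if m >= threshold)


# Hypothetical pairwise SVM outputs for one document and three classes.
classes = ["earn", "grain", "trade"]
scores = {
    ("earn", "grain"): -0.2,
    ("earn", "trade"): 2.0,
    ("grain", "trade"): 2.5,
}

memberships = class_memberships(scores, classes)
labels = predict_labels(memberships)
print(memberships)
print(labels)
```

Because each class is judged against the threshold independently, a document can receive more than one label, which is what distinguishes this setting from single-label pairwise voting.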

Original language: English
Pages (from-to): 3682-3689
Number of pages: 8
Journal: Neurocomputing
Volume: 74
Issue number: 17
DOIs
Publication status: Published - Oct 1 2011

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Cognitive Neuroscience
  • Artificial Intelligence
