With the rapid growth of information delivered over the Internet, document classification has become indispensable and is expected to be handled by automatic text categorization. This paper presents a text categorization system for the multi-class categorization problem. The system consists of two modules: the processing module and the classifying module. In the processing module, ICF and Uni are used as indicators to extract the relevant terms. In the classifying module, fuzzy set theory is incorporated into OAA-SVM, and we propose the resulting OAA-FSVM classifier to implement a multi-class classification system. The performance of OAA-SVM and OAA-FSVM is evaluated with a macro-average performance index, and the statistical significance of the difference is assessed with McNemar's test. The results of the empirical study show that the proposed OAA-FSVM method outperforms OAA-SVM on the multi-class text categorization problem.
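The evaluation described above can be illustrated with a minimal sketch. It assumes the macro-average index is a macro-averaged F1 score (the abstract does not name the per-class measure) and uses the continuity-corrected form of McNemar's statistic. The function names `macro_f1` and `mcnemar_statistic` and the toy labels are hypothetical and are not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then take the unweighted mean.

    Assumption: F1 is the per-class measure; the paper may use another one.
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


def mcnemar_statistic(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar chi-square statistic for two classifiers.

    b counts documents only classifier A labels correctly,
    c counts documents only classifier B labels correctly.
    """
    b = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a == t and o != t)
    c = sum(1 for t, a, o in zip(y_true, pred_a, pred_b) if a != t and o == t)
    if b + c == 0:
        return 0.0  # the classifiers never disagree on correctness
    return (abs(b - c) - 1) ** 2 / (b + c)


if __name__ == "__main__":
    # Toy labels for illustration only, not data from the study.
    truth = [0, 0, 1, 1, 2, 2]
    predicted = [0, 0, 1, 2, 2, 2]
    print(macro_f1(truth, predicted))
```

Under this sketch, a statistic above 3.841 (the chi-square critical value for one degree of freedom at the 0.05 level) would indicate a significant difference between the two classifiers.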
All Science Journal Classification (ASJC) codes
- Information Systems
- Media Technology
- Computer Science Applications
- Management Science and Operations Research
- Library and Information Sciences