FSKNN: Multi-label text categorization based on fuzzy similarity and k nearest neighbors

Jung-Yi Jiang, Shian Chi Tsai, Shie Jue Lee

Research output: Contribution to journal › Article

51 Citations (Scopus)

Abstract

We propose an efficient approach, FSKNN, which employs a fuzzy similarity measure (FSM) and k nearest neighbors (KNN) for multi-label text classification. One of the problems associated with KNN-like approaches is the demanding computational cost of finding the k nearest neighbors among all the training patterns. In FSKNN, the FSM is used to group the training patterns into clusters. Then, only the training documents in those clusters whose fuzzy similarities to the document exceed a predesignated threshold are considered in finding the k nearest neighbors for the document. An unseen document is labeled based on its k nearest neighbors using the maximum a posteriori estimate. Experimental results show that the proposed method works more effectively than other methods.
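The pipeline described in the abstract (cluster the training documents, keep only clusters similar enough to the query, search for neighbors among those, then label by a maximum a posteriori rule) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the fuzzy similarity measure or the exact MAP formulation, so the sketch substitutes cosine similarity to a per-label centroid for the FSM and a smoothed ML-KNN-style estimate for the labeling step. The class name `FSKNNSketch`, the one-cluster-per-label grouping, and the parameters `k`, `alpha`, and `smooth` are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


class FSKNNSketch:
    def __init__(self, k=2, alpha=0.5, smooth=1.0):
        self.k, self.alpha, self.smooth = k, alpha, smooth

    def fit(self, docs, label_sets):
        self.docs, self.label_sets = docs, label_sets
        self.labels = sorted(set().union(*label_sets))
        # Assumption: one cluster per label, holding that label's documents.
        self.clusters = {c: [i for i, ls in enumerate(label_sets) if c in ls]
                         for c in self.labels}
        self.centroids = {c: centroid([docs[i] for i in idx])
                          for c, idx in self.clusters.items()}
        n, s = len(docs), self.smooth
        self.prior = {c: (s + len(self.clusters[c])) / (2 * s + n)
                      for c in self.labels}
        # Leave-one-out neighbor counts: how many neighbors carry label c,
        # tallied separately for documents that have c and those that do not.
        self.has = {c: Counter() for c in self.labels}
        self.has_not = {c: Counter() for c in self.labels}
        for i, d in enumerate(docs):
            nbrs = self._neighbors(d, exclude=i)
            for c in self.labels:
                cnt = sum(1 for j in nbrs if c in label_sets[j])
                (self.has if c in label_sets[i] else self.has_not)[c][cnt] += 1
        return self

    def _candidates(self, d, exclude=None):
        # Keep only documents from clusters whose similarity to d reaches
        # the threshold alpha (cosine to centroid stands in for the FSM).
        idx = set()
        for c in self.labels:
            if cosine(d, self.centroids[c]) >= self.alpha:
                idx.update(self.clusters[c])
        idx.discard(exclude)
        return idx

    def _neighbors(self, d, exclude=None):
        cand = self._candidates(d, exclude)
        ranked = sorted(cand, key=lambda j: (-cosine(d, self.docs[j]), j))
        return ranked[:self.k]

    def _likelihood(self, table, cnt):
        s = self.smooth
        return (s + table[cnt]) / (s * (self.k + 1) + sum(table.values()))

    def predict(self, d):
        nbrs = self._neighbors(d)
        predicted = set()
        for c in self.labels:
            cnt = sum(1 for j in nbrs if c in self.label_sets[j])
            p1 = self.prior[c] * self._likelihood(self.has[c], cnt)
            p0 = (1 - self.prior[c]) * self._likelihood(self.has_not[c], cnt)
            if p1 > p0:  # MAP decision: assign c if "has c" is more probable
                predicted.add(c)
        return predicted


# Toy data: two-term documents with labels "a" and "b" (illustrative only).
docs = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
        [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
label_sets = [{"a"}, {"a"}, {"a"}, {"b"}, {"b"}, {"b"}]
model = FSKNNSketch(k=2, alpha=0.5).fit(docs, label_sets)
print(model.predict([0.95, 0.05]))  # {'a'}
```

In this toy run the query passes the threshold only for the "a" cluster, so the neighbor search touches three of the six training documents rather than all of them, which is the cost saving the abstract refers to.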

Original language: English
Pages (from-to): 2813-2821
Number of pages: 9
Journal: Expert Systems With Applications
Volume: 39
Issue number: 3
DOI: 10.1016/j.eswa.2011.08.141
Publication status: Published - 2012 Feb 15

Fingerprint

Labels
Costs

All Science Journal Classification (ASJC) codes

  • Engineering(all)
  • Computer Science Applications
  • Artificial Intelligence

Cite this

@article{6a9b47b2f7ec4f65af21d35d54b6bea9,
title = "FSKNN: Multi-label text categorization based on fuzzy similarity and k nearest neighbors",
abstract = "We propose an efficient approach, FSKNN, which employs a fuzzy similarity measure (FSM) and k nearest neighbors (KNN) for multi-label text classification. One of the problems associated with KNN-like approaches is the demanding computational cost of finding the k nearest neighbors among all the training patterns. In FSKNN, the FSM is used to group the training patterns into clusters. Then, only the training documents in those clusters whose fuzzy similarities to the document exceed a predesignated threshold are considered in finding the k nearest neighbors for the document. An unseen document is labeled based on its k nearest neighbors using the maximum a posteriori estimate. Experimental results show that the proposed method works more effectively than other methods.",
author = "Jiang, Jung-Yi and Tsai, {Shian Chi} and Lee, {Shie Jue}",
year = "2012",
month = "2",
day = "15",
doi = "10.1016/j.eswa.2011.08.141",
language = "English",
volume = "39",
pages = "2813--2821",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Elsevier Limited",
number = "3",

}

FSKNN: Multi-label text categorization based on fuzzy similarity and k nearest neighbors. / Jiang, Jung-Yi; Tsai, Shian Chi; Lee, Shie Jue.

In: Expert Systems With Applications, Vol. 39, No. 3, 15.02.2012, p. 2813-2821.

TY - JOUR
T1 - FSKNN
T2 - Multi-label text categorization based on fuzzy similarity and k nearest neighbors
AU - Jiang, Jung-Yi
AU - Tsai, Shian Chi
AU - Lee, Shie Jue
PY - 2012/2/15
Y1 - 2012/2/15
AB - We propose an efficient approach, FSKNN, which employs a fuzzy similarity measure (FSM) and k nearest neighbors (KNN) for multi-label text classification. One of the problems associated with KNN-like approaches is the demanding computational cost of finding the k nearest neighbors among all the training patterns. In FSKNN, the FSM is used to group the training patterns into clusters. Then, only the training documents in those clusters whose fuzzy similarities to the document exceed a predesignated threshold are considered in finding the k nearest neighbors for the document. An unseen document is labeled based on its k nearest neighbors using the maximum a posteriori estimate. Experimental results show that the proposed method works more effectively than other methods.
UR - http://www.scopus.com/inward/record.url?scp=80255123384&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80255123384&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2011.08.141
DO - 10.1016/j.eswa.2011.08.141
M3 - Article
AN - SCOPUS:80255123384
VL - 39
SP - 2813
EP - 2821
JO - Expert Systems with Applications
JF - Expert Systems with Applications
SN - 0957-4174
IS - 3
ER -