Automatic domain-specific sentiment lexicon generation with label propagation

Yen Jen Tai, Hung Yu Kao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

18 Citations (Scopus)

Abstract

Nowadays, the advance of social media has led to the explosive growth of opinion data. Therefore, sentiment analysis has attracted a lot of attention. Current sentiment analysis applications follow two main approaches: the lexicon-based approach and the machine-learning approach. However, both face the challenge of obtaining large amounts of human-labeled training data and corpora. The lexicon-based approach requires a sentiment lexicon to determine opinion polarity. Many benchmark sentiment lexicons exist, but they cannot cover all domain-specific word meanings. Thus, automatic generation of a domain-specific sentiment lexicon becomes an important task. We propose a framework to automatically generate a sentiment lexicon. First, we determine the semantic similarity between two words in the entire unlabeled corpus. We treat the words as nodes and the similarities as weighted edges to construct word graphs. A graph-based semi-supervised label propagation method then assigns polarity to unlabeled words through the proposed propagation process. Experiments on microblog data from Twitter show that our approach performs better than baseline approaches and general-purpose sentiment dictionaries.
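
The pipeline outlined in the abstract (word similarities, a weighted word graph, then label propagation from labeled words) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the seed words, or the update rule, so the sketch assumes Jaccard overlap of the tweets in which two words occur, two hand-picked seed words, and a plain weighted-average propagation with clamped seeds. All function names and the toy corpus are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def jaccard(a, b):
    """Jaccard overlap of two sets of document ids (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(docs):
    """Words are nodes; edge weights are similarities computed on the unlabeled corpus."""
    postings = defaultdict(set)
    for doc_id, doc in enumerate(docs):
        for word in doc:
            postings[word].add(doc_id)
    graph = {word: {} for word in postings}
    for a, b in combinations(sorted(postings), 2):
        sim = jaccard(postings[a], postings[b])
        if sim > 0:
            graph[a][b] = sim
            graph[b][a] = sim
    return graph


def propagate(graph, seeds, iterations=50):
    """Spread seed polarities over the graph; seed scores are clamped each round."""
    scores = {word: float(seeds.get(word, 0.0)) for word in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbours in graph.items():
            if word in seeds:
                updated[word] = float(seeds[word])
                continue
            total = sum(neighbours.values())
            weighted = sum(w * scores[v] for v, w in neighbours.items())
            updated[word] = weighted / total if total else 0.0
        scores = updated
    return scores


# Toy "tweets" (hypothetical data, already tokenized).
docs = [
    ["good", "awesome", "phone"],
    ["good", "awesome", "battery"],
    ["bad", "laggy", "phone"],
    ["bad", "laggy", "battery"],
]
seeds = {"good": 1.0, "bad": -1.0}  # assumed seed lexicon

graph = build_graph(docs)
scores = propagate(graph, seeds)
for word in sorted(scores):
    print(f"{word:8s} {scores[word]:+.2f}")
```

In this toy corpus, "awesome" ends up with a positive score and "laggy" with a negative one, while "phone" and "battery", which co-occur equally with both seeds, stay near zero. A real system would need a similarity measure and seed set suited to the target domain.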

Original language: English
Title of host publication: Proceedings - 15th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2013
Pages: 53-62
Number of pages: 10
Publication status: Published - 2013 Dec 1
Event: 15th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2013 - Vienna, Austria
Duration: 2013 Dec 2 to 2013 Dec 4

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 15th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2013
Country/Territory: Austria
City: Vienna
Period: 13-12-02 to 13-12-04

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
