Automatic domain-specific sentiment lexicon generation with label propagation

Yen Jen Tai, Hung-Yu Kao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

The rise of social media has led to explosive growth in opinion data, and sentiment analysis has attracted considerable attention as a result. Current sentiment analysis applications follow two main approaches: the lexicon-based approach and the machine-learning approach. Both face the challenge of obtaining large amounts of human-labeled training data or corpora. The lexicon-based approach requires a sentiment lexicon to determine opinion polarity. Many benchmark sentiment lexicons exist, but they cannot cover all domain-specific word meanings. Automatic generation of a domain-specific sentiment lexicon is therefore an important task. We propose a framework that generates a sentiment lexicon automatically. First, we determine the semantic similarity between pairs of words across the entire unlabeled corpus. We then treat words as nodes and similarities as weighted edges to construct word graphs. Finally, a graph-based semi-supervised label propagation method assigns polarity to the unlabeled words. Experiments on microblog data from Twitter show that our approach outperforms baseline approaches and general-purpose sentiment dictionaries.
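The pipeline described in the abstract (word similarities, a weighted word graph, then semi-supervised label propagation from seed words) can be illustrated with a minimal Python sketch. The abstract does not state which similarity measure or update rule the authors use, so everything here is an assumption made for illustration: the Jaccard overlap of document sets stands in for the semantic similarity, the propagation step is a generic clamped weighted-average scheme, and the function names `build_word_graph` and `propagate_labels`, the toy documents, and the seed words are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_word_graph(docs):
    """Words are nodes; an edge is weighted by the Jaccard overlap of the
    sets of documents containing each word (an assumed stand-in for the
    paper's semantic similarity, which the abstract does not specify)."""
    postings = defaultdict(set)
    for doc_id, tokens in enumerate(docs):
        for token in tokens:
            postings[token].add(doc_id)
    graph = {word: {} for word in postings}
    for u, v in combinations(sorted(postings), 2):
        shared = len(postings[u] & postings[v])
        if shared:
            weight = shared / len(postings[u] | postings[v])
            graph[u][v] = weight
            graph[v][u] = weight
    return graph


def propagate_labels(graph, seeds, iterations=50):
    """Generic label propagation: each unlabeled word repeatedly takes the
    weighted mean of its neighbours' scores, while seed words stay clamped
    to their given polarity (+1.0 positive, -1.0 negative)."""
    scores = {word: seeds.get(word, 0.0) for word in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbours in graph.items():
            if word in seeds:
                updated[word] = seeds[word]  # seeds never change
                continue
            total = sum(neighbours.values())
            updated[word] = (
                sum(w * scores[n] for n, w in neighbours.items()) / total
                if total
                else 0.0
            )
        scores = updated
    return scores


# Hypothetical toy corpus: each inner list is one tokenized message.
docs = [
    ["good", "battery", "reliable"],
    ["good", "battery", "durable"],
    ["bad", "screen", "flimsy"],
    ["bad", "screen", "fragile"],
]
graph = build_word_graph(docs)
lexicon = propagate_labels(graph, seeds={"good": 1.0, "bad": -1.0})

for word in sorted(lexicon):
    polarity = "positive" if lexicon[word] > 0 else "negative"
    print(f"{word}: {polarity}")
```

In this toy corpus the unlabeled words inherit the sign of the seed they co-occur with, which is the behaviour a domain-specific lexicon needs. A real corpus would call for a stronger similarity measure, stopword filtering, and a convergence check in place of the fixed iteration count.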

Original language: English
Title of host publication: Proceedings - 15th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2013
Pages: 53-62
Number of pages: 10
DOIs: https://doi.org/10.1145/2539150.2539190
Publication status: Published - 2013 Dec 1
Event: 15th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2013 - Vienna, Austria
Duration: 2013 Dec 2 to 2013 Dec 4

Publication series

Name: ACM International Conference Proceeding Series

Fingerprint

Glossaries
Learning systems
Labels
Semantics
Experiments

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications

Cite this

Tai, Y. J., & Kao, H-Y. (2013). Automatic domain-specific sentiment lexicon generation with label propagation. In Proceedings - 15th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2013 (pp. 53-62). (ACM International Conference Proceeding Series). https://doi.org/10.1145/2539150.2539190
@inproceedings{bfe7023d1ae44344bf56db5e2578c5d7,
title = "Automatic domain-specific sentiment lexicon generation with label propagation",
author = "Tai, {Yen Jen} and Kao, Hung-Yu",
year = "2013",
month = "12",
day = "1",
doi = "10.1145/2539150.2539190",
language = "English",
isbn = "9781450321136",
series = "ACM International Conference Proceeding Series",
pages = "53--62",
booktitle = "Proceedings - 15th International Conference on Information Integration and Web-Based Applications and Services, iiWAS 2013",
}
