Measuring semantic relatedness using wikipedia revision information in a signed network

Wen Teng Yang, Hung-Yu Kao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Identifying the semantic relatedness of two words is an important task in information retrieval, natural language processing, and text mining. However, because a word can carry several meanings, the semantic relatedness of two words is still hard to evaluate precisely from limited corpora. Wikipedia is now a huge wiki-based encyclopedia on the internet and has become a valuable resource for research. Wikipedia articles, written through the live collaboration of user editors, contain a high volume of reference links, URL identifiers for concepts, and a complete revision history. Moreover, each Wikipedia article represents an individual concept and also contains other concepts, namely the hyperlinks to other articles embedded in its content. We therefore believe that the semantic relatedness between two words can be found through the semantic relatedness between two Wikipedia articles. We propose an Editor-Contribution-based Rank (ECR) algorithm that ranks the concepts in an article's content across all revisions and takes the ranked concepts as a vector representing the article. We classify four types of relationship in which the behaviors of addition and deletion map to appropriate and inappropriate concepts. ECR ranks those concepts according to the mutual signed-reinforcement relationship between the concepts and the editors. The results show that our method yields a marked performance improvement, increasing the correlation coefficient by 4% to 23% over previous methods that calculate the relatedness between two articles.
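
The abstract does not give the ECR update equations, so the following is only a minimal sketch of what a signed mutual-reinforcement ranking could look like, not the authors' method. It assumes a HITS-style iteration over (editor, concept, sign) edits, where sign is +1 for an addition and -1 for a deletion, and it assumes cosine similarity for comparing two article vectors. The names `rank_concepts` and `cosine`, the unit-norm scaling, and the fixed iteration count are all hypothetical choices made for illustration.

```python
import math


def _normalize(scores):
    """Scale a score dict in place to unit L2 norm (no-op if all zeros)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm > 0:
        for key in scores:
            scores[key] /= norm


def rank_concepts(edits, iterations=20):
    """Score concepts from (editor, concept, sign) edits.

    sign is +1 when the editor added the concept link and -1 when the
    editor deleted it. The update rule is an assumed HITS-style scheme,
    not the published ECR formula.
    """
    editor_score = {editor: 1.0 for editor, _, _ in edits}
    concept_score = {concept: 0.0 for _, concept, _ in edits}
    for _ in range(iterations):
        # A concept gains from additions and loses from deletions,
        # weighted by the current score of the editor who acted.
        concept_score = {concept: 0.0 for concept in concept_score}
        for editor, concept, sign in edits:
            concept_score[concept] += sign * editor_score[editor]
        _normalize(concept_score)
        # An editor gains by adding a positively scored concept or
        # deleting a negatively scored one, and loses otherwise.
        editor_score = {editor: 0.0 for editor in editor_score}
        for editor, concept, sign in edits:
            editor_score[editor] += sign * concept_score[concept]
        _normalize(editor_score)
    return concept_score


def cosine(u, v):
    """Cosine similarity of two sparse concept vectors (assumed measure)."""
    dot = sum(weight * v.get(concept, 0.0) for concept, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Under these assumptions, the four add/delete cases fall out of the product `sign * concept_score[concept]`: adding a well-supported concept or deleting a poorly supported one raises an editor's score, while the opposite actions lower it. Two articles would then be compared by calling `cosine` on their concept-score dictionaries.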

Original language: English
Title of host publication: Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011
Pages: 69-74
Number of pages: 6
DOI: https://doi.org/10.1109/TAAI.2011.20
Publication status: Published - 2011 Dec 1
Event: 16th Annual Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011 - Chung-Li, Taiwan
Duration: 2011 Nov 11 to 2011 Nov 13

Publication series

Name: Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011

Other

Other: 16th Annual Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011
Country: Taiwan
City: Chung-Li
Period: 11-11-11 to 11-11-13

Fingerprint

Semantics
Information retrieval
Websites
Reinforcement
Internet
Processing

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications

Cite this

Yang, W. T., & Kao, H-Y. (2011). Measuring semantic relatedness using wikipedia revision information in a signed network. In Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011 (pp. 69-74). [6120722] (Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011). https://doi.org/10.1109/TAAI.2011.20
Yang, Wen Teng ; Kao, Hung-Yu. / Measuring semantic relatedness using wikipedia revision information in a signed network. Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011. 2011. pp. 69-74 (Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011).
@inproceedings{5be3fdc19af749608678cb35ff146ac4,
title = "Measuring semantic relatedness using wikipedia revision information in a signed network",
abstract = "Identifying the semantic relatedness of two words is an important task in information retrieval, natural language processing, and text mining. However, because a word can carry several meanings, the semantic relatedness of two words is still hard to evaluate precisely from limited corpora. Wikipedia is now a huge wiki-based encyclopedia on the internet and has become a valuable resource for research. Wikipedia articles, written through the live collaboration of user editors, contain a high volume of reference links, URL identifiers for concepts, and a complete revision history. Moreover, each Wikipedia article represents an individual concept and also contains other concepts, namely the hyperlinks to other articles embedded in its content. We therefore believe that the semantic relatedness between two words can be found through the semantic relatedness between two Wikipedia articles. We propose an Editor-Contribution-based Rank (ECR) algorithm that ranks the concepts in an article's content across all revisions and takes the ranked concepts as a vector representing the article. We classify four types of relationship in which the behaviors of addition and deletion map to appropriate and inappropriate concepts. ECR ranks those concepts according to the mutual signed-reinforcement relationship between the concepts and the editors. The results show that our method yields a marked performance improvement, increasing the correlation coefficient by 4{\%} to 23{\%} over previous methods that calculate the relatedness between two articles.",
author = "Yang, {Wen Teng} and Kao, Hung-Yu",
year = "2011",
month = "12",
day = "1",
doi = "10.1109/TAAI.2011.20",
language = "English",
isbn = "9780769546018",
series = "Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011",
pages = "69--74",
booktitle = "Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011",

}

Yang, WT & Kao, H-Y 2011, Measuring semantic relatedness using wikipedia revision information in a signed network. in Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011., 6120722, Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011, pp. 69-74, 16th Annual Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011, Chung-Li, Taiwan, 11-11-11. https://doi.org/10.1109/TAAI.2011.20

Measuring semantic relatedness using wikipedia revision information in a signed network. / Yang, Wen Teng; Kao, Hung-Yu.

Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011. 2011. p. 69-74 6120722 (Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011).

TY - GEN

T1 - Measuring semantic relatedness using wikipedia revision information in a signed network

AU - Yang, Wen Teng

AU - Kao, Hung-Yu

PY - 2011/12/1

Y1 - 2011/12/1

AB - Identifying the semantic relatedness of two words is an important task in information retrieval, natural language processing, and text mining. However, because a word can carry several meanings, the semantic relatedness of two words is still hard to evaluate precisely from limited corpora. Wikipedia is now a huge wiki-based encyclopedia on the internet and has become a valuable resource for research. Wikipedia articles, written through the live collaboration of user editors, contain a high volume of reference links, URL identifiers for concepts, and a complete revision history. Moreover, each Wikipedia article represents an individual concept and also contains other concepts, namely the hyperlinks to other articles embedded in its content. We therefore believe that the semantic relatedness between two words can be found through the semantic relatedness between two Wikipedia articles. We propose an Editor-Contribution-based Rank (ECR) algorithm that ranks the concepts in an article's content across all revisions and takes the ranked concepts as a vector representing the article. We classify four types of relationship in which the behaviors of addition and deletion map to appropriate and inappropriate concepts. ECR ranks those concepts according to the mutual signed-reinforcement relationship between the concepts and the editors. The results show that our method yields a marked performance improvement, increasing the correlation coefficient by 4% to 23% over previous methods that calculate the relatedness between two articles.

UR - http://www.scopus.com/inward/record.url?scp=84862973145&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84862973145&partnerID=8YFLogxK

U2 - 10.1109/TAAI.2011.20

DO - 10.1109/TAAI.2011.20

M3 - Conference contribution

AN - SCOPUS:84862973145

SN - 9780769546018

T3 - Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011

SP - 69

EP - 74

BT - Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011

ER -

Yang WT, Kao H-Y. Measuring semantic relatedness using wikipedia revision information in a signed network. In Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011. 2011. p. 69-74. 6120722. (Proceedings - 2011 Conference on Technologies and Applications of Artificial Intelligence, TAAI 2011). https://doi.org/10.1109/TAAI.2011.20