Top stories identification from blog to news in TREC 2010 blog track

Yu Fan Lin, Jing Hau Wang, Liang Cheng Lai, Hung Yu Kao

Research output: Contribution to journal > Conference article

Abstract

The TREC 2010 Blog Track comprises two tasks: the Faceted Blog Distillation Task and the Top Stories Identification Task. We focus on the Top Stories Identification Task, which poses two problems. The first, the Story Ranking Task, is to rank the important news stories on a specified day. The second, the News Blog Post Ranking Task, is to rank the blog posts that are relevant to a news story while diversifying the topics those posts cover. For the Story Ranking Task, our team Ikm100 (NCKU CSIE IKMLAB) submitted three runs. The first run scores a news story by the number of blog posts that discuss it. The second run rests on the idea that a news story is more important when more people discuss it and when its supporting blog posts are themselves relatively important. The third run scores a news story with the "Relevant-Post Time-Entropy evaluation". For the News Blog Post Ranking Task, we use the cosine similarity between the news story and each blog post, together with the importance of each post, to extract the supporting blog posts for the news query.
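As a rough illustration of two ideas mentioned in the abstract (scoring a story by its number of discussion posts, and comparing a story with a post by cosine similarity), the following is a minimal sketch and not the authors' implementation. The abstract does not say how a post is judged to discuss a story, so the similarity threshold, the raw term-frequency weighting, the tokenizer, and all function names here are assumptions made for illustration. Post importance, topic diversification, and the time-entropy score are not defined in the abstract and are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between raw term-frequency vectors of two texts."""
    tf_a, tf_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(count * tf_b[term] for term, count in tf_a.items())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_stories_by_post_count(headlines, posts, threshold=0.3):
    """Rank stories by how many posts discuss them.

    Assumption: a post "discusses" a story when their cosine similarity
    reaches `threshold`; this criterion is hypothetical.
    """
    counts = [
        (h, sum(1 for p in posts if cosine_similarity(h, p) >= threshold))
        for h in headlines
    ]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)


def rank_posts_for_story(headline, posts):
    """Rank posts for one story by cosine similarity alone, highest first."""
    scored = [(p, cosine_similarity(headline, p)) for p in posts]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


headlines = ["earthquake hits city", "election results announced"]
posts = [
    "big earthquake hits the city today",
    "earthquake damage in city",
    "election results are in",
]
story_ranking = rank_stories_by_post_count(headlines, posts)
post_ranking = rank_posts_for_story(headlines[0], posts)
```

With the toy data above, the earthquake story is matched by two posts and the election story by one, so the earthquake story ranks first. A real system would likely replace raw term frequencies with a weighting such as TF-IDF; that choice is likewise an assumption here.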

Original language: English
Journal: NIST Special Publication
Publication status: Published - 2010 Dec 1
Event: 19th Text REtrieval Conference, TREC 2010 - Gaithersburg, MD, United States
Duration: 2010 Nov 16 to 2010 Nov 19

All Science Journal Classification (ASJC) codes

  • Engineering(all)
