Top stories identification from blog to news in TREC 2010 blog track

Yu Fan Lin, Jing Hau Wang, Liang Cheng Lai, Hung-Yu Kao

Research output: Contribution to journal › Article

Abstract

The TREC 2010 Blog Track comprises two tasks: the Faceted Blog Distillation Task and the Top Stories Identification Task. We focus mainly on the Top Stories Identification Task, which poses two problems. The first, the Story Ranking Task, is to rank the important news stories for a specified day. The second, the News Blog Post Ranking Task, is to rank the blog posts relevant to a news story while diversifying the topics those posts cover. For the Story Ranking Task, our team Ikm100 (NCKU CSIE IKMLAB) submitted three runs. In the first run, a news story is scored by the number of blog posts that discuss it. The second run rests on the idea that a news story is more important when more people discuss it and its supporting blog posts are themselves relatively important. In the last run, we score each news story with the "Relevant-Post Time-Entropy evaluation". For the News Blog Post Ranking Task, we use the cosine similarity between the news story and each blog post, together with the importance of the posts, to extract the supporting blog posts for the news query.
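The scoring ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how posts are matched to stories or how the time entropy is defined. The whitespace tokenizer, the similarity threshold, the reading of "time entropy" as Shannon entropy over the days on which relevant posts appear, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the abstract names no tokenizer.
    return text.lower().split()


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the raw term-frequency vectors of two texts."""
    a, b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_by_post_count(story, posts, threshold=0.2):
    """Score a story by how many posts discuss it.

    Assumption: a post "discusses" a story when their cosine similarity
    reaches the (hypothetical) threshold.
    """
    return sum(1 for post in posts if cosine_similarity(story, post) >= threshold)


def time_entropy(post_days):
    """Shannon entropy (in bits) of the distribution of posts over days.

    Assumption: this is one plausible reading of a time-entropy score,
    not the definition used in the paper.
    """
    counts = Counter(post_days)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under these assumptions, `score_by_post_count` corresponds loosely to the first run, and `cosine_similarity` to the story-to-post matching used in the News Blog Post Ranking Task. Posts spread evenly over many days give a higher `time_entropy` than posts concentrated on a single day.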

Original language: English
Journal: NIST Special Publication
Publication status: Published - 2010

Fingerprint

Blogs
Distillation
Entropy

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

@article{f7198120b62c4926b700ba41032958ab,
title = "Top stories identification from blog to news in TREC 2010 blog track",
author = "Lin, {Yu Fan} and Wang, {Jing Hau} and Lai, {Liang Cheng} and Hung-Yu Kao",
year = "2010",
language = "English",
journal = "NIST Special Publication",
issn = "1048-776X",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

Top stories identification from blog to news in TREC 2010 blog track. / Lin, Yu Fan; Wang, Jing Hau; Lai, Liang Cheng; Kao, Hung-Yu.

In: NIST Special Publication, 2010.

TY - JOUR

T1 - Top stories identification from blog to news in TREC 2010 blog track

AU - Lin, Yu Fan

AU - Wang, Jing Hau

AU - Lai, Liang Cheng

AU - Kao, Hung-Yu

PY - 2010

Y1 - 2010

UR - http://www.scopus.com/inward/record.url?scp=84873435313&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84873435313&partnerID=8YFLogxK

M3 - Article

JO - NIST Special Publication

JF - NIST Special Publication

SN - 1048-776X

ER -