Entropy-based link analysis for mining web informative structures

Hung-Yu Kao, Shian Hua Lin, Jan Ming Ho, Ming Syan Chen

Research output: Paper

24 Citations (Scopus)

Abstract

In this paper, we study the problem of mining the informative structure of a news Web site that consists of thousands of hyperlinked documents. We define the informative structure of a news Web site as a set of index pages (also referred to as TOC, i.e., table of contents, pages) and a set of article pages linked by TOC pages through informative links. The Hyperlink-Induced Topic Search (HITS) algorithm has been employed to analyze the authorities and hubs of pages. However, most content sites contain extra hyperlinks, such as navigation panels, advertisements and banners, to increase the add-on value of their Web pages. Because of the structure induced by these extra hyperlinks, HITS is found to provide insufficient precision in solving the problem. To remedy this, we develop an algorithm that uses entropy-based Link Analysis on Mining Web Informative Structures, referred to as LAMIS. The key idea of LAMIS is to use information entropy to represent the knowledge that corresponds to the amount of information in a link or a page in the link analysis. Experiments on several real news Web sites show that the precision and the recall of LAMIS are much superior to those obtained by heuristic methods and conventional link analysis methods.
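The idea of down-weighting uninformative links before running HITS can be illustrated with a small sketch. This is not the authors' LAMIS algorithm, whose formulas the abstract does not give. The function names (`entropy_link_weights`, `hits`), the toy site graph, and the weighting rule (one minus the normalized entropy of a link target's distribution over source pages) are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_link_weights(links):
    """Hypothetical weighting: a target linked from every source page
    (e.g. a navigation link) has maximal entropy and gets weight ~0;
    a target linked from a single page has entropy 0 and gets weight 1."""
    n_sources = len(links)
    sources_of = {}
    for src, targets in links.items():
        for dst in targets:
            sources_of.setdefault(dst, []).append(src)
    weights = {}
    for dst, srcs in sources_of.items():
        if n_sources < 2:
            w = 1.0  # entropy cannot be normalized with a single source
        else:
            w = max(0.0, 1.0 - shannon_entropy(srcs) / math.log2(n_sources))
        for src in srcs:
            weights[(src, dst)] = w
    return weights


def hits(links, weights=None, iterations=50):
    """HITS power iteration; `weights` maps (src, dst) to a link weight
    and defaults to 1.0 per link, which gives the unweighted algorithm."""
    weights = weights or {}
    pages = set(links) | {t for ts in links.values() for t in ts}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of in-linking pages.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += weights.get((src, dst), 1.0) * hub[src]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {p: v / norm for p, v in auth.items()}
        # Hub update: weighted sum of authority scores of linked pages.
        hub = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                hub[src] += weights.get((src, dst), 1.0) * auth[dst]
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {p: v / norm for p, v in hub.items()}
    return hub, auth


# Toy site: three TOC pages, each linking to "home" (navigation) and two articles.
links = {
    "toc1": ["home", "a1", "a2"],
    "toc2": ["home", "a3", "a4"],
    "toc3": ["home", "a5", "a6"],
}
_, plain_auth = hits(links)
_, weighted_auth = hits(links, entropy_link_weights(links))
```

On this toy graph, unweighted HITS ranks the navigation target `home` as the top authority, while the entropy-weighted run pushes its score to roughly zero and leaves the article pages on top. That mirrors the motivation stated in the abstract, but only under the assumed weighting rule.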

Original language: English
Pages: 574-581
Number of pages: 8
Publication status: Published - 1 Dec 2002
Event: Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM 2002) - McLean, VA, United States
Duration: 4 Nov 2002 to 9 Nov 2002

Other

Other: Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM 2002)
Country: United States
City: McLean, VA
Period: 02-11-04 to 02-11-09

All Science Journal Classification (ASJC) codes

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Cite this

Kao, H-Y., Lin, S. H., Ho, J. M., & Chen, M. S. (2002). Entropy-based link analysis for mining web informative structures. 574-581. Paper presented at Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM 2002), McLean, VA, United States.