Exploiting the web as the multilingual corpus for unknown query translation

Jenq Haur Wang, Jei Wen Teng, Wen Hsiang Lu, Lee Feng Chien

Research output: Article, peer-reviewed

19 citations (Scopus)

Abstract

Users' cross-lingual queries to a digital library system might be short and the query terms may not be included in a common translation dictionary (unknown terms). In this article, the authors investigate the feasibility of exploiting the Web as the multilingual corpus source to translate unknown query terms for cross-language information retrieval in digital libraries. They propose a Web-based term translation approach to determine effective translations for unknown query terms by mining bilingual search-result pages obtained from a real Web search engine. This approach can enhance the construction of a domain-specific bilingual lexicon and bring multilingual support to a digital library that only has monolingual document collections. Very promising results have been obtained in generating effective translation equivalents for many unknown terms, including proper nouns, technical terms, and Web query terms, and in assisting bilingual lexicon construction for a real digital library system.

Original language: English
Pages (from - to): 660-670
Number of pages: 11
Journal: Journal of the American Society for Information Science and Technology
Volume: 57
Issue number: 5
Publication status: Published - March 2006

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Networks and Communications
  • Artificial Intelligence
