Translating unknown queries with web corpora for cross-language information retrieval

Pu Jen Cheng, Jei Wen Teng, Ruei Cheng Chen, Jenq Haur Wang, Wen Hsiang Lu, Lee Feng Chien

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

63 Citations (Scopus)

Abstract

It is crucial for cross-language information retrieval (CLIR) systems to deal with the translation of unknown queries, because real queries are often short. The purpose of this paper is to investigate the feasibility of exploiting the Web as the corpus source to translate unknown queries for CLIR. We propose an online translation approach that determines effective translations for unknown query terms by mining bilingual search-result pages obtained from Web search engines. This approach can alleviate the lack of large bilingual corpora, translate many unknown query terms, provide flexible query specifications, and extract semantically close translations to benefit CLIR tasks, especially cross-language Web search.

Original language: English
Title of host publication: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Editors: K. Jarvelin, J. Allen, P. Bruza, M. Sanderson
Pages: 146-153
Number of pages: 8
Publication status: Published - 2004 Nov 25
Event: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval - Sheffield, United Kingdom
Duration: 2004 Jul 25 → 2004 Jul 29

Publication series

Name: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval

Other

Other: Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Country: United Kingdom
City: Sheffield
Period: 04-07-25 → 04-07-29

All Science Journal Classification (ASJC) codes

  • Engineering(all)

Cite this

Cheng, P. J., Teng, J. W., Chen, R. C., Wang, J. H., Lu, W. H., & Chien, L. F. (2004). Translating unknown queries with web corpora for cross-language information retrieval. In K. Jarvelin, J. Allen, P. Bruza, & M. Sanderson (Eds.), Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 146-153). (Proceedings of Sheffield SIGIR - Twenty-Seventh Annual International ACM SIGIR Conference on Research and Development in Information Retrieval).