Toward web mining of cross-language query translations in digital libraries

Jenq Haur Wang, Wen-Hsiang Lu, Lee Feng Chien

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

This paper proposes an effective query-translation approach that enables a cross-language information retrieval (CLIR) service to be more easily supported in digital library systems that only contain monolingual content. A query-translation engine called LiveTrans is used to process the translation requests of crosslingual queries from connected digital library systems. To automatically extract translations not covered by standard dictionaries, the engine is developed based on a novel integration of dictionary resources and Web mining approaches, including anchor-text and search-result methods. The engine exploits a broad range of multilingual Web resources used as live bilingual corpora to alleviate translation difficulties. It is shown to be particularly effective for extracting multilingual translation equivalents of query terms containing proper names or new terminology. The obtained results show the feasibility of and great potential for creating English-Chinese CLIR services in existing digital libraries and new applications in cross-language Web searching, although difficulties still remain that need to be investigated further.
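The abstract describes combining dictionary resources with search-result mining to translate terms the dictionary lacks. The following is only a minimal sketch of that general idea, not the LiveTrans implementation: the names `DICTIONARY`, `mine_candidates`, and `translate` are hypothetical, the snippets are passed in as plain strings rather than fetched from a search engine, and the raw co-occurrence count used for ranking is an assumed stand-in for whatever scoring the paper actually uses.

```python
import re
from collections import Counter

# Hypothetical toy dictionary; stands in for "standard dictionaries".
DICTIONARY = {"digital library": "數位圖書館"}

# Runs of two or more CJK ideographs are treated as candidate target terms.
CJK_RUN = re.compile(r"[\u4e00-\u9fff]{2,}")


def mine_candidates(term, snippets):
    """Count CJK runs in the snippets that also mention the source term."""
    counts = Counter()
    needle = term.lower()
    for snippet in snippets:
        if needle in snippet.lower():
            counts.update(CJK_RUN.findall(snippet))
    return counts


def translate(term, snippets):
    """Try the dictionary first; otherwise return the most frequent mined candidate."""
    key = term.lower()
    if key in DICTIONARY:
        return DICTIONARY[key]
    ranked = mine_candidates(term, snippets).most_common(1)
    return ranked[0][0] if ranked else None
```

In this sketch a proper name missing from the dictionary, such as "Yahoo", would be resolved only if mixed-language snippets mentioning it are supplied; a real system would also need segmentation and noise filtering, which are omitted here.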

Original language: English
Pages (from-to): 247-257
Number of pages: 11
Journal: International Journal on Digital Libraries
Volume: 4
Issue number: 4
DOI: 10.1007/s00799-004-0091-y
Publication status: Published - 2004 Dec 1

Fingerprint

information retrieval
dictionary
language
resources
technical language

All Science Journal Classification (ASJC) codes

  • Library and Information Sciences

Cite this

@article{f784dc5087c6495e9291a1ed1421a05a,
title = "Toward web mining of cross-language query translations in digital libraries",
abstract = "This paper proposes an effective query-translation approach that enables a cross-language information retrieval (CLIR) service to be more easily supported in digital library systems that only contain monolingual content. A query-translation engine called LiveTrans is used to process the translation requests of crosslingual queries from connected digital library systems. To automatically extract translations not covered by standard dictionaries, the engine is developed based on a novel integration of dictionary resources and Web mining approaches, including anchor-text and search-result methods. The engine exploits a broad range of multilingual Web resources used as live bilingual corpora to alleviate translation difficulties. It is shown to be particularly effective for extracting multilingual translation equivalents of query terms containing proper names or new terminology. The obtained results show the feasibility of and great potential for creating English-Chinese CLIR services in existing digital libraries and new applications in cross-language Web searching, although difficulties still remain that need to be investigated further.",
author = "Wang, {Jenq Haur} and Lu, Wen-Hsiang and Chien, {Lee Feng}",
year = "2004",
month = "12",
day = "1",
doi = "10.1007/s00799-004-0091-y",
language = "English",
volume = "4",
pages = "247--257",
journal = "International Journal on Digital Libraries",
issn = "1432-5012",
publisher = "Springer Verlag",
number = "4",
}

Toward web mining of cross-language query translations in digital libraries. / Wang, Jenq Haur; Lu, Wen-Hsiang; Chien, Lee Feng.

In: International Journal on Digital Libraries, Vol. 4, No. 4, 01.12.2004, p. 247-257.

TY  - JOUR
T1  - Toward web mining of cross-language query translations in digital libraries
AU  - Wang, Jenq Haur
AU  - Lu, Wen-Hsiang
AU  - Chien, Lee Feng
PY  - 2004/12/1
Y1  - 2004/12/1
AB  - This paper proposes an effective query-translation approach that enables a cross-language information retrieval (CLIR) service to be more easily supported in digital library systems that only contain monolingual content. A query-translation engine called LiveTrans is used to process the translation requests of crosslingual queries from connected digital library systems. To automatically extract translations not covered by standard dictionaries, the engine is developed based on a novel integration of dictionary resources and Web mining approaches, including anchor-text and search-result methods. The engine exploits a broad range of multilingual Web resources used as live bilingual corpora to alleviate translation difficulties. It is shown to be particularly effective for extracting multilingual translation equivalents of query terms containing proper names or new terminology. The obtained results show the feasibility of and great potential for creating English-Chinese CLIR services in existing digital libraries and new applications in cross-language Web searching, although difficulties still remain that need to be investigated further.
UR  - http://www.scopus.com/inward/record.url?scp=44849140591&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=44849140591&partnerID=8YFLogxK
U2  - 10.1007/s00799-004-0091-y
DO  - 10.1007/s00799-004-0091-y
M3  - Article
VL  - 4
SP  - 247
EP  - 257
JO  - International Journal on Digital Libraries
JF  - International Journal on Digital Libraries
SN  - 1432-5012
IS  - 4
ER  -