Opportunities or risks to reduce labor in crowdsourcing translation? Characterizing cost versus quality via a PageRank-HITS hybrid model

Rui Yan, Yiping Song, Cheng Te Li, Ming Zhang, Xiaohua Hu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Citations (Scopus)

Abstract

Crowdsourcing machine translation offers the advantage of lower monetary cost for collecting translated data. Yet, compared with translation by trained professionals, results collected from non-professional translators may be of low quality. A common solution for crowdsourcing practitioners is to employ a large labor force to gather enough redundant data and then select from it. We can save further money by avoiding the collection of bad translations. We propose to score Turkers by their authority during an observation period, and then stop hiring the unqualified Turkers. This brings both opportunities and risks to crowdsourced translation: we can make it even cheaper, but we might suffer a loss in quality. In this paper, we propose a graph-based PageRank-HITS hybrid model to distinguish authoritative workers from unreliable ones. The algorithm captures the intuition that good translations and good workers reinforce each other iteratively in the proposed framework. We demonstrate that the algorithm maintains performance while reducing the work force and hence cutting cost. We run experiments on the NIST 2009 Urdu-to-English evaluation set with Mechanical Turk, and quantitatively evaluate the performance in terms of BLEU score, Pearson correlation, and real money.
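
The mutual-reinforcement idea in the abstract can be illustrated with a small sketch. This is not the authors' model: the abstract does not give the update rules, so the formulas below, the names hybrid_rank and select_workers, the damping and mix parameters, and the toy similarity values are all assumptions chosen for illustration. Translations receive score from similar translations (a PageRank-style walk) and from their author (a HITS-style authority), and each worker's score is the mean score of that worker's translations.

```python
def hybrid_rank(authorship, similarity, damping=0.85, mix=0.5, iters=50):
    """Illustrative worker/translation co-ranking (hypothetical update rules).

    authorship[t] -- worker id who produced translation t
    similarity[u][t] -- non-negative similarity between translations u and t
    """
    n = len(authorship)
    workers = sorted(set(authorship))
    t_score = [1.0 / n] * n
    w_score = {w: 1.0 / len(workers) for w in workers}
    # Total outgoing similarity of each translation, used to normalise the walk.
    out_sum = [sum(row) for row in similarity]

    for _ in range(iters):
        new_t = []
        for t in range(n):
            # PageRank-style term: score flowing in from similar translations.
            walk = sum(
                t_score[u] * similarity[u][t] / out_sum[u]
                for u in range(n)
                if out_sum[u] > 0
            )
            # HITS-style term: authority inherited from the authoring worker.
            inherit = w_score[authorship[t]]
            new_t.append(
                (1 - damping) / n + damping * ((1 - mix) * walk + mix * inherit)
            )
        total = sum(new_t)
        t_score = [s / total for s in new_t]

        # A worker's score is the mean score of that worker's translations.
        raw = {}
        for w in workers:
            own = [t_score[t] for t in range(n) if authorship[t] == w]
            raw[w] = sum(own) / len(own)
        total_w = sum(raw.values())
        w_score = {w: s / total_w for w, s in raw.items()}

    return t_score, w_score


def select_workers(w_score, keep):
    """Return the `keep` highest-scoring workers (the ones to keep hiring)."""
    return sorted(w_score, key=w_score.get, reverse=True)[:keep]


# Toy data (made up): workers a, b, c each translate two sentences.
# Translations 0, 2, 4 are for sentence 1; translations 1, 3, 5 for sentence 2.
authorship = ["a", "a", "b", "b", "c", "c"]
similarity = [
    [0.0, 0.0, 0.9, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.8, 0.0, 0.2],
    [0.9, 0.0, 0.0, 0.0, 0.1, 0.0],
    [0.0, 0.8, 0.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.1, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.0, 0.1, 0.0, 0.0],
]

t_score, w_score = hybrid_rank(authorship, similarity)
print(select_workers(w_score, keep=2))
```

In this toy setup, worker c's translations disagree with everyone else's, so c ends up with the lowest score and would be the first worker dropped. How many workers to keep is exactly the cost-versus-quality trade-off the abstract describes.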

Original language: English
Title of host publication: IJCAI 2015 - Proceedings of the 24th International Joint Conference on Artificial Intelligence
Editors: Michael Wooldridge, Qiang Yang
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 1025-1032
Number of pages: 8
ISBN (Electronic): 9781577357384
Publication status: Published - 2015 Jan 1
Event: 24th International Joint Conference on Artificial Intelligence, IJCAI 2015 - Buenos Aires, Argentina
Duration: 2015 Jul 25 to 2015 Jul 31

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2015-January
ISSN (Print): 1045-0823

Other

Other: 24th International Joint Conference on Artificial Intelligence, IJCAI 2015
Country: Argentina
City: Buenos Aires
Period: 15-07-25 to 15-07-31

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

Cite this

Yan, R., Song, Y., Li, C. T., Zhang, M., & Hu, X. (2015). Opportunities or risks to reduce labor in crowdsourcing translation? Characterizing cost versus quality via a PageRank-HITS hybrid model. In M. Wooldridge, & Q. Yang (Eds.), IJCAI 2015 - Proceedings of the 24th International Joint Conference on Artificial Intelligence (pp. 1025-1032). (IJCAI International Joint Conference on Artificial Intelligence; Vol. 2015-January). International Joint Conferences on Artificial Intelligence.