Effective quality assurance for data labels through crowdsourcing and domain expert collaboration

Wei Lee, Chi Hsuan Huang, Chien Wei Chang, Ming Kuang Daniel Wu, Kun Ta Chuang, Po An Yang, Chu Cheng Hsieh

Research output: Conference contribution

8 citations (Scopus)

Abstract

Researchers and scientists have been using crowdsourcing platforms to collect labeled training data in recent years. The process is cost-effective and scalable, but research has shown that the quality of truth inference is unstable due to worker bias, work variance, and task difficulty. In this demonstration, we present a hybrid system, named IDLE (Integrated Data Labeling Engine), that brings together a well-trained troop of domain experts and the multitudes of a crowdsourcing platform to collect high-quality training data for industry-level classification engines. We show how to acquire high-quality labeled data through quality control strategies that dynamically and cost-effectively leverage the strengths of both domain experts and crowdsourcing.

Original language: English
Title of host publication: Advances in Database Technology - EDBT 2018
Subtitle of host publication: 21st International Conference on Extending Database Technology, Proceedings
Editors: Michael Bohlen, Reinhard Pichler, Norman May, Erhard Rahm, Shan-Hung Wu, Katja Hose
Publisher: OpenProceedings.org
Pages: 646-649
Number of pages: 4
ISBN (Electronic): 9783893180783
DOIs
Publication status: Published - 2018
Event: 21st International Conference on Extending Database Technology, EDBT 2018 - Vienna, Austria
Duration: 26 Mar 2018 to 29 Mar 2018

Publication series

Name: Advances in Database Technology - EDBT
Volume: 2018-March
ISSN (Electronic): 2367-2005

Conference

Conference: 21st International Conference on Extending Database Technology, EDBT 2018
Country/Territory: Austria
City: Vienna
Period: 26 Mar 2018 to 29 Mar 2018

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Software
  • Computer Science Applications
