Unsupervised corpus distillation for represented indicator measurement on focus species detection

Chih Hsuan Wei, Hung Yu Kao

Research output: Article (peer-reviewed)

1 citation (Scopus)

Abstract

In biomedical text mining, the gene ambiguity with the highest dimension is the species with which an entity is associated. Furthermore, focus species detection is one of the bottlenecks in gene normalisation. This study presents a method that is robust for all types of articles, particularly those without explicit species mentions. Since our method requires a training corpus, we developed an iterative distillation method to extend the corpus. An unsupervised corpus is therefore helpful for the detection of focus species. In experiments, the proposed method achieved high accuracies of 85.64% and 84.32% on datasets with and without species mentions, respectively.

Original language: English
Pages (from-to): 413-426
Number of pages: 14
Journal: International Journal of Data Mining and Bioinformatics
Volume: 8
Issue number: 4
Publication status: Published - 2013

All Science Journal Classification (ASJC) codes

  • Information Systems
  • General Biochemistry, Genetics and Molecular Biology
  • Library and Information Sciences
