Represented indicator measurement and corpus distillation on focus species detection

Chih Hsuan Wei, Hung Yu Kao

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

In the extraction of information from the biomedical literature, name disambiguation of domain-specific entities, such as proteins, is one of the most important issues. The entity ambiguity with the highest dimension is the species with which an entity is associated. Furthermore, species disambiguation is one of the bottlenecks in inter-species gene name normalization. In enhancing the performance of species disambiguation, the detection of focus species remains a substantial challenge. This study presents a method addressing this issue. We evaluate the method on all articles from the BioCreaTive I&II GN task. Our method is robust for all types of articles, particularly those without explicit species entity information. Since our method requires a training corpus to serve as the indicator vector, we developed an iterative corpus distillation method to extend the corpus. In the conducted experiments, the proposed method achieved a high accuracy of 85.64% and 84.32% without species entity information.

Original language: English
Title of host publication: Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
Pages: 657-662
Number of pages: 6
Publication status: Published - 2010 Dec 1
Event: 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010 - Hong Kong, China
Duration: 2010 Dec 18 → 2010 Dec 21

Publication series

Name: Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010

Other

Other: 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
Country: China
City: Hong Kong
Period: 10-12-18 → 10-12-21

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
  • Health Informatics
