TY - GEN
T1 - Represented indicator measurement and corpus distillation on focus species detection
AU - Wei, Chih Hsuan
AU - Kao, Hung Yu
PY - 2010/12/1
Y1 - 2010/12/1
N2 - In the extraction of information from the biomedical literature, name disambiguation of domain-specific entities, such as proteins, is one of the most important issues. The entity ambiguity with the highest dimension is the species with which an entity is associated. Furthermore, one of the bottlenecks in inter-species gene name normalization is species disambiguation. To enhance the performance of species disambiguation, the detection of focus species remains a substantial challenge. This study presents a method addressing this issue. The results present evaluations of all articles from the BioCreaTive I&II GN task. Our method is robust for all types of articles, particularly those without explicit species entity information. Since our method requires a training corpus to be the indicator vector, we developed an iterative corpus distillation method to extend the corpus. In the conducted experiments, the proposed method achieved a high accuracy of 85.64% and 84.32% without species entity information.
AB - In the extraction of information from the biomedical literature, name disambiguation of domain-specific entities, such as proteins, is one of the most important issues. The entity ambiguity with the highest dimension is the species with which an entity is associated. Furthermore, one of the bottlenecks in inter-species gene name normalization is species disambiguation. To enhance the performance of species disambiguation, the detection of focus species remains a substantial challenge. This study presents a method addressing this issue. The results present evaluations of all articles from the BioCreaTive I&II GN task. Our method is robust for all types of articles, particularly those without explicit species entity information. Since our method requires a training corpus to be the indicator vector, we developed an iterative corpus distillation method to extend the corpus. In the conducted experiments, the proposed method achieved a high accuracy of 85.64% and 84.32% without species entity information.
UR - http://www.scopus.com/inward/record.url?scp=79952411027&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79952411027&partnerID=8YFLogxK
U2 - 10.1109/BIBM.2010.5706647
DO - 10.1109/BIBM.2010.5706647
M3 - Conference contribution
AN - SCOPUS:79952411027
SN - 9781424483075
T3 - Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
SP - 657
EP - 662
BT - Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
T2 - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
Y2 - 18 December 2010 through 21 December 2010
ER -