AutoBind: Automatic extraction of protein-ligand-binding affinity data from biological literature

Darby Tien Hao Chang, Chao Hsuan Ke, Jung Hsin Lin, Jung Hsien Chiang

Research output: Article, peer-reviewed

9 citations (Scopus)

Abstract

Motivation: Determination of the binding affinity of a protein-ligand complex is important to quantitatively specify whether a particular small molecule will bind to the target protein. Besides, collection of comprehensive datasets for protein-ligand complexes and their corresponding binding affinities is crucial in developing accurate scoring functions for the prediction of the binding affinities of previously unknown protein-ligand complexes. In the past decades, several databases of protein-ligand-binding affinities have been created via visual extraction from literature. However, such approaches are time-consuming and most of these databases are updated only a few times per year. Hence, there is an immediate demand for an automatic extraction method with high precision for binding affinity collection.

Result: We have created a new database of protein-ligand-binding affinity data, AutoBind, based on automatic information retrieval. We first compiled a collection of 1586 articles where the binding affinities have been marked manually. Based on this annotated collection, we designed four sentence patterns that are used to scan full-text articles as well as a scoring function to rank the sentences that match our patterns. The proposed sentence patterns can effectively identify the binding affinities in full-text articles. Our assessment shows that AutoBind achieved 84.22% precision and 79.07% recall on the testing corpus. Currently, 13 616 protein-ligand complexes and the corresponding binding affinities have been deposited in AutoBind from 17 221 articles.
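The abstract does not spell out the four sentence patterns or the scoring function, so the following is only a minimal sketch of the general idea: match affinity mentions in a sentence with a pattern, then score the extractions against manually marked data using precision and recall. The regular expression `AFFINITY`, the helper names `extract` and `precision_recall`, and the example sentences are all hypothetical and are not taken from AutoBind.

```python
import re

# Hypothetical pattern (NOT one of the paper's four sentence patterns):
# an affinity type, an optional linking word or symbol, a number,
# and a molar concentration unit.
AFFINITY = re.compile(
    r"\b(?P<kind>Kd|Ki|IC50)\b\s*(?:of|=|was|is)?\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>[pnum]M)\b"
)


def extract(sentence):
    """Return (kind, value, unit) tuples found in one sentence."""
    return [
        (m["kind"], float(m["value"]), m["unit"])
        for m in AFFINITY.finditer(sentence)
    ]


def precision_recall(predicted, gold):
    """Compare extracted items against manually marked items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    print(extract("The inhibitor binds with a Ki of 4.5 nM."))
```

Precision here is the fraction of extracted values that appear in the manual annotation, and recall is the fraction of annotated values that were extracted, which is the sense in which the 84.22% and 79.07% figures above are usually read. Ranking matched sentences with a scoring function, as the abstract describes, is not covered by this sketch.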

Original language: English
Article number: bts367
Pages (from-to): 2162-2168
Number of pages: 7
Journal: Bioinformatics
Volume: 28
Issue number: 16
Publication status: Published - Aug 2012

All Science Journal Classification (ASJC) codes

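  • Statistics and Probability
  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Computational Mathematics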
