TY - GEN
T1 - Automatic ontology population using deep learning for triple extraction
AU - Su, Ming Hsiang
AU - Wu, Chung Hsien
AU - Shih, Po Chen
PY - 2019/11
Y1 - 2019/11
N2 - An ontology is a representation of knowledge in a form from which computers can derive meaning. The purpose of this work is to automatically populate an ontology using deep neural networks, updating it with new facts from an input knowledge resource. For automatic ontology population, a bi-LSTM-based term extraction model using character embeddings is proposed to extract terms from a sentence. The extracted terms are regarded as the concepts of the ontology. Then, a multilayer perceptron network is employed to decide the predicates between pairs of the extracted concepts. The two concepts (one serving as subject and the other as object), along with the predicate, form a triple. The number of occurrences of the dependency relations between the concepts and the predicates is estimated, and predicates with low occurrence frequency are filtered out to obtain precise triples for ontology population. To evaluate the proposed method, we collected 46,646 sentences from OntoNotes 5.0 for training and testing the bi-LSTM-based term extraction model. We also collected 404,951 triples from ConceptNet 5 for training and testing the multilayer perceptron-based triple extraction model. In the experiments, the proposed method extracted triples from documents and achieved 74.59% accuracy for ontology population.
AB - An ontology is a representation of knowledge in a form from which computers can derive meaning. The purpose of this work is to automatically populate an ontology using deep neural networks, updating it with new facts from an input knowledge resource. For automatic ontology population, a bi-LSTM-based term extraction model using character embeddings is proposed to extract terms from a sentence. The extracted terms are regarded as the concepts of the ontology. Then, a multilayer perceptron network is employed to decide the predicates between pairs of the extracted concepts. The two concepts (one serving as subject and the other as object), along with the predicate, form a triple. The number of occurrences of the dependency relations between the concepts and the predicates is estimated, and predicates with low occurrence frequency are filtered out to obtain precise triples for ontology population. To evaluate the proposed method, we collected 46,646 sentences from OntoNotes 5.0 for training and testing the bi-LSTM-based term extraction model. We also collected 404,951 triples from ConceptNet 5 for training and testing the multilayer perceptron-based triple extraction model. In the experiments, the proposed method extracted triples from documents and achieved 74.59% accuracy for ontology population.
UR - http://www.scopus.com/inward/record.url?scp=85082381633&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082381633&partnerID=8YFLogxK
U2 - 10.1109/APSIPAASC47483.2019.9023113
DO - 10.1109/APSIPAASC47483.2019.9023113
M3 - Conference contribution
T3 - 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
SP - 262
EP - 267
BT - 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2019
Y2 - 18 November 2019 through 21 November 2019
ER -