Abstract
Propositional terms in a research abstract (RA) generally convey the most important information, allowing readers to quickly grasp the contribution of a research article. This paper treats propositional term extraction from RAs as a sequence labeling task using the IOB (Inside, Outside, Beginning) encoding scheme. Conditional random fields (CRFs) are used to initially detect the propositional terms, and the combined association measure (CAM) is then applied to adjust the term boundaries. By combining multi-level features and inner lexical cohesion, the method can extract propositional terms beyond those that are simply NP-based. Experimental results show that CRFs significantly increase the recall of imperfect-boundary term extraction, and that the CAM further improves the term boundaries.
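
To illustrate the IOB encoding mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It converts term spans to per-token B/I/O tags and back. The function names, the example sentence, and the choice of which tokens count as terms are all hypothetical; the CRF tagger and the CAM boundary adjustment are not shown.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def spans_to_iob(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping term spans as one IOB tag per token."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = "B"              # first token of a term
        for i in range(start + 1, end):
            tags[i] = "I"              # remaining tokens of the term
    return tags


def iob_to_spans(tags: List[str]) -> List[Span]:
    """Decode IOB tags into spans; a stray 'I' leniently opens a new span."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            if start is not None:
                spans.append((start, i))   # close the previous term
            start = i
        elif tag == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))   # term runs to the end
    return spans


# Hypothetical example: assume "CRFs" and "propositional terms" are the terms.
tokens = ["CRFs", "are", "used", "to", "detect", "propositional", "terms"]
tags = spans_to_iob(len(tokens), [(0, 1), (5, 7)])
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```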
Original language | English |
---|---|
Pages | 151-165 |
Number of pages | 15 |
Publication status | Published - 2008 |
Event | 20th Conference on Computational Linguistics and Speech Processing, ROCLING 2008 - Taipei, Taiwan. Duration: 2008 Sept 4 → 2008 Sept 5 |
All Science Journal Classification (ASJC) codes
- Language and Linguistics
- Speech and Hearing