Sentence decomplexification using holistic aspect-based clause detection for long sentence understanding

Chao Hong Liu, Chung-Hsien Wu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

2 Citations (Scopus)

Abstract

Long sentences pose significant challenges for many natural language processing (NLP) tasks, such as machine translation and language understanding, because state-of-the-art parsers still have great difficulty analyzing them. In this paper, we identify the Sentence Decomplexification (SD) problem and propose models for SD to help understand long sentences. Given a complex sentence, SD seeks to return two sentences: one is the main clause and the other is the subordinate clause. Together, these two clauses carry all the information of the original sentence. Since identifying subordinate clauses is more difficult than traditional chunking, we also propose a holistic aspect-based detection (HAD) method for clause detection, which reduces the overhead of the sentence similarity computation required for SD. We provide the formalisms of SD and show that HAD can be used to make this task more efficient. The SD system was used to improve the performance of a long sentence understanding system. Experimental results show that SD achieves 78.7% accuracy when the Chinese Gigaword Corpus is used as the sentence comparison corpus. For long sentence understanding, the proposed method improves accuracy from 70.7% to 75.5% compared with the same system without SD.

Original language: English
Title of host publication: 2010 7th International Symposium on Chinese Spoken Language Processing, ISCSLP 2010 - Proceedings
Pages: 265-270
Number of pages: 6
DOIs: https://doi.org/10.1109/ISCSLP.2010.5684897
Publication status: Published - 2010 Dec 1
Event: 2010 7th International Symposium on Chinese Spoken Language Processing, ISCSLP 2010 - Tainan, Taiwan
Duration: 2010 Nov 29 - 2010 Dec 3

Publication series

Name: 2010 7th International Symposium on Chinese Spoken Language Processing, ISCSLP 2010 - Proceedings

Other

Other: 2010 7th International Symposium on Chinese Spoken Language Processing, ISCSLP 2010
Country: Taiwan
City: Tainan
Period: 10-11-29 to 10-12-03

All Science Journal Classification (ASJC) codes

  • Linguistics and Language

  • Cite this

    Liu, C. H., & Wu, C-H. (2010). Sentence decomplexification using holistic aspect-based clause detection for long sentence understanding. In 2010 7th International Symposium on Chinese Spoken Language Processing, ISCSLP 2010 - Proceedings (pp. 265-270). [5684897] (2010 7th International Symposium on Chinese Spoken Language Processing, ISCSLP 2010 - Proceedings). https://doi.org/10.1109/ISCSLP.2010.5684897