A Two-Stage Transformer-Based Approach for Variable-Length Abstractive Summarization

Research output: Contribution to journalArticle

Abstract

This study proposes a two-stage method for variable-length abstractive summarization. This is an improvement over previous models, in that the proposed approach can simultaneously achieve fluent and variable-length abstractive summarization. The proposed abstractive summarization model consists of a text segmentation module and a two-stage Transformer-based summarization module. First, the text segmentation module utilizes a pre-trained Bidirectional Encoder Representations from Transformers (BERT) and a bidirectional long short-term memory (LSTM) to divide the input text into segments. An extractive model based on the BERT-based summarization model (BERTSUM) is then constructed to extract the most important sentence from each segment. For training the two-stage summarization model, first, the extracted sentences are used to train the document summarization module in the second stage. Next, the segments are used to train the segment summarization module in the first stage by simultaneously considering the outputs of the segment summarization module and the pre-trained second-stage document summarization module. The parameters of the segment summarization module are updated by considering the loss scores of the document summarization module as well as the segment summarization module. Finally, collaborative training is applied to alternately train the segment summarization module and the document summarization module until convergence. For testing, the outputs of the segment summarization module are concatenated to provide the variable-length abstractive summarization result. For evaluation, the BERT-biLSTM-based text segmentation model is evaluated using ChWiki_181k database and obtains a good effect in capturing the relationship between sentences. Finally, the proposed variable-length abstractive summarization system achieved a maximum of 70.0% accuracy in human subjective evaluation on the LCSTS dataset.

Original languageEnglish
JournalIEEE/ACM Transactions on Audio Speech and Language Processing
DOIs
Publication statusAccepted/In press - 2020

All Science Journal Classification (ASJC) codes

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering

Fingerprint Dive into the research topics of 'A Two-Stage Transformer-Based Approach for Variable-Length Abstractive Summarization'. Together they form a unique fingerprint.

  • Cite this