Knowledge Distillation on Extractive Summarization

Ying Jia Lin, Daniel Tan, Tzu Hsuan Chou, Hung Yu Kao, Hsin Yang Wang

Research output: Conference contribution

Abstract

Large-scale pre-trained frameworks have shown state-of-the-art performance on several natural language processing tasks. However, their costly training and inference times pose great challenges when deploying such models in real-world applications. In this work, we conduct an empirical study of knowledge distillation on an extractive text summarization task. We first utilize a pre-trained model as the teacher model for extractive summarization and extract its learned knowledge as soft targets. Then, we leverage both the hard targets and the soft targets as the objective for training a much smaller student model to perform extractive summarization. Our results show that the student model scores only about 1 point lower on the three ROUGE metrics for extractive summarization on the CNN/DM dataset, while being 40% smaller than the teacher model and 50% faster at inference time.
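As a rough illustration of the combined objective described in the abstract, the following PyTorch sketch mixes a hard-target loss on the oracle extraction labels with a soft-target loss on the teacher's per-sentence scores. The function name, the sigmoid-based soft targets, the alpha weighting, and the temperature value are assumptions made here for illustration and are not taken from the paper.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, hard_labels,
                      alpha=0.5, temperature=2.0):
    # student_logits, teacher_logits: per-sentence extraction logits,
    # shape (batch, num_sentences); hard_labels: 0/1 oracle labels.

    # Hard-target term: binary cross-entropy between the student's
    # per-sentence logits and the oracle extraction labels.
    hard_loss = F.binary_cross_entropy_with_logits(
        student_logits, hard_labels.float())

    # Soft-target term: match the teacher's temperature-softened
    # per-sentence extraction probabilities.
    teacher_probs = torch.sigmoid(teacher_logits / temperature)
    student_probs = torch.sigmoid(student_logits / temperature)
    soft_loss = F.binary_cross_entropy(student_probs, teacher_probs)

    # Weighted combination of the hard-target and soft-target objectives.
    return alpha * hard_loss + (1.0 - alpha) * soft_loss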

Original language: English
Title of host publication: Proceedings - 2020 IEEE 3rd International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 71-76
Number of pages: 6
ISBN (Electronic): 9781728187082
DOIs
Publication status: Published - Dec 2020
Event: 3rd IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2020 - Irvine, United States
Duration: 9 Dec 2020 - 11 Dec 2020

Publication series

Name: Proceedings - 2020 IEEE 3rd International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2020

Conference

Conference: 3rd IEEE International Conference on Artificial Intelligence and Knowledge Engineering, AIKE 2020
Country/Territory: United States
City: Irvine
Period: 20-12-09 - 20-12-11

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Signal Processing
  • Information Systems and Management
