A tucker decomposition based knowledge distillation for intelligent edge applications

Cheng Dai, Xingang Liu, Zhuolin Li, Mu Yen Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Knowledge distillation (KD) has proven to be an effective method in intelligent edge computing and has been studied extensively in recent deep learning research. However, when the teacher network is much stronger than the student network, the effect of knowledge distillation is not ideal. To resolve this problem, an improved knowledge distillation method (TDKD) is proposed, which transfers the complex mapping functions learned by cumbersome models to relatively simpler models. First, Tucker-2 decomposition is performed on the convolutional layers of the original teacher model to reduce the capacity gap between the teacher network and the student network. Then, the decomposed model is used as a new teacher in knowledge distillation for the student model. The experimental results show that the TDKD method can effectively solve the problem of poor distillation performance: it not only achieves better results when the KD method is already effective, but can also reactivate an ineffective KD method to some extent.
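The abstract describes a two-step pipeline: factorise each teacher convolution with Tucker-2 along its channel modes, then run standard distillation with the factorised model as the teacher. The sketch below illustrates that idea in PyTorch; the ranks, truncated-HOSVD factorisation, and Hinton-style softened-logit loss with its temperature and weighting are illustrative assumptions, not the paper's exact settings or implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def tucker2_decompose_conv(conv: nn.Conv2d, rank_out: int, rank_in: int) -> nn.Sequential:
    """Approximate a Conv2d by a Tucker-2 factorisation along the output- and
    input-channel modes (truncated-HOSVD sketch): 1x1 conv -> core conv -> 1x1 conv."""
    W = conv.weight.data                                   # (C_out, C_in, kH, kW)
    C_out, C_in, kH, kW = W.shape

    # Leading singular vectors of the mode-0 (output-channel) unfolding
    U_out, _, _ = torch.linalg.svd(W.reshape(C_out, -1), full_matrices=False)
    U_out = U_out[:, :rank_out]                            # (C_out, R_out)
    # Leading singular vectors of the mode-1 (input-channel) unfolding
    U_in, _, _ = torch.linalg.svd(W.permute(1, 0, 2, 3).reshape(C_in, -1),
                                  full_matrices=False)
    U_in = U_in[:, :rank_in]                               # (C_in, R_in)

    # Core tensor: contract W with U_out and U_in on the two channel modes
    core = torch.einsum('oikl,or,ij->rjkl', W, U_out, U_in)  # (R_out, R_in, kH, kW)

    # 1x1 conv projecting C_in -> R_in
    first = nn.Conv2d(C_in, rank_in, 1, bias=False)
    first.weight.data = U_in.t().reshape(rank_in, C_in, 1, 1)
    # Core spatial conv R_in -> R_out, keeping the original stride/padding
    mid = nn.Conv2d(rank_in, rank_out, (kH, kW), stride=conv.stride,
                    padding=conv.padding, bias=False)
    mid.weight.data = core
    # 1x1 conv expanding R_out -> C_out, carrying over the original bias
    last = nn.Conv2d(rank_out, C_out, 1, bias=conv.bias is not None)
    last.weight.data = U_out.reshape(C_out, rank_out, 1, 1)
    if conv.bias is not None:
        last.bias.data = conv.bias.data.clone()

    return nn.Sequential(first, mid, last)

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Softened-logit distillation loss, with the decomposed model as teacher.
    Temperature T and weight alpha are assumed values for illustration."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction='batchmean') * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In use, each (or a chosen subset of) convolutional layer of the pretrained teacher would be replaced by `tucker2_decompose_conv(layer, rank_out, rank_in)`, and the resulting smaller teacher's logits would feed `kd_loss` during student training.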

Original language: English
Article number: 107051
Journal: Applied Soft Computing
Volume: 101
Publication status: Published - 2021 Mar

All Science Journal Classification (ASJC) codes

  • Software
