A Tucker decomposition based knowledge distillation for intelligent edge applications

Cheng Dai, Xingang Liu, Zhuolin Li, Mu Yen Chen

Research output: Article · peer-review

Abstract

Knowledge distillation (KD) has proven to be an effective method in intelligent edge computing and has been studied extensively in recent deep learning research. However, when the teacher network is much stronger than the student network, the effect of knowledge distillation degrades. To resolve this problem, an improved knowledge distillation method (TDKD) is proposed, which transfers the complex mapping functions learned by cumbersome models to relatively simpler models. First, Tucker-2 decomposition is performed on the convolutional layers of the original teacher model to reduce the capacity gap between the teacher and student networks. Then, the decomposed model is used as a new teacher in knowledge distillation for the student model. The experimental results show that TDKD effectively mitigates poor distillation performance: it not only improves results when standard KD is already effective, but can also reactivate KD to some extent in cases where it otherwise fails.
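The sketch below is not the authors' code; it is a minimal illustration, under stated assumptions, of the two ingredients the abstract names: a Tucker-2 factorisation of a convolutional kernel along its output/input channel modes, and a standard temperature-softened distillation loss that a student could be trained with against the decomposed teacher. The ranks `r_out`, `r_in`, the `temperature`, and `alpha` are illustrative values, not figures from the paper.

```python
import numpy as np
import torch
import torch.nn.functional as F


def tucker2_conv_kernel(W: np.ndarray, r_out: int, r_in: int):
    """Tucker-2 (HOSVD-style) factorisation of a conv kernel W with shape
    (T, S, kh, kw) along mode 0 (output channels) and mode 1 (input channels).

    Returns U (T x r_out), V (S x r_in) and the core G (r_out, r_in, kh, kw),
    so that W is approximated by G x_0 U x_1 V.
    """
    T, S, kh, kw = W.shape
    # Mode-0 unfolding: rows indexed by output channels.
    U, _, _ = np.linalg.svd(W.reshape(T, -1), full_matrices=False)
    U = U[:, :r_out]
    # Mode-1 unfolding: rows indexed by input channels.
    W1 = np.moveaxis(W, 1, 0).reshape(S, -1)
    V, _, _ = np.linalg.svd(W1, full_matrices=False)
    V = V[:, :r_in]
    # Core: project W onto the two factor subspaces.
    G = np.tensordot(U.T, W, axes=(1, 0))   # (r_out, S, kh, kw)
    G = np.tensordot(V.T, G, axes=(1, 1))   # (r_in, r_out, kh, kw)
    G = np.moveaxis(G, 0, 1)                # (r_out, r_in, kh, kw)
    return U, V, G


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 4.0, alpha: float = 0.7):
    """Hinton-style KD loss: soft targets from the (decomposed) teacher
    mixed with the ordinary cross-entropy on the hard labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```

In this reading, each convolutional layer of the teacher would be replaced by a 1x1 projection (V), a smaller kh x kw convolution built from the core G, and a 1x1 expansion (U); the resulting compressed network then plays the role of the new teacher in the KD loss above.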

Original language: English
Article number: 107051
Journal: Applied Soft Computing
Volume: 101
DOIs
Publication status: Published - March 2021

All Science Journal Classification (ASJC) codes

  • Software
