Variational Channel Distribution Pruning and Mixed-Precision Quantization for Neural Network Model Compression

Wan Ting Chang, Chih Hung Kuo, Li Chun Fang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a model compression framework for both pruning and quantization guided by channel distribution information. We apply variational inference to train a Bayesian deep neural network in which the parameters are modeled by probability distributions. Based on the characteristics of these distributions, we prune redundant channels and determine the bit-width layer by layer. Experiments on the CIFAR-10 dataset with VGG16 show that the number of parameters can be reduced by a factor of 58.91. The proposed compression approach can facilitate hardware implementations for efficient edge and mobile computing.
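The abstract's pruning criterion can be illustrated with a small sketch. The paper does not publish its exact rules, so the following is a hypothetical toy example under common variational-pruning assumptions: each channel's weight scale has a learned Gaussian posterior N(mu, sigma²), channels with a low signal-to-noise ratio |mu|/sigma are pruned, and a simple per-layer bit-width is chosen from the surviving channels' dynamic range and resolution. All values and thresholds below are illustrative, not from the paper.

```python
import numpy as np

# Illustrative per-channel variational posteriors for one conv layer
# (mu = posterior mean, sigma = posterior std dev; values are made up).
mu = np.array([2.1, -0.05, 1.4, 0.02, -1.8, 0.6, 0.01, 2.9])
sigma = np.array([0.2, 0.5, 0.3, 0.4, 0.25, 0.9, 0.6, 0.15])

# Channels whose signal-to-noise ratio |mu| / sigma is low carry little
# information relative to their uncertainty and can be pruned.
snr = np.abs(mu) / sigma
keep = snr > 1.0  # illustrative pruning threshold

# A simple per-layer bit-width heuristic: enough bits to resolve the
# largest surviving mean at the granularity of the smallest surviving
# std dev, plus one sign bit.
dynamic_range = np.abs(mu[keep]).max()
resolution = sigma[keep].min()
bits = int(np.ceil(np.log2(dynamic_range / resolution))) + 1

print(f"kept {keep.sum()} of {len(mu)} channels, layer bit-width: {bits}")
```

Repeating this per layer yields both a channel-pruned architecture and a mixed-precision bit-width assignment, in the spirit of the framework the abstract describes.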

Original language: English
Title of host publication: 2022 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665409216
DOIs
Publication status: Published - 2022
Event: 2022 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2022 - Hsinchu, Taiwan
Duration: 2022 Apr 18 – 2022 Apr 21

Publication series

Name: 2022 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2022 - Proceedings

Conference

Conference: 2022 International Symposium on VLSI Design, Automation and Test, VLSI-DAT 2022
Country/Territory: Taiwan
City: Hsinchu
Period: 2022-04-18 – 2022-04-21

All Science Journal Classification (ASJC) codes

  • Hardware and Architecture
  • Electrical and Electronic Engineering
  • Safety, Risk, Reliability and Quality
