TY - GEN
T1 - Deep Learning Acceleration Design Based on Low Rank Approximation
AU - Chang, Yi Hsiang
AU - Lee, Gwo Giun Chris
AU - Chen, Shiu Yu
N1 - Publisher Copyright:
© 2022 Asia-Pacific Signal and Information Processing Association (APSIPA).
PY - 2022
Y1 - 2022
N2 - Artificial intelligence applications increasingly require large computational resources for training and inference, so intensive computation and large memory requirements often become the bottleneck of AI systems. This paper proposes applying singular value decomposition (SVD)-based low-rank approximation (LRA) to CNN models. By exploiting the redundancy that exists between different channels and filters, SVD matrix decomposition is used to estimate the most informative parameters in deep CNNs. With the convolutional-layer parameters reduced in this way, a special structure of the convolutional layers is designed to accelerate the trained neural network, keeping the accuracy degradation within 2% while greatly reducing data storage and the number of operations. The design process is based on algorithm/architecture co-design and on analysis of the number of operations and data storage.
AB - Artificial intelligence applications increasingly require large computational resources for training and inference, so intensive computation and large memory requirements often become the bottleneck of AI systems. This paper proposes applying singular value decomposition (SVD)-based low-rank approximation (LRA) to CNN models. By exploiting the redundancy that exists between different channels and filters, SVD matrix decomposition is used to estimate the most informative parameters in deep CNNs. With the convolutional-layer parameters reduced in this way, a special structure of the convolutional layers is designed to accelerate the trained neural network, keeping the accuracy degradation within 2% while greatly reducing data storage and the number of operations. The design process is based on algorithm/architecture co-design and on analysis of the number of operations and data storage.
UR - http://www.scopus.com/inward/record.url?scp=85146303502&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146303502&partnerID=8YFLogxK
U2 - 10.23919/APSIPAASC55919.2022.9980230
DO - 10.23919/APSIPAASC55919.2022.9980230
M3 - Conference contribution
AN - SCOPUS:85146303502
T3 - Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
SP - 1304
EP - 1307
BT - Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
Y2 - 7 November 2022 through 10 November 2022
ER -