Artificial intelligence applications increasingly demand large computational and memory resources for training and inference, so intensive computation and large memory footprints often become the bottleneck of AI systems. This paper proposes applying singular value decomposition (SVD) based low-rank approximation (LRA) to CNN models. Exploiting the redundancy that exists across channels and filters, SVD is used to retain the most informative parameters of deep CNNs. By reducing the convolutional layer parameters in this way, a specialized convolutional layer structure is designed to accelerate the trained network, keeping accuracy degradation within 2% while greatly reducing data storage and the number of operations. The design process follows algorithm/architecture co-design, guided by analysis of operation counts and data storage.
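The compression idea can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the layer shape, the rank `r`, and the random weights are all illustrative assumptions. A convolutional weight tensor is flattened into a matrix, a truncated SVD keeps only the `r` largest singular values, and the parameter counts of the original layer and the two factored layers are compared:

```python
import numpy as np

# Hypothetical 3x3 conv layer: 64 output filters, 32 input channels,
# flattened to a (64, 32*3*3) weight matrix for decomposition.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32 * 3 * 3))

# Truncated SVD: keep only the r largest singular values (assumed rank).
r = 16
U, S, Vt = np.linalg.svd(W, full_matrices=False)
W_lr = U[:, :r] @ np.diag(S[:r]) @ Vt[:r, :]

# Parameter counts: original layer vs. the two low-rank factors.
orig_params = W.size                                  # 64 * 288 = 18432
lr_params = U[:, :r].size + r + Vt[:r, :].size        # 1024 + 16 + 4608 = 5648
print(orig_params, lr_params)

# Relative reconstruction error of the rank-r approximation.
rel_err = np.linalg.norm(W - W_lr) / np.linalg.norm(W)
print(f"relative reconstruction error: {rel_err:.3f}")
```

In a real network, the factored matrices would be realized as two consecutive lighter layers, and `r` would be chosen per layer to keep the end-to-end accuracy loss within the stated 2% budget.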