Pan, Jianhong and Yang, Siyuan and Foo, Lin Geng and Ke, Qiuhong and Rahmani, Hossein and Fan, Zhipeng and Liu, Jun (2024) Progressive Channel-Shrinking Network. IEEE Transactions on Multimedia, 26. pp. 2016-2026. ISSN 1520-9210
2304.00280.pdf - Accepted Version
Available under License Creative Commons Attribution.
Abstract
Salience-based channel pruning continues to advance network compression. In practice, a salience mechanism serves as the metric of channel importance that guides pruning. As a result, salience-based channel pruning can dynamically adjust the channel width at run-time, which provides a flexible pruning scheme. However, two problems emerge: first, a gating function is often needed to truncate specific salience entries to zero, which destabilizes forward propagation; second, the dynamic architecture incurs extra indexing cost at inference, which bottlenecks inference speed. In this paper, we propose a Progressive Channel-Shrinking (PCS) method that compresses the selected salience entries at run-time instead of crudely approximating them to zero. We also propose a Running Shrinking Policy that provides a testing-static pruning scheme, which reduces the memory access cost of filter indexing. We evaluate our method on the ImageNet and CIFAR10 datasets with two prevalent networks, ResNet and VGG, and demonstrate that PCS outperforms all baselines and achieves a state-of-the-art compression-performance tradeoff. Moreover, we observe a significant and practical acceleration of inference. The code will be released upon acceptance.
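The contrast the abstract draws, between truncating salience entries to zero and shrinking them gradually, can be illustrated with a small sketch. This is not the authors' implementation: the linear decay schedule, the exponential running average, and every function name below (hard_gate, progressive_shrink, update_running, static_keep_indices) are assumptions made purely for illustration, since the abstract does not specify these details.

```python
def lowest_k(values, k):
    """Return the set of indices of the k smallest entries."""
    return set(sorted(range(len(values)), key=lambda i: values[i])[:k])


def hard_gate(salience, k):
    """Truncate the k least salient entries to zero in one step."""
    drop = lowest_k(salience, k)
    return [0.0 if i in drop else s for i, s in enumerate(salience)]


def progressive_shrink(salience, k, step, total_steps):
    """Scale the k least salient entries by a factor that decays from 1 to 0.

    The linear schedule is an assumption; it only illustrates shrinking
    the selected entries gradually rather than zeroing them at once.
    """
    factor = max(0.0, 1.0 - step / total_steps)
    drop = lowest_k(salience, k)
    return [s * factor if i in drop else s for i, s in enumerate(salience)]


def update_running(running, salience, momentum=0.9):
    """Exponential running average of per-channel salience (assumed form)."""
    return [momentum * r + (1.0 - momentum) * s
            for r, s in zip(running, salience)]


def static_keep_indices(running, k):
    """Channels kept at test time, fixed once from the running statistics."""
    drop = lowest_k(running, k)
    return [i for i in range(len(running)) if i not in drop]


salience = [0.9, 0.1, 0.5, 0.3]
gated = hard_gate(salience, k=2)
halfway = progressive_shrink(salience, k=2, step=2, total_steps=4)
finished = progressive_shrink(salience, k=2, step=4, total_steps=4)
kept = static_keep_indices(salience, k=2)
```

In this sketch the shrunk entries reach the same values as the hard gate only at the final step, so the forward pass changes gradually. Because the kept channel indices are fixed from running statistics, the filters could be sliced once before inference instead of being indexed per input.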