Mixed-Clipping Quantization for Convolutional Neural Networks
Graphical Abstract
Abstract
Quantization is the main method for compressing convolutional neural networks and accelerating their inference. Most existing quantization methods quantize all layers to the same bit width. Mixed-precision quantization can achieve higher accuracy at the same compression ratio, but finding a good mixed-precision quantization strategy is difficult. To address this problem, we propose a mixed-clipping quantization method based on reinforcement learning. It uses reinforcement learning to search for a mixed-precision quantization strategy, and then applies a mixed-clipping method that clips the weight data according to the searched strategy before quantization. This further improves the accuracy of the quantized network. We extensively test the method on a diverse set of models, including ResNet18/50 and MobileNet-V2 on ImageNet, as well as YOLOV3 on the Microsoft COCO dataset. The experimental results show that our method achieves 2.7% and 0.3% higher Top-1 accuracy on MobileNet-V2 (4 bit) compared to the HAQ and ZeroQ methods, respectively, and 2.6% higher mAP on YOLOV3 (6 bit) compared to the per-layer quantization method.
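For illustration, the minimal sketch below shows what clipping a layer's weights and then quantizing them uniformly might look like once the searched policy has assigned that layer a bit width. The percentile-based threshold and the `clip_ratio` parameter are assumptions made for this example only; they are not the paper's exact mixed-clipping rule.

```python
import numpy as np

def clip_and_quantize(weights, n_bits, clip_ratio=0.95):
    """Clip weights to a per-layer threshold, then quantize uniformly to n_bits.

    Illustrative sketch: the threshold is a percentile of |w| (clip_ratio is a
    hypothetical parameter), not the paper's actual mixed-clipping criterion.
    """
    # Use a high percentile of |w| rather than max|w| as the clipping range,
    # so a few outlier weights do not stretch the quantization grid.
    threshold = np.percentile(np.abs(weights), clip_ratio * 100)
    clipped = np.clip(weights, -threshold, threshold)

    # Symmetric uniform quantization with 2^(n_bits-1) - 1 positive levels.
    n_levels = 2 ** (n_bits - 1) - 1
    scale = threshold / n_levels
    q = np.round(clipped / scale)
    return q * scale  # dequantized weights (simulated quantization)

# Example: a layer that the searched mixed-precision policy assigns 4 bits.
w = np.random.randn(256, 128).astype(np.float32)
w_q = clip_and_quantize(w, n_bits=4)
```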