Huang Zhengzhe, Du Huimin, Chang Libo. Mixed-Clipping Quantization for Convolutional Neural Networks[J]. Journal of Computer-Aided Design & Computer Graphics, 2021, 33(4): 553-559. DOI: 10.3724/SP.J.1089.2021.18509

Mixed-Clipping Quantization for Convolutional Neural Networks

  • Quantization is the main method for compressing convolutional neural networks and accelerating their inference. Most existing quantization methods quantize all layers to the same bit width. Mixed-precision quantization can achieve higher accuracy at the same compression ratio, but finding a good mixed-precision quantization strategy is difficult. To solve this problem, a mixed-clipping quantization method based on reinforcement learning is proposed. It uses reinforcement learning to search for a mixed-precision quantization strategy, and applies a mixed-clipping method to clip the weight data according to the searched strategy before quantization, which further improves the accuracy of the quantized network. We extensively test this method on a diverse set of models, including ResNet18/50 and MobileNet-V2 on ImageNet, as well as YOLOv3 on the Microsoft COCO dataset. The experimental results show that our method achieves 2.7% and 0.3% higher Top-1 accuracy on MobileNet-V2 (4 bit) compared with the HAQ and ZeroQ methods, respectively, and 2.6% higher mAP on YOLOv3 (6 bit) compared with the per-layer quantization method.
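To make the clip-then-quantize idea concrete, here is a minimal sketch of clipping weights to a threshold and then quantizing them with a symmetric uniform quantizer at a chosen bit width. This is not the authors' implementation: the `clip_ratio` parameter, the max-magnitude threshold, and the symmetric uniform quantizer are illustrative assumptions; the paper searches the per-layer bit width and clipping choice with reinforcement learning.

```python
import numpy as np

def clip_and_quantize(w, clip_ratio=0.95, bits=4):
    """Clip weights to a fraction of their max magnitude, then apply
    symmetric uniform quantization at the given bit width.

    Note: clip_ratio and the max-magnitude threshold are illustrative
    assumptions, not the paper's exact clipping rule.
    """
    t = clip_ratio * np.max(np.abs(w))   # clipping threshold
    w_clipped = np.clip(w, -t, t)        # clip outliers before quantizing
    levels = 2 ** (bits - 1) - 1         # e.g. 7 integer levels for 4-bit signed
    scale = t / levels                   # step size of the uniform grid
    q = np.round(w_clipped / scale)      # integer codes in [-levels, levels]
    return q * scale                     # dequantized weights
```

Clipping before quantization shrinks the quantization step `scale`, so the many small weights are represented more finely at the cost of saturating a few outliers; the mixed-clipping search trades these two errors off per layer.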
