Introduction to Quantization Research on Low-Bit Convolutional Neural Networks. Presenter: 朱锋 (Zhu Feng).


1 Introduction to Quantization Research on Low-Bit Convolutional Neural Networks. Presenter: 朱锋 (Zhu Feng)

2 CONTENTS
1. Research background
2. Existing quantization research
3. Acceleration of quantized training

3 Research Background

4 Background
• Image: classification, localization, segmentation
• Audio: speech recognition, language understanding
• Video: video understanding

5 Background
How to get better performance? Complicated models.

6 Background
Challenges of deploying:
• Limited computing resources
• Short response time
• Millions of parameters
• Complicated model architecture

Model      Architecture                 Parameters     Top-1 Err.  Top-5 Err.
AlexNet    8 layers (5 conv + 3 fc)     ~60 million    40.7%       15.3%
VGG        19 layers (16 conv + 3 fc)   ~144 million   24.4%       7.1%
GoogLeNet  22 layers                    ~6.8 million   -           7.9%
ResNet     52 layers (50 conv + 2 fc)   ~200 million   21.29%      5.71%

7 Existing Quantization Research

8 Quantization
Quantization is the process of constraining an input from a continuous set to a discrete set.
Quantization of neural networks: model parameters go from 32-bit float to lower-bit integers, as in the sketch below.
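A minimal sketch of this idea, assuming a known value range and a plain uniform quantizer; the function and variable names are illustrative and not taken from any of the papers below:

```python
# Uniform quantization sketch: map 32-bit float values in a known range onto
# 2^k integer levels and back (de-quantize).
import numpy as np

def uniform_quantize(x, k, x_min, x_max):
    """Quantize float array x to k-bit codes over [x_min, x_max], then de-quantize."""
    levels = 2 ** k - 1
    scale = (x_max - x_min) / levels
    q = np.clip(np.round((x - x_min) / scale), 0, levels)    # integer code in [0, levels]
    return q.astype(np.int32), q * scale + x_min              # (int code, reconstructed float)

x = np.array([-1.2, -0.3, 0.0, 0.4, 0.9], dtype=np.float32)
codes, x_hat = uniform_quantize(x, k=2, x_min=-1.0, x_max=1.0)
print(codes)   # [0 1 2 2 3]
print(x_hat)   # reconstructed values on the 4-level grid
```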

9 Quantization
• Non-uniform quantization: HWGQ (CVPR 2017), LQ-Net (ECCV 2018)
• Uniform quantization: DoReFa-Net (arXiv 2016), PACT (arXiv 2018)
• Binarization: BNN (NIPS 2016), XNOR-Net (ECCV 2016), Bi-Real Net (ECCV 2018)

10 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Zhaowei Cai (UCSD), Xiaodong He (MSR), Jian Sun (Megvii Inc.), Nuno Vasconcelos (UCSD)
Half-Wave Gaussian Quantization (HWGQ-Net): quantization of the activation layer and the Batch Normalization layer.
Quantizing activations is more difficult than quantizing weights because of the activation function.

11 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Binary networks: binary activation quantization. The derivative of the cost C with respect to w passes through the binarization, and the problem is that this derivative is almost zero everywhere.

12 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
The backward pass uses hard tanh (see the sketch below). Two problems remain: 1. gradient vanishing; 2. gradient mismatch.
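A minimal PyTorch sketch of this scheme, assuming the common sign/hard-tanh straight-through estimator; the class name is illustrative and not the authors' code:

```python
import torch

class BinaryActSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)                      # forward: binarize the activation

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # hard-tanh derivative: 1 inside [-1, 1], 0 outside, so the gradient is
        # passed through unchanged in the linear region and cancelled elsewhere
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)

x = torch.randn(5, requires_grad=True)
BinaryActSTE.apply(x).sum().backward()
print(x.grad)    # 1 where |x| <= 1, else 0
```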

13 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Forward approximation: ReLU is a half-wave rectifier, so the forward quantizer only needs to cover the non-negative half of the activation distribution.
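A rough sketch of this forward step, under the paper's working assumption that batch-normalized pre-activations are roughly N(0, 1). It fits non-uniform levels to the positive half of a Gaussian with a simple 1-D Lloyd/k-means on samples (a stand-in for the paper's optimal half-wave Gaussian quantizer) and maps non-positive inputs to zero; the level count and all names are illustrative:

```python
import numpy as np

def fit_hwgq_levels(num_levels, num_samples=100_000, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    s = np.abs(rng.standard_normal(num_samples))           # half-wave Gaussian samples
    centers = np.quantile(s, np.linspace(0.1, 0.9, num_levels))
    for _ in range(iters):                                  # Lloyd iterations
        assign = np.abs(s[:, None] - centers[None, :]).argmin(axis=1)
        centers = np.array([s[assign == j].mean() for j in range(num_levels)])
    return np.sort(centers)

def hwgq_forward(x, levels):
    pos = np.maximum(x, 0.0)
    idx = np.abs(pos[:, None] - levels[None, :]).argmin(axis=1)
    return np.where(x > 0, levels[idx], 0.0)                # half-wave: non-positive -> 0

levels = fit_hwgq_levels(num_levels=3)                      # three positive levels plus 0
x = np.array([-0.8, 0.1, 0.7, 2.5])
print(hwgq_forward(x, levels))
```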

14 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Half-Wave Gaussian Quantization (HWGQ-Net), backward approximations: vanilla ReLU and clipped ReLU (compared in the sketch below).
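A small sketch contrasting the two backward approximations named on the slide, assuming a forward quantizer whose largest output level is q_max: the vanilla ReLU passes the gradient for all x > 0, while the clipped ReLU also cancels it beyond q_max:

```python
import numpy as np

def relu_backward(x, grad_out):
    return grad_out * (x > 0)

def clipped_relu_backward(x, grad_out, q_max):
    return grad_out * ((x > 0) & (x <= q_max))

x = np.array([-0.5, 0.3, 1.5, 4.0])
g = np.ones_like(x)
print(relu_backward(x, g))                  # [0. 1. 1. 1.]
print(clipped_relu_backward(x, g, 2.0))     # [0. 1. 1. 0.]
```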

15 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Log-tailed ReLU

16 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments. Dataset: ImageNet. FW: full-precision weights, BW: binary weights.

17 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments

18 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments

19 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Jungwook Choi, Pierce I-Jen Chuang, Zhuo Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan (IBM Research AI)
Challenge in activation quantization.

20 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Challenge in activation quantization.

21 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Parameterized Clipping Activation (PACT) replaces ReLU; the clipped activation is then quantized, and the straight-through estimator (STE) gives the derivative. See the sketch below.
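A minimal PyTorch sketch of this PACT-style scheme: clip with a learnable upper bound alpha, quantize the clipped value uniformly to k bits, and use the STE for both gradients. Variable names and the exact backward rules are illustrative, not the authors' code:

```python
import torch

class PACTQuant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, alpha, k):
        ctx.save_for_backward(x, alpha)
        n = 2 ** k - 1
        y = torch.clamp(x, min=0.0)                       # lower clip at 0
        y = torch.where(y < alpha, y, alpha)              # upper clip at the learnable alpha
        return torch.round(y * n / alpha) * alpha / n     # uniform k-bit quantization of [0, alpha]

    @staticmethod
    def backward(ctx, grad_out):
        x, alpha = ctx.saved_tensors
        # STE: pass the gradient to x only inside the clipping range
        grad_x = grad_out * ((x > 0) & (x < alpha)).to(grad_out.dtype)
        # values clipped at alpha are the ones that drive alpha's gradient
        grad_alpha = (grad_out * (x >= alpha).to(grad_out.dtype)).sum()
        return grad_x, grad_alpha, None

x = torch.randn(8, requires_grad=True)
alpha = torch.tensor(1.0, requires_grad=True)
PACTQuant.apply(x, alpha, 2).sum().backward()
print(x.grad, alpha.grad)
```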

22 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Parameterized Clipping Activation to replace ReLU

23 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Balancing Clipping and Quantization Error

24 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Statistics-Aware Weight Binning

25 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Experiment

26 Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1 (NIPS 2016)
Matthieu Courbariaux, Yoshua Bengio (Université de Montréal)
Deterministic vs. stochastic binarization (see the sketch below).
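A small sketch of the two binarization rules, assuming the hard sigmoid sigma(x) = clip((x + 1) / 2, 0, 1) as the probability of +1 in the stochastic case:

```python
import torch

def binarize_deterministic(x):
    return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

def binarize_stochastic(x):
    p = torch.clamp((x + 1) / 2, 0, 1)                    # hard sigmoid: P(x_b = +1)
    return torch.where(torch.rand_like(x) < p, torch.ones_like(x), -torch.ones_like(x))

x = torch.tensor([-1.5, -0.2, 0.0, 0.3, 2.0])
print(binarize_deterministic(x))
print(binarize_stochastic(x))    # random near 0, effectively deterministic at the extremes
```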

27 Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1 (NIPS 2016)
Deterministic binarization in the forward pass; STE with hard tanh in the backward pass; XNOR operations replace multiplications (see the sketch below).
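A sketch of why XNOR can replace multiplication at inference time: with ±1 values packed as bits (1 for +1, 0 for -1), a dot product becomes XNOR plus popcount, dot = 2 * popcount(xnor(a, b)) - n. Plain Python integers stand in for bit-packed words here:

```python
def binary_dot(a_bits, b_bits, n):
    """a_bits/b_bits: n-bit integers encoding +1 as bit 1 and -1 as bit 0."""
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)     # 1 where the signs agree
    return 2 * bin(xnor).count("1") - n            # popcount, rescaled to +-1 arithmetic

# +1 -1 +1 +1  vs  +1 +1 -1 +1  -> signs agree on 2 of 4 positions -> dot product = 0
a = 0b1011
b = 0b1101
print(binary_dot(a, b, 4))    # 0
```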

28 XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks(ECCV 2016)
Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi (University of Washington)
Binary-Weight-Networks: estimating the binary weights (see the sketch below).
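A sketch of the weight estimate this refers to: approximate a real filter W by alpha * B with B = sign(W) and alpha = mean(|W|), which minimizes ||W - alpha * B||^2 over alpha and binary B. A toy numpy version, with illustrative names:

```python
import numpy as np

def binarize_weights(W):
    B = np.where(W >= 0, 1.0, -1.0)     # binary filter
    alpha = np.abs(W).mean()            # optimal scaling factor
    return alpha, B

W = np.array([0.7, -0.2, 0.5, -0.9])
alpha, B = binarize_weights(W)
print(alpha, B)        # 0.575 [ 1. -1.  1. -1.]
print(alpha * B)       # binary approximation of W
```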

29 XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks(ECCV 2016)
XNOR-Networks: binary dot product and its optimal solution.

30 XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks(ECCV 2016)
Experiment

31 XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks(ECCV 2016)
Experiments. AlexNet, dataset: CIFAR-10. ResNet-18 and GoogLeNet, dataset: ImageNet.

32 DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, Yuheng Zou (Megvii Inc.)
Quantization of activations and weights (see the sketch below).
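A sketch of k-bit weight and activation quantizers in the DoReFa style, with the straight-through gradients omitted for brevity: quantize_k maps [0, 1] onto 2^k levels, weights are squashed with tanh and rescaled into [0, 1] before quantization and then mapped back to [-1, 1], and activations are assumed to already lie in [0, 1] (clipped by the preceding layer). All names are illustrative:

```python
import numpy as np

def quantize_k(r, k):
    n = 2 ** k - 1
    return np.round(r * n) / n                       # r in [0, 1] -> one of 2^k levels

def quantize_weights(w, k):
    t = np.tanh(w)
    r = t / (2 * np.abs(t).max()) + 0.5              # rescale into [0, 1]
    return 2 * quantize_k(r, k) - 1                  # back to [-1, 1]

def quantize_activations(a, k):
    return quantize_k(np.clip(a, 0.0, 1.0), k)

w = np.random.randn(6)
print(quantize_weights(w, k=2))
print(quantize_activations(np.array([0.1, 0.4, 0.8, 1.3]), k=2))
```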

33 DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
The first and the last layers are not quantized. Experiments.

34 Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
Zechun Liu (HKUST), Baoyuan Wu (Tencent AI), Wenhan Luo (Tencent AI), Xin Yang (HUST), Wei Liu (Tencent AI), Kwang-Ting Cheng (HKUST)

35 Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
1. Representational capability: a shortcut propagates the real-valued activations before binarization to the next block.

36 Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm(ECCV 2018)

37 Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
2. Gradient mismatch: the backward pass approximates the sign function with a piecewise polynomial (see the sketch below). Weight initialization uses clip.
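A PyTorch sketch of this backward relaxation: the forward pass still uses sign(), but the backward pass uses the derivative of a piecewise polynomial (2 - 2|x| inside [-1, 1], 0 outside) instead of the hard-tanh step, which reduces the gradient mismatch. The class name is illustrative:

```python
import torch

class ApproxSign(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        grad = torch.clamp(2 - 2 * x.abs(), min=0.0)   # triangle-shaped surrogate derivative
        return grad_out * grad

x = torch.tensor([-1.5, -0.5, 0.25, 2.0], requires_grad=True)
ApproxSign.apply(x).sum().backward()
print(x.grad)    # [0., 1., 1.5, 0.]
```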

38 Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm(ECCV 2018) Experiment

39 Bi-Real Net (ECCV 2018) Experiments

40 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Dongqing Zhang, Jiaolong Yang, Dongqiangzi Ye, and Gang Hua (Microsoft Research)

41 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Learnable Quantization Function

42 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Training process: two parameters to optimize, so training alternates two steps: 1. fix v and learn B; 2. fix B and update v (see the sketch below).
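A rough numpy sketch of this alternating scheme, assuming the LQ-Nets form in which the quantizer's levels are the inner products v . b over all binary codes b in {-1, +1}^K: step 1 fixes the basis v and picks the nearest level (and thus the code B) for each value, step 2 fixes B and updates v by least squares. Names and the toy data are illustrative:

```python
import itertools
import numpy as np

def lq_step(x, v):
    K = len(v)
    codes = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))   # all 2^K binary codes
    levels = codes @ v                                                  # quantization levels
    idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
    B = codes[idx]                                    # step 1: fix v, learn the encodings B
    v_new, *_ = np.linalg.lstsq(B, x, rcond=None)     # step 2: fix B, update the basis v
    return B, v_new

x = np.random.randn(1000)
v = np.array([0.5, 0.25])                             # K = 2, i.e. a 2-bit quantizer
for _ in range(10):
    B, v = lq_step(x, v)
print(v, np.mean((B @ v - x) ** 2))                   # learned basis and quantization MSE
```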

43 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Experiment

44 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Experiment

45 THANKS. Thank you all for listening.

