An Introduction to Quantization Research for Low-Bit Convolutional Neural Networks
Presenter: Zhu Feng
Contents
1. Research Background
2. Existing Quantization Research
3. Accelerating Quantized Training
Research Background
Background
• Image: classification, localization, segmentation
• Audio: speech recognition, language understanding
• Video: video understanding, …
Background: how do we get better performance? With increasingly complicated models.
Background
Challenges of deploying:
• Limited computing resources
• Short response time
• Millions of parameters
• Complicated model architectures

Model       Architecture                  Parameters      Top-1 Err   Top-5 Err
AlexNet     8 layers (5 conv + 3 fc)      ~60 million     40.7%       15.3%
VGG         19 layers (16 conv + 3 fc)    ~144 million    24.4%       7.1%
GoogLeNet   22 layers                     ~6.8 million    -           7.9%
ResNet      52 layers (50 conv + 2 fc)    ~200 million    21.29%      5.71%
Existing Quantization Research
Quantization
Quantization is the process of constraining an input from a continuous set to a discrete set. For neural networks, this means mapping model parameters from 32-bit floats to lower-bit integers.
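To make this concrete, here is a minimal NumPy sketch of uniform quantization (illustrative only, not from the slides): a float array is mapped onto 2^k evenly spaced levels and back, so the reconstruction error is at most half a quantization step.

```python
import numpy as np

def uniform_quantize(x, k=8):
    """Quantize a float array to k-bit integer codes and reconstruct floats."""
    levels = 2 ** k - 1                    # number of steps between min and max
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / levels       # width of one quantization step
    codes = np.round((x - x_min) / scale).astype(np.int32)  # 32-bit float -> k-bit int
    return codes, codes * scale + x_min    # integer codes, dequantized floats

w = np.random.randn(1000).astype(np.float32)
codes, w_hat = uniform_quantize(w, k=4)
print("max abs error:", np.abs(w - w_hat).max())  # <= scale / 2
```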
Taxonomy of quantization methods:
• Non-uniform quantization: HWGQ (CVPR 2017), LQ-Net (ECCV 2018)
• Uniform quantization: DoReFa-Net (arXiv 2016), PACT (arXiv 2018)
• Binarization: BNN (NIPS 2016), XNOR-Net (ECCV 2016), Bi-Real (ECCV 2018)
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Zhaowei Cai (UCSD), Xiaodong He (MSR), Jian Sun (Megvii Inc), Nuno Vasconcelos (UCSD)
Half-Wave Gaussian Quantization (HWGQ-Net): quantization of the activation layer and the batch normalization layer. Quantizing activations is more difficult than quantizing weights because of the activation function.
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Binary networks: binary activation quantization. The derivative of the cost C with respect to the weight w is almost zero everywhere, which blocks gradient-based learning.
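Reconstructed in symbols (the slide's own equations did not survive extraction; this is the standard binary-network formulation):

```latex
q = \operatorname{sign}(x) = \begin{cases} +1, & x \ge 0 \\ -1, & x < 0 \end{cases}
\qquad\Longrightarrow\qquad
\frac{\partial q}{\partial x} = 0 \ \text{a.e.}, \quad
\frac{\partial C}{\partial w} \approx 0 .
```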
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
The backward pass uses a hard tanh in place of the sign function. Two problems remain: 1) vanishing gradients; 2) gradient mismatch (the backward function differs from the true forward derivative).
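A minimal PyTorch-style sketch of this straight-through estimator with a hard-tanh backward (assuming the usual BNN formulation; not code from the paper):

```python
import torch

class SignSTE(torch.autograd.Function):
    """sign() forward; hard-tanh derivative (1 on |x| <= 1, else 0) backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # Pass the gradient through unchanged where |x| <= 1, block it elsewhere.
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(4, requires_grad=True)
SignSTE.apply(x).sum().backward()
print(x.grad)   # 1 where |x| <= 1, 0 outside
```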
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Forward approximation: the ReLU (a half-wave rectifier).
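In formulas (my reconstruction of the paper's setup): because a ReLU applied to a roughly Gaussian pre-activation yields a half-wave Gaussian distribution, HWGQ picks quantization levels q_i and thresholds t_i that minimize the mean squared quantization error under that distribution, which Lloyd's algorithm can solve:

```latex
Q(x) = q_i \ \text{ for } x \in (t_i, t_{i+1}], \qquad
(q^{*}, t^{*}) = \arg\min_{q,\,t}\; \mathbb{E}_{x \sim p_{\text{HWG}}}\!\left[(Q(x) - x)^2\right].
```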
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Half-Wave Gaussian Quantization (HWGQ-Net). Backward approximation: vanilla ReLU vs. clipped ReLU.
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Log-tailed ReLU
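A sketch of the three backward approximations the paper compares, written as gradient masks (the log-tailed form follows my reading of the paper, with tau = q_m - 1, where q_m is the largest quantization level; treat the constants as a paraphrase rather than verbatim):

```python
import numpy as np

def grad_vanilla_relu(x):
    # 1 for every positive input: no clipping, so large activations keep
    # their full gradient, which mismatches the saturating quantizer.
    return (x > 0).astype(np.float32)

def grad_clipped_relu(x, q_m):
    # Gradient only inside the quantizer's range (0, q_m]: removes the
    # mismatch on the tail but discards tail gradients entirely.
    return ((x > 0) & (x <= q_m)).astype(np.float32)

def grad_log_tailed_relu(x, q_m):
    # Compromise: full gradient inside the range, a decaying 1/(x - tau)
    # gradient on the tail, so large activations still learn slowly.
    tau = q_m - 1.0
    g = ((x > 0) & (x <= q_m)).astype(np.float32)
    tail = x > q_m
    g[tail] = 1.0 / (x[tail] - tau)
    return g
```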
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments. Dataset: ImageNet. FW = full-precision weights; BW = binary weights.
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Jungwook Choi, Pierce I-Jen Chuang, Zhuo Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan (IBM Research AI)
The challenge in activation quantization
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Parameterized Clipping Activation (PACT) replaces ReLU; the straight-through estimator (STE) supplies the derivative through the quantization step.
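A minimal sketch of PACT as I understand the published formulation: activations are clipped to a learnable bound alpha, uniformly quantized to k bits, and alpha receives gradient (via the STE) exactly from the inputs that were clipped.

```python
import torch

class PACT(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, alpha, k=2):
        ctx.save_for_backward(x, alpha)
        y = torch.clamp(x, min=0.0, max=alpha.item())        # clip to [0, alpha]
        scale = (2 ** k - 1) / alpha
        return torch.round(y * scale) / scale                # k-bit uniform levels

    @staticmethod
    def backward(ctx, grad_out):
        x, alpha = ctx.saved_tensors
        inside = (x >= 0) & (x <= alpha)                     # STE inside the clip range
        grad_x = grad_out * inside.float()
        grad_alpha = (grad_out * (x > alpha).float()).sum()  # clipped inputs move alpha
        return grad_x, grad_alpha, None

alpha = torch.tensor(6.0, requires_grad=True)
x = 4 * torch.randn(8, requires_grad=True)
y = PACT.apply(x, alpha, 2)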
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Parameterized Clipping Activation to replace ReLU
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Balancing Clipping and Quantization Error
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Statistics-Aware Weight Binning
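The paper derives a closed-form quantization scale for weight binning from the first and second moments of the weight distribution; I do not reproduce its fitted coefficients here, so the sketch below substitutes a direct search for the MSE-optimal scale, which is the quantity that closed form approximates.

```python
import numpy as np

def bin_weights(w, alpha, k=2):
    """Quantize w onto 2^k symmetric levels spanning [-alpha, alpha]."""
    step = 2 * alpha / (2 ** k - 1)
    q = np.clip(w, -alpha, alpha)
    return np.round((q + alpha) / step) * step - alpha

def best_scale(w, k=2, grid=200):
    """Stand-in for the paper's closed form: pick the alpha that minimizes
    quantization MSE by direct search over multiples of sqrt(E[w^2])."""
    candidates = np.linspace(0.1, 3.0, grid) * np.sqrt((w ** 2).mean())
    errors = [((w - bin_weights(w, a, k)) ** 2).mean() for a in candidates]
    return candidates[int(np.argmin(errors))]

w = np.random.randn(10000)
print("alpha* =", best_scale(w))
```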
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Experiment
Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1 (NIPS 2016)
Matthieu Courbariaux, Yoshua Bengio (Université de Montréal)
Deterministic and stochastic binarization
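The two binarization schemes from the paper, in a short NumPy sketch:

```python
import numpy as np

def binarize_deterministic(x):
    # sign(x): +1 if x >= 0, else -1
    return np.where(x >= 0, 1.0, -1.0)

def binarize_stochastic(x, rng=np.random.default_rng()):
    # +1 with probability p = hard_sigmoid(x) = clip((x + 1) / 2, 0, 1), else -1
    p = np.clip((x + 1.0) / 2.0, 0.0, 1.0)
    return np.where(rng.random(x.shape) < p, 1.0, -1.0)
```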
Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1 (NIPS 2016)
Matthieu Courbariaux, Yoshua Bengio (Université de Montréal)
Deterministic binarization; STE with hard tanh backward; XNOR replaces multiplication.
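Why XNOR replaces multiplication: with values constrained to ±1 and encoded as bits (1 for +1, 0 for -1), the product of two entries is their XNOR, so a dot product becomes XNOR plus popcount. A plain-integer sketch:

```python
def binary_dot(a_bits, b_bits, n):
    """Dot product of two n-element {-1,+1} vectors packed as integer
    bitmasks (bit 1 encodes +1, bit 0 encodes -1)."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask       # bit set where the two entries agree
    matches = bin(xnor).count("1")         # popcount
    return 2 * matches - n                 # (#agreements) - (#disagreements)

# LSB-first, 0b1101 encodes [+1, -1, +1, +1] and 0b1011 encodes [+1, +1, -1, +1]
print(binary_dot(0b1101, 0b1011, 4))       # -> 0
```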
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks (ECCV 2016)
Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi (University of Washington)
Binary-Weight-Networks: estimating binary weights
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks (ECCV 2016)
XNOR-Networks: the binary dot product and its optimal solution
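The optimal solution the slide refers to, reconstructed from the XNOR-Net paper: approximating a weight filter W by alpha·B with B in {-1,+1}^n and a scalar alpha > 0, minimizing ||W - alpha·B||^2 gives

```latex
B^{*} = \operatorname{sign}(W), \qquad
\alpha^{*} = \frac{1}{n}\,\lVert W \rVert_{\ell_1}.
```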
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks (ECCV 2016)
Experiment
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks (ECCV 2016)
Experiments: AlexNet (dataset: CIFAR-10); ResNet-18 and GoogLeNet (dataset: ImageNet)
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, Yuheng Zou (Megvii Inc)
Quantization of activations and weights
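The DoReFa-Net activation and weight quantizers, sketched from the paper's formulas (quantize_k maps [0, 1] onto 2^k uniform levels):

```python
import numpy as np

def quantize_k(x, k):
    """Map x in [0, 1] onto 2^k evenly spaced levels in [0, 1]."""
    n = 2 ** k - 1
    return np.round(x * n) / n

def quantize_activation(a, k):
    # Activations are first clipped to [0, 1], then quantized.
    return quantize_k(np.clip(a, 0.0, 1.0), k)

def quantize_weight(w, k):
    # tanh + rescaling squashes weights into [0, 1] before quantizing,
    # and the result is mapped back to [-1, 1].
    t = np.tanh(w)
    t = t / (2 * np.abs(t).max()) + 0.5
    return 2 * quantize_k(t, k) - 1
```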
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
The first and the last layers are not quantized. Experiments.
Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
Zechun Liu (HKUST), Baoyuan Wu (Tencent AI), Wenhan Luo (Tencent AI), Xin Yang (HUST), Wei Liu (Tencent AI), and Kwang-Ting Cheng (HKUST)
Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
1) Representational capability
Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
2) Gradient mismatch; weights are initialized from a real-valued network pre-trained with a clip nonlinearity in place of ReLU.
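To reduce the gradient mismatch, Bi-Real Net also replaces the hard-tanh backward with the derivative of a piecewise-quadratic approximation to sign (my transcription of the paper's ApproxSign; treat the exact constants as approximate):

```python
import numpy as np

def approx_sign_grad(x):
    """Derivative of Bi-Real Net's piecewise-quadratic approximation to sign()."""
    g = np.zeros_like(x)
    left = (x >= -1) & (x < 0)
    right = (x >= 0) & (x < 1)
    g[left] = 2 + 2 * x[left]     # rises from 0 at x = -1 to 2 at x = 0
    g[right] = 2 - 2 * x[right]   # falls back to 0 at x = 1
    return g
```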
Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm (ECCV 2018). Experiments.
Bi-Real Net (ECCV 2018): Experiments
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Dongqing Zhang, Jiaolong Yang, Dongqiangzi Ye, and Gang Hua (Microsoft Research)
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Learnable Quantization Function
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Training process: two parameters to optimize, handled in two alternating steps: 1) fix v and learn B; 2) fix B and update v.
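A compact sketch of this alternation, assuming the LQ-Nets form in which each quantization level is an inner product of a learnable basis v with a ±1 encoding (variable names are mine; the paper's QEM procedure is more careful, e.g. in how it batches and weights these updates):

```python
import numpy as np
from itertools import product

def lq_quantize(x, v, steps=2):
    """Alternately (1) fix the basis v and pick the nearest code B for each x,
    then (2) fix B and refit v by least squares."""
    K = len(v)
    codes = np.array(list(product([-1.0, 1.0], repeat=K)))    # all 2^K encodings
    for _ in range(steps):
        # Step 1: fix v, learn B -- nearest quantization level per element.
        levels = codes @ v                                    # the 2^K levels
        idx = np.argmin(np.abs(x[:, None] - levels[None, :]), axis=1)
        B = codes[idx]                                        # (N, K) encodings
        # Step 2: fix B, update v -- least-squares fit of levels to the data.
        v, *_ = np.linalg.lstsq(B, x, rcond=None)
    return B @ v, v

x = np.random.randn(1024)
xq, v = lq_quantize(x, v=np.array([0.5, 0.25]))
```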
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Experiment
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Experiment
Thanks! Thank you all for listening.