Introduction to Quantization Research on Low-Bit Convolutional Neural Networks. Presenter: Zhu Feng
CONTENTS
1 Research Background
2 Existing Quantization Research
3 Quantization Training Acceleration
Research Background
Background
Image: • Classification • Localization • Segmentation
Audio: • Speech recognition • Language understanding
Video: • Video understanding …
Background
How to get better performance? Complicated models.
Background
Challenges of deploying:
• Limited computing resources
• Short response time
• Millions of parameters
• Complicated model architecture

Model     | Architecture                | Parameters   | Top-1 Err | Top-5 Err
AlexNet   | 8 layers (5 conv + 3 fc)    | ~60 million  | 40.7%     | 15.3%
VGG       | 19 layers (16 conv + 3 fc)  | ~144 million | 24.4%     | 7.1%
GoogLeNet | 22 layers                   | ~6.8 million | -         | 7.9%
ResNet    | 52 layers (50 conv + 2 fc)  | ~200 million | 21.29%    | 5.71%
Existing Quantization Research
Quantization
Quantization is the process of constraining an input from a continuous set to a discrete set.
Quantization of a neural network: model parameters go from 32-bit float to lower-bit int.
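To make the definition concrete, here is a minimal sketch (not from the slides) of mapping a 32-bit float tensor onto 2^k evenly spaced levels; the function name and layout are illustrative only:

```python
import numpy as np

def uniform_quantize(x, k):
    """Map float values in [x.min(), x.max()] onto 2**k evenly spaced levels.
    Assumes x is not constant (otherwise scale would be zero)."""
    n_levels = 2 ** k
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / (n_levels - 1)          # step size between levels
    q = np.round((x - lo) / scale)              # integer codes in [0, 2**k - 1]
    return q.astype(np.int32), scale, lo        # dequantize: q * scale + lo

x = np.random.randn(4, 4).astype(np.float32)
codes, scale, zero = uniform_quantize(x, k=2)   # 2-bit: 4 levels
x_hat = codes * scale + zero                    # low-bit reconstruction of x
```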
Uniform quantization: DoReFa-Net (arXiv 2016), PACT (arXiv 2018)
Non-uniform quantization: HWGQ (CVPR 2017), LQ-Net (ECCV 2018)
Binarization: BNN (NIPS 2016), XNOR-Net (ECCV 2016), Bi-Real (ECCV 2018)
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Zhaowei Cai (UCSD), Xiaodong He (MSR), Jian Sun (Megvii Inc), Nuno Vasconcelos (UCSD)
Half-Wave Gaussian Quantization (HWGQ-Net): quantization of the activation layer and batch normalization layer.
Quantizing activations is more difficult than quantizing weights because of the activation function.
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Binary networks: binary activation quantization.
Problem: the derivative of the cost C with respect to w is almost zero everywhere, because the binarizer has zero slope away from its step, so the chain rule yields no gradient signal.
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
The backward pass uses hard tanh as a surrogate. Two problems remain: 1. gradient vanishing; 2. gradient mismatch between the forward quantizer and the backward surrogate.
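To make the surrogate concrete, a minimal PyTorch sketch (not from the slides) of a binary activation with a hard-tanh straight-through estimator:

```python
import torch

class BinaryActSTE(torch.autograd.Function):
    """sign() in the forward pass; the derivative of hard tanh
    (1 inside [-1, 1], 0 outside) in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        x, = ctx.saved_tensors
        # Straight-through estimator: pass the gradient only where |x| <= 1.
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(8, requires_grad=True)
y = BinaryActSTE.apply(x)
y.sum().backward()   # x.grad is 1 where |x| <= 1, else 0
```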
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Forward approximation: ReLU, the half-wave rectifier, is replaced by a half-wave Gaussian quantizer.
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Half-Wave Gaussian Quantization (HWGQ-Net): backward approximations.
Vanilla ReLU; clipped ReLU.
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017) Log-tailed ReLU
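A hedged sketch of the three backward approximations these slides compare, with q_m standing for the largest quantization level of the HWGQ quantizer (its value here is a hypothetical placeholder):

```python
import numpy as np

q_m = 1.5   # hypothetical largest quantization level

def d_vanilla_relu(x):
    return (x > 0).astype(float)                 # ignores the clipped tail: mismatch

def d_clipped_relu(x):
    return ((x > 0) & (x <= q_m)).astype(float)  # zero gradient past q_m

def d_log_tailed_relu(x):
    """Log-tailed ReLU: like clipped ReLU inside (0, q_m], but with a slowly
    decaying, non-zero gradient 1 / (x - q_m + 1) beyond q_m."""
    g = ((x > 0) & (x <= q_m)).astype(float)
    tail = x > q_m
    g[tail] = 1.0 / (x[tail] - q_m + 1.0)
    return g
```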
Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments. Dataset: ImageNet. FW: full-precision weights, BW: binary weights.
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Jungwook Choi, Pierce I-Jen Chuang, Zhuo Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan (IBM Research AI)
Challenge in activation quantization
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Parameterized Clipping Activation (PACT) to replace ReLU; quantization; STE to obtain the derivative.
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN) Parameterized Clipping Activation to replace ReLU
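The slide's equations are lost; as a hedged sketch, the PACT activation clips to a learnable level alpha, quantizes uniformly to k bits, and routes gradient to alpha only through clipped inputs:

```python
import torch

class PACT(torch.autograd.Function):
    """Clip to [0, alpha], uniformly quantize to k bits; STE backward,
    with a gradient for the learned clipping level alpha."""
    @staticmethod
    def forward(ctx, x, alpha, k):
        ctx.save_for_backward(x, alpha)
        y = torch.clamp(x, min=0.0, max=alpha.item())
        scale = (2 ** k - 1) / alpha
        return torch.round(y * scale) / scale

    @staticmethod
    def backward(ctx, grad_out):
        x, alpha = ctx.saved_tensors
        inside = (x >= 0) & (x <= alpha)
        grad_x = grad_out * inside.float()                   # STE for the quantizer
        grad_alpha = (grad_out * (x > alpha).float()).sum().view_as(alpha)
        return grad_x, grad_alpha, None

alpha = torch.tensor(6.0, requires_grad=True)                # learned clipping level
x = 3 * torch.randn(16, requires_grad=True)
y = PACT.apply(x, alpha, 2)                                  # 2-bit activation
```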
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN) Balancing Clipping and Quantization Error
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN) Statistics-Aware Weight Binning
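The figure for this slide is lost. As a sketch under stated assumptions: statistics-aware weight binning picks the weight scale as a fitted linear combination of sqrt(E[w^2]) and E[|w|]; the coefficients below are placeholders, not the paper's fitted values:

```python
import numpy as np

# Placeholder coefficients; the paper fits them per bit-width by regression
# over a family of weight distributions.
C1, C2 = 3.2, 2.1

def sawb_scale(w):
    """Statistics-aware scale from the first two moments of the weights."""
    return C1 * np.sqrt(np.mean(w ** 2)) - C2 * np.mean(np.abs(w))

def binned_weights(w, k=2):
    """Quantize to 2**k symmetric levels spanning [-alpha, alpha]."""
    alpha = sawb_scale(w)
    levels = np.linspace(-alpha, alpha, 2 ** k)           # k=2: {-a, -a/3, a/3, a}
    idx = np.abs(w[..., None] - levels).argmin(axis=-1)   # nearest level
    return levels[idx]
```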
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN) Experiment
Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1 (NIPS 2016)
Matthieu Courbariaux, Yoshua Bengio (Université de Montréal)
Deterministic vs. stochastic binarization
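A minimal sketch of the two binarization rules from the paper (function names are mine): deterministic takes the sign, stochastic samples +1 with probability given by the "hard sigmoid" of the input:

```python
import torch

def binarize_det(x):
    """Deterministic: x_b = sign(x)."""
    return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

def binarize_stoch(x):
    """Stochastic: x_b = +1 with p = hard_sigmoid(x) = clip((x+1)/2, 0, 1), else -1."""
    p = torch.clamp((x + 1) / 2, 0, 1)
    return torch.where(torch.rand_like(x) < p,
                       torch.ones_like(x), -torch.ones_like(x))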
Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1 (NIPS 2016)
Matthieu Courbariaux, Yoshua Bengio (Université de Montréal)
Deterministic binarization; STE via hard tanh; XNOR replaces multiplication.
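Why XNOR can replace multiplication: over {-1, +1}, a product is +1 exactly when the signs agree, so a dot product reduces to XNOR plus popcount. A small sketch with bit-packed vectors (encoding 1 -> +1, 0 -> -1; names are illustrative):

```python
def binary_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two n-element {-1, +1} vectors packed into Python ints."""
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)   # 1 where the signs agree
    matches = bin(xnor).count("1")               # popcount
    return 2 * matches - n                       # matches minus mismatches

# Example: a = [+1, -1, +1], b = [+1, +1, -1]  ->  1 - 1 - 1 = -1
assert binary_dot(0b101, 0b110, 3) == -1
```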
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks (ECCV 2016)
Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi (University of Washington)
Binary-Weight-Networks: estimating binary weights
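The paper's closed-form estimate for W ≈ alpha * B has B* = sign(W) and alpha* = ||W||_1 / n, i.e. the mean absolute value, taken per filter. A short sketch:

```python
import torch

def estimate_binary_weights(W):
    """XNOR-Net closed-form solution for W ≈ alpha * B:
    B* = sign(W), alpha* = mean(|W|), one scale per output filter."""
    B = torch.sign(W)
    alpha = W.abs().mean(dim=(1, 2, 3))
    return alpha, B

W = torch.randn(64, 3, 3, 3)                  # a conv filter bank
alpha, B = estimate_binary_weights(W)
W_approx = alpha.view(-1, 1, 1, 1) * B        # least-squares binary approximation
```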
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks (ECCV 2016)
XNOR-Networks: binary dot product and its optimal solution
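When both inputs and weights are binarized, the dot product gets two scaling factors, one per operand; the least-squares optimum is again the mean absolute value of each side. A hedged one-dimensional sketch:

```python
import torch

def xnor_dot(X, W):
    """Approximate dot(X, W) as beta * alpha * dot(sign(X), sign(W)),
    with beta* = mean(|X|) and alpha* = mean(|W|)."""
    beta = X.abs().mean()
    alpha = W.abs().mean()
    return beta * alpha * torch.dot(torch.sign(X), torch.sign(W))

X, W = torch.randn(256), torch.randn(256)
print(torch.dot(X, W).item(), xnor_dot(X, W).item())  # exact vs. binary approximation
```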
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks(ECCV 2016) Experiment
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks (ECCV 2016)
Experiment
AlexNet, dataset: CIFAR-10
ResNet-18 & GoogLeNet, dataset: ImageNet
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, Yuheng Zou (Megvii Inc)
Quantization of activations and weights
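A sketch of DoReFa-Net's core quantizer and its weight quantization (the detach trick implements the straight-through estimator; function names are mine):

```python
import torch

def quantize_k(r, k):
    """DoReFa k-bit quantizer for r in [0, 1]: round to (2^k - 1) steps,
    identity gradient in the backward pass."""
    n = float(2 ** k - 1)
    q = torch.round(r * n) / n
    return r + (q - r).detach()       # forward: q; backward: d/dr = 1 (STE)

def quantize_weights(w, k):
    """Squash weights into [0, 1] with tanh, quantize, map back to [-1, 1]."""
    t = torch.tanh(w)
    r = t / (2 * t.abs().max()) + 0.5
    return 2 * quantize_k(r, k) - 1
```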
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
The first and last layers are not quantized.
Experiment
Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
Zechun Liu (HKUST), Baoyuan Wu (Tencent AI), Wenhan Luo (Tencent AI), Xin Yang (HUST), Wei Liu (Tencent AI), and Kwang-Ting Cheng (HKUST)
1. Representational capability: an identity shortcut carries the real-valued activations past the binarized convolution to the next block.
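A minimal PyTorch sketch of one such block; to keep it short, only the activation is binarized here (the real network binarizes weights too, with an STE):

```python
import torch
import torch.nn as nn

class BiRealBlock(nn.Module):
    """Sketch of a Bi-Real-style block: the shortcut carries real-valued
    activations past the 1-bit conv, raising representational capability."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = self.bn(self.conv(torch.sign(x)))   # binary-activation conv path
        return out + x                            # identity shortcut, real values
```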
Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
2. Gradient mismatch: the backward pass replaces the sign derivative with that of a piecewise polynomial; weight initialization uses clip instead of sign.
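A sketch of that approximation: the derivative of Bi-Real's piecewise polynomial is 2 + 2x on [-1, 0), 2 - 2x on [0, 1), and 0 elsewhere, a closer fit to sign than the flat hard-tanh window:

```python
import torch

class ApproxSign(torch.autograd.Function):
    """sign() forward; derivative of the piecewise polynomial backward."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        x, = ctx.saved_tensors
        grad = torch.zeros_like(x)
        grad = torch.where((x >= -1) & (x < 0), 2 + 2 * x, grad)
        grad = torch.where((x >= 0) & (x < 1), 2 - 2 * x, grad)
        return grad_out * grad
```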
Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm(ECCV 2018) Experiment
Bi-Real Net (ECCV 2018) Experiment
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018) Dongqing Zhang, Jiaolong Yang, Dongqiangzi Ye, and Gang Hua (Microsoft Research)
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018) Learnable Quantization Function
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Training process: two sets of parameters to optimize, so two alternating steps:
1. Fix v, learn B
2. Fix B, update v
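A NumPy sketch of one alternating round for a weight quantizer (codes in {-1, +1}^K as in the paper's weight case; names are mine): step 1 assigns each value the nearest of the 2^K levels spanned by the basis v, step 2 updates v by closed-form least squares:

```python
import itertools
import numpy as np

def qem_step(x, v):
    """One quantization-error-minimization step. x: flat floats; v: K-dim basis."""
    K = len(v)
    # Step 1: fix v, learn B -- nearest of the 2^K levels v^T e, e in {-1, +1}^K.
    codes = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))  # (2^K, K)
    levels = codes @ v                                                # (2^K,)
    idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
    B = codes[idx]                                                    # (N, K)
    # Step 2: fix B, update v -- least squares for min_v ||B v - x||^2.
    v_new, *_ = np.linalg.lstsq(B, x, rcond=None)
    return v_new, B @ v_new            # updated basis and quantized x

x = np.random.randn(1000)
v = np.array([0.5, 0.25])              # K = 2 -> 2-bit quantization
for _ in range(5):
    v, x_q = qem_step(x, v)
```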
LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018) Experiment
THANKS. Thank you all for listening.