Introduction to Quantization Research on Low-Bit Convolutional Neural Networks. Presenter: 朱锋 (Zhu Feng).


1 Introduction to Quantization Research on Low-Bit Convolutional Neural Networks. Presenter: 朱锋 (Zhu Feng)

2 CONTENTS
1. Research background
2. Existing quantization research
3. Acceleration of quantized training

3 Research Background

4 Background
• Image: classification, localization, segmentation
• Audio: speech recognition, language understanding
• Video: video understanding

5 Background
How to get better performance? Complicated models.

6 Background
Challenges of deploying:
• Limited computing resources
• Short response time
• Millions of parameters
• Complicated model architecture

Model      Architecture                 Parameters     Top-1 Err.  Top-5 Err.
AlexNet    8 layers (5 conv + 3 fc)     ~60 million    40.7%       15.3%
VGG        19 layers (16 conv + 3 fc)   ~144 million   24.4%       7.1%
GoogLeNet  22 layers                    ~6.8 million   -           7.9%
ResNet     52 layers (50 conv + 2 fc)   ~200 million   21.29%      5.71%

7 Existing Quantization Research

8 Quantization
Quantization is the process of constraining an input from a continuous set to a discrete set.
Quantization of neural networks: model parameters go from 32-bit float to lower-bit integers, as in the sketch below.
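A minimal sketch of this idea, assuming a known value range and a plain uniform quantizer; the function and variable names are illustrative and not taken from any of the papers below:

```python
# Uniform quantization sketch: map 32-bit float values in a known range onto
# 2^k integer levels and back (de-quantize).
import numpy as np

def uniform_quantize(x, k, x_min, x_max):
    """Quantize float array x to k-bit codes over [x_min, x_max], then de-quantize."""
    levels = 2 ** k - 1
    scale = (x_max - x_min) / levels
    q = np.clip(np.round((x - x_min) / scale), 0, levels)    # integer code in [0, levels]
    return q.astype(np.int32), q * scale + x_min              # (int code, reconstructed float)

x = np.array([-1.2, -0.3, 0.0, 0.4, 0.9], dtype=np.float32)
codes, x_hat = uniform_quantize(x, k=2, x_min=-1.0, x_max=1.0)
print(codes)   # [0 1 2 2 3]
print(x_hat)   # reconstructed values on the 4-level grid
```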

9 Quantization
• Non-uniform quantization: HWGQ (CVPR 2017), LQ-Net (ECCV 2018)
• Uniform quantization: DoReFa-Net (arXiv 2016), PACT (arXiv 2018)
• Binarization: BNN (NIPS 2016), XNOR-Net (ECCV 2016), Bi-Real Net (ECCV 2018)

10 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Zhaowei Cai (UCSD), Xiaodong He (MSR), Jian Sun (Megvii Inc.), Nuno Vasconcelos (UCSD)
Half-Wave Gaussian Quantization (HWGQ-Net): quantization of the activation layer and the Batch Normalization layer.
Quantizing activations is more difficult than quantizing weights because of the activation function.

11 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Binary networks: binary activation quantization. The derivative of the cost C with respect to w passes through the binarization, and the problem is that this derivative is almost zero everywhere.

12 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
The backward pass uses hard tanh (see the sketch below). Two problems remain: 1. gradient vanishing; 2. gradient mismatch.
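A minimal PyTorch sketch of this scheme, assuming the common sign/hard-tanh straight-through estimator; the class name is illustrative and not the authors' code:

```python
import torch

class BinaryActSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)                      # forward: binarize the activation

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # hard-tanh derivative: 1 inside [-1, 1], 0 outside, so the gradient is
        # passed through unchanged in the linear region and cancelled elsewhere
        return grad_out * (x.abs() <= 1).to(grad_out.dtype)

x = torch.randn(5, requires_grad=True)
BinaryActSTE.apply(x).sum().backward()
print(x.grad)    # 1 where |x| <= 1, else 0
```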

13 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Forward approximation: ReLU is a half-wave rectifier, so the forward quantizer only needs to cover the non-negative half of the activation distribution.
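A rough sketch of this forward step, under the paper's working assumption that batch-normalized pre-activations are roughly N(0, 1). It fits non-uniform levels to the positive half of a Gaussian with a simple 1-D Lloyd/k-means on samples (a stand-in for the paper's optimal half-wave Gaussian quantizer) and maps non-positive inputs to zero; the level count and all names are illustrative:

```python
import numpy as np

def fit_hwgq_levels(num_levels, num_samples=100_000, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    s = np.abs(rng.standard_normal(num_samples))           # half-wave Gaussian samples
    centers = np.quantile(s, np.linspace(0.1, 0.9, num_levels))
    for _ in range(iters):                                  # Lloyd iterations
        assign = np.abs(s[:, None] - centers[None, :]).argmin(axis=1)
        centers = np.array([s[assign == j].mean() for j in range(num_levels)])
    return np.sort(centers)

def hwgq_forward(x, levels):
    pos = np.maximum(x, 0.0)
    idx = np.abs(pos[:, None] - levels[None, :]).argmin(axis=1)
    return np.where(x > 0, levels[idx], 0.0)                # half-wave: non-positive -> 0

levels = fit_hwgq_levels(num_levels=3)                      # three positive levels plus 0
x = np.array([-0.8, 0.1, 0.7, 2.5])
print(hwgq_forward(x, levels))
```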

14 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Half-Wave Gaussian Quantization (HWGQ-Net), backward approximations: vanilla ReLU and clipped ReLU (compared in the sketch below).
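A small sketch contrasting the two backward approximations named on the slide, assuming a forward quantizer whose largest output level is q_max: the vanilla ReLU passes the gradient for all x > 0, while the clipped ReLU also cancels it beyond q_max:

```python
import numpy as np

def relu_backward(x, grad_out):
    return grad_out * (x > 0)

def clipped_relu_backward(x, grad_out, q_max):
    return grad_out * ((x > 0) & (x <= q_max))

x = np.array([-0.5, 0.3, 1.5, 4.0])
g = np.ones_like(x)
print(relu_backward(x, g))                  # [0. 1. 1. 1.]
print(clipped_relu_backward(x, g, 2.0))     # [0. 1. 1. 0.]
```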

15 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Log-tailed ReLU

16 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments. Dataset: ImageNet. FW: full-precision weights, BW: binary weights.

17 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments

18 Deep Learning with Low Precision by Half-wave Gaussian Quantization (CVPR 2017)
Experiments

19 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Jungwook Choi, Pierce I-Jen Chuang, Zhuo Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan (IBM Research AI)
Challenge in activation quantization.

20 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Challenge in activation quantization.

21 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Parameterized Clipping Activation (PACT) replaces ReLU; the clipped activation is then quantized, and the straight-through estimator (STE) gives the derivative. See the sketch below.
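A minimal PyTorch sketch of this PACT-style scheme: clip with a learnable upper bound alpha, quantize the clipped value uniformly to k bits, and use the STE for both gradients. Variable names and the exact backward rules are illustrative, not the authors' code:

```python
import torch

class PACTQuant(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, alpha, k):
        ctx.save_for_backward(x, alpha)
        n = 2 ** k - 1
        y = torch.clamp(x, min=0.0)                       # lower clip at 0
        y = torch.where(y < alpha, y, alpha)              # upper clip at the learnable alpha
        return torch.round(y * n / alpha) * alpha / n     # uniform k-bit quantization of [0, alpha]

    @staticmethod
    def backward(ctx, grad_out):
        x, alpha = ctx.saved_tensors
        # STE: pass the gradient to x only inside the clipping range
        grad_x = grad_out * ((x > 0) & (x < alpha)).to(grad_out.dtype)
        # values clipped at alpha are the ones that drive alpha's gradient
        grad_alpha = (grad_out * (x >= alpha).to(grad_out.dtype)).sum()
        return grad_x, grad_alpha, None

x = torch.randn(8, requires_grad=True)
alpha = torch.tensor(1.0, requires_grad=True)
PACTQuant.apply(x, alpha, 2).sum().backward()
print(x.grad, alpha.grad)
```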

22 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Parameterized Clipping Activation to replace ReLU

23 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Balancing Clipping and Quantization Error

24 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Statistics-Aware Weight Binning

25 Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
Experiment

26 Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1 (NIPS 2016)
Matthieu Courbariaux, Yoshua Bengio (Université de Montréal)
Deterministic vs. stochastic binarization (see the sketch below).
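A small sketch of the two binarization rules, assuming the hard sigmoid sigma(x) = clip((x + 1) / 2, 0, 1) as the probability of +1 in the stochastic case:

```python
import torch

def binarize_deterministic(x):
    return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

def binarize_stochastic(x):
    p = torch.clamp((x + 1) / 2, 0, 1)                    # hard sigmoid: P(x_b = +1)
    return torch.where(torch.rand_like(x) < p, torch.ones_like(x), -torch.ones_like(x))

x = torch.tensor([-1.5, -0.2, 0.0, 0.3, 2.0])
print(binarize_deterministic(x))
print(binarize_stochastic(x))    # random near 0, effectively deterministic at the extremes
```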

27 Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1 (NIPS 2016)
Deterministic binarization in the forward pass; STE with hard tanh in the backward pass; XNOR operations replace multiplications (see the sketch below).
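A sketch of why XNOR can replace multiplication at inference time: with ±1 values packed as bits (1 for +1, 0 for -1), a dot product becomes XNOR plus popcount, dot = 2 * popcount(xnor(a, b)) - n. Plain Python integers stand in for bit-packed words here:

```python
def binary_dot(a_bits, b_bits, n):
    """a_bits/b_bits: n-bit integers encoding +1 as bit 1 and -1 as bit 0."""
    xnor = ~(a_bits ^ b_bits) & ((1 << n) - 1)     # 1 where the signs agree
    return 2 * bin(xnor).count("1") - n            # popcount, rescaled to +-1 arithmetic

# +1 -1 +1 +1  vs  +1 +1 -1 +1  -> signs agree on 2 of 4 positions -> dot product = 0
a = 0b1011
b = 0b1101
print(binary_dot(a, b, 4))    # 0
```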

28 XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks(ECCV 2016)
Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi (University of Washington)
Binary-Weight-Networks: estimating the binary weights (see the sketch below).
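A sketch of the weight estimate this refers to: approximate a real filter W by alpha * B with B = sign(W) and alpha = mean(|W|), which minimizes ||W - alpha * B||^2 over alpha and binary B. A toy numpy version, with illustrative names:

```python
import numpy as np

def binarize_weights(W):
    B = np.where(W >= 0, 1.0, -1.0)     # binary filter
    alpha = np.abs(W).mean()            # optimal scaling factor
    return alpha, B

W = np.array([0.7, -0.2, 0.5, -0.9])
alpha, B = binarize_weights(W)
print(alpha, B)        # 0.575 [ 1. -1.  1. -1.]
print(alpha * B)       # binary approximation of W
```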

29 XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks(ECCV 2016)
XNOR-Networks: binary dot product and its optimal solution.

30 XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks(ECCV 2016)
Experiment

31 XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks(ECCV 2016)
Experiments. AlexNet, dataset: CIFAR-10. ResNet-18 and GoogLeNet, dataset: ImageNet.

32 DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, Yuheng Zou (Megvii Inc.)
Quantization of activations and weights (see the sketch below).
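A sketch of k-bit weight and activation quantizers in the DoReFa style, with the straight-through gradients omitted for brevity: quantize_k maps [0, 1] onto 2^k levels, weights are squashed with tanh and rescaled into [0, 1] before quantization and then mapped back to [-1, 1], and activations are assumed to already lie in [0, 1] (clipped by the preceding layer). All names are illustrative:

```python
import numpy as np

def quantize_k(r, k):
    n = 2 ** k - 1
    return np.round(r * n) / n                       # r in [0, 1] -> one of 2^k levels

def quantize_weights(w, k):
    t = np.tanh(w)
    r = t / (2 * np.abs(t).max()) + 0.5              # rescale into [0, 1]
    return 2 * quantize_k(r, k) - 1                  # back to [-1, 1]

def quantize_activations(a, k):
    return quantize_k(np.clip(a, 0.0, 1.0), k)

w = np.random.randn(6)
print(quantize_weights(w, k=2))
print(quantize_activations(np.array([0.1, 0.4, 0.8, 1.3]), k=2))
```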

33 DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients
The first and the last layers are not quantized. Experiments.

34 Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
Zechun Liu (HKUST), Baoyuan Wu (Tencent AI), Wenhan Luo (Tencent AI), Xin Yang (HUST), Wei Liu (Tencent AI), Kwang-Ting Cheng (HKUST)

35 Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
1. Representational capability: a shortcut propagates the real-valued activations before binarization to the next block.

36 Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm(ECCV 2018)

37 Bi-Real Net: Enhancing the Performance of 1-bit CNNs with Improved Representational Capability and Advanced Training Algorithm (ECCV 2018)
2. Gradient mismatch: the backward pass approximates the sign function with a piecewise polynomial (see the sketch below). Weight initialization uses clip.
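A PyTorch sketch of this backward relaxation: the forward pass still uses sign(), but the backward pass uses the derivative of a piecewise polynomial (2 - 2|x| inside [-1, 1], 0 outside) instead of the hard-tanh step, which reduces the gradient mismatch. The class name is illustrative:

```python
import torch

class ApproxSign(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        grad = torch.clamp(2 - 2 * x.abs(), min=0.0)   # triangle-shaped surrogate derivative
        return grad_out * grad

x = torch.tensor([-1.5, -0.5, 0.25, 2.0], requires_grad=True)
ApproxSign.apply(x).sum().backward()
print(x.grad)    # [0., 1., 1.5, 0.]
```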

38 Bi-Real Net: Enhancing the Performance of 1-bit CNNs With Improved Representational Capability and Advanced Training Algorithm(ECCV 2018) Experiment

39 Bi-Real Net (ECCV 2018) Experiments

40 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Dongqing Zhang, Jiaolong Yang, Dongqiangzi Ye, and Gang Hua (Microsoft Research)

41 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Learnable Quantization Function

42 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Training process: two parameters to optimize, so training alternates two steps: 1. fix v and learn B; 2. fix B and update v (see the sketch below).
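A rough numpy sketch of this alternating scheme, assuming the LQ-Nets form in which the quantizer's levels are the inner products v . b over all binary codes b in {-1, +1}^K: step 1 fixes the basis v and picks the nearest level (and thus the code B) for each value, step 2 fixes B and updates v by least squares. Names and the toy data are illustrative:

```python
import itertools
import numpy as np

def lq_step(x, v):
    K = len(v)
    codes = np.array(list(itertools.product([-1.0, 1.0], repeat=K)))   # all 2^K binary codes
    levels = codes @ v                                                  # quantization levels
    idx = np.abs(x[:, None] - levels[None, :]).argmin(axis=1)
    B = codes[idx]                                    # step 1: fix v, learn the encodings B
    v_new, *_ = np.linalg.lstsq(B, x, rcond=None)     # step 2: fix B, update the basis v
    return B, v_new

x = np.random.randn(1000)
v = np.array([0.5, 0.25])                             # K = 2, i.e. a 2-bit quantizer
for _ in range(10):
    B, v = lq_step(x, v)
print(v, np.mean((B @ v - x) ** 2))                   # learned basis and quantization MSE
```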

43 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Experiment

44 LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks (ECCV 2018)
Experiment

45 THANKS. Thank you all for listening.

