FPGA implementation of CNN Convolution layer logic


1 FPGA implementation of CNN Convolution layer logic
Di Wu

2 Introduction to CNN
Deep learning network, suitable for image recognition and classification
Feature extraction layer: convolutional computation
Pooling layer: downsampling
Linear classification layer: classic ANN network

3 CNN convolutional layer
Nested for loop:

    // Each output kernel can be calculated separately
    for (to = 0; to < M; to++) {
      // Each input layer is first calculated separately and then summed up
      for (ti = 0; ti < N; ti++) {
        // The inner convolutional part of one input image
        for (row = 0; row < R; row++) {
          for (col = 0; col < C; col++) {
            for (i = 0; i < K; i++) {
              for (j = 0; j < K; j++) {
                output_fm[to][row][col] +=
                    weights[to][ti][i][j] * input_fm[ti][row + i][col + j];
              }
            }
          }
        }
      }
    }
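The loop nest above can be wrapped into a compilable function for checking results in software. The following is a minimal sketch, not the author's code: the dimensions in the `enum`, the `int` element type, and the function name `conv_layer` are all assumptions chosen for illustration. It relies on two things the indexing in the slide implies: the output must start at zero, and each input feature map must be (R+K-1) x (C+K-1) so that `row+i` and `col+j` stay in bounds (stride 1, no padding).

```c
#include <string.h>

/* Illustrative sizes only; the slides do not give M, R or C. */
enum { M = 2, N = 3, R = 2, C = 2, K = 2 };

/* Reference convolution layer following the slide's loop order.
 * output_fm is cleared first because the loop body accumulates. */
static void conv_layer(int output_fm[M][R][C],
                       int weights[M][N][K][K],
                       int input_fm[N][R + K - 1][C + K - 1])
{
    memset(output_fm, 0, sizeof(int) * M * R * C);

    for (int to = 0; to < M; to++)                 /* output feature maps */
        for (int ti = 0; ti < N; ti++)             /* input feature maps  */
            for (int row = 0; row < R; row++)
                for (int col = 0; col < C; col++)
                    for (int i = 0; i < K; i++)
                        for (int j = 0; j < K; j++)
                            output_fm[to][row][col] +=
                                weights[to][ti][i][j] *
                                input_fm[ti][row + i][col + j];
}
```

With every weight and every input set to 1, each output element equals N*K*K, which gives a quick sanity check for a hardware implementation's testbench.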

4 CNN convolutional layer
Intuition: systolic array implementation

5 Systolic Array Implementation
Derive the systolic array for each line of the input source
Data dependency graph

6 Systolic Array Implementation
1D systolic array structure for partial convolution computation
2D systolic array structure for partial convolution computation
2D systolic array structure for convolution computation with a final adder stage
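The transcript does not preserve the array diagrams, so the following is only a software model of one possible 1D structure, under stated assumptions: each processing element (PE) holds one fixed weight, the input sample is broadcast to all PEs every cycle, and partial sums shift from one PE to its neighbour. The names `pe_t`, `systolic_step`, and `systolic_row` are hypothetical. The model computes one kernel row of the convolution, out[n] = sum over k of w[k] * in[n+k]; a 2D array would run K such rows in parallel and combine them in a final adder stage.

```c
#include <stddef.h>
#include <stdint.h>

#define K 4  /* kernel width, matching the 4x4 kernel reported in the slides */

/* One processing element: a fixed weight and a partial-sum register. */
typedef struct {
    int32_t weight;
    int64_t psum;
} pe_t;

/* Advance the array by one clock cycle with input sample x.
 * Returns the partial sum leaving the last PE. */
static int64_t systolic_step(pe_t pe[K], int32_t x)
{
    /* Update from the last PE backwards so that each PE reads its
     * neighbour's register value from the previous cycle. */
    for (int k = K - 1; k > 0; k--)
        pe[k].psum = pe[k - 1].psum + (int64_t)pe[k].weight * x;
    pe[0].psum = (int64_t)pe[0].weight * x;
    return pe[K - 1].psum;
}

/* Process one input row of length len; writes len-K+1 results to out. */
static void systolic_row(const int32_t w[K], const int32_t *in,
                         size_t len, int64_t *out)
{
    pe_t pe[K];
    for (int k = 0; k < K; k++) {
        pe[k].weight = w[k];
        pe[k].psum = 0;
    }
    for (size_t t = 0; t < len; t++) {
        int64_t y = systolic_step(pe, in[t]);
        if (t + 1 >= K)              /* pipeline is full after K samples */
            out[t + 1 - K] = y;
    }
}
```

After the first K samples fill the pipeline, the model produces one result per cycle, which is the property that makes this structure attractive for streaming input.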

7 Synthesis and P&R result
32-bit fixed-point representation
Kernel size 4x4, input channels = 3, FIFO size = 1024
Xilinx xq7z100-rf1156, 50 MHz
Resource   Used          Utilization
LUT        (not given)   0.456%
FF         3389          0.611%
BRAM       12            1.589%
DSP        288           14.257%
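To illustrate what a 32-bit fixed-point multiply-accumulate involves, here is a minimal sketch. The slides do not state how the 32 bits are split between integer and fraction, so the Q16.16 format below is an assumption, as are the names `fix32_t`, `fix_from_double`, and `fix_mac`. The point it shows is that a 32x32-bit product needs a 64-bit intermediate, which in hardware is wider than a single DSP slice multiplier and so consumes several slices per multiplier.

```c
#include <stdint.h>

#define FRAC_BITS 16  /* assumed Q16.16 split; not given in the slides */

typedef int32_t fix32_t;

/* Convert a double to fixed point (truncating toward zero). */
static fix32_t fix_from_double(double v)
{
    return (fix32_t)(v * (double)(1 << FRAC_BITS));
}

/* Multiply-accumulate: returns acc + a*b in the same fixed-point format.
 * The product is formed in 64 bits, then shifted back down.
 * Right-shifting a negative value is implementation-defined in C;
 * mainstream compilers use an arithmetic shift. No overflow handling. */
static fix32_t fix_mac(fix32_t acc, fix32_t a, fix32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;
    return acc + (fix32_t)(prod >> FRAC_BITS);
}
```

For example, `fix_mac(fix_from_double(1.0), fix_from_double(1.5), fix_from_double(2.0))` yields the fixed-point encoding of 4.0.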

8 Discussion
FPGA utilization issue: DSP slice resources
Bandwidth issue: with a 32-bit fixed-point representation, a 50 MHz clock, and 3 input channels, the bandwidth requirement is 600 MB/s
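The 600 MB/s figure follows from 4 bytes per sample x 50,000,000 cycles per second x 3 channels. A small sketch of that arithmetic is given below; it assumes the design consumes one sample per input channel on every clock cycle, and the function name is hypothetical.

```c
#include <stdint.h>

/* Input bandwidth in bytes per second, assuming one sample per
 * input channel is consumed on every clock cycle. */
static uint64_t required_bandwidth_bytes_per_sec(unsigned bits_per_sample,
                                                 uint64_t clock_hz,
                                                 unsigned input_channels)
{
    return (uint64_t)(bits_per_sample / 8) * clock_hz * input_channels;
}
```

Calling it with 32 bits, 50,000,000 Hz, and 3 channels returns 600,000,000 bytes per second. The same formula shows how quickly the requirement grows with a faster clock or more input channels.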

9 Q & A

