Optimization for Fully Connected Neural Network for FPGA application


1 Optimization for Fully Connected Neural Network for FPGA application
Author: Chen Chen ECE 734 Project Presentation

2 Main Content: FC MAC optimization: Distributed Arithmetic; Approximate Computing. Memory optimization: Connection Reduction. The original intent was to learn how a CNN works; the project now focuses on these more specific optimizations.

3 Distributed Arithmetic
Bit-level parallel processing. Platform: MATLAB. 16 pairs of 8-bit inputs and weights are processed in parallel in one cycle. Elapsed operation time : s vs s
Verilog form:
R1[0] = (B1[0]) ? A1 : 8'b0; … R1[7] = (B1[7]) ? A1 : 8'b0;
…
R16[7] = (B16[7]) ? A16 : 8'b0;
Sum[0] = R1[0] + R2[0] + … + R16[0]; shift 0 bit; (15 adder, 16 mux … shifter)
…
Sum[7] = R1[7] + R2[7] + … + R16[7]; shift 7 bit;
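The bit-plane scheme in the Verilog form can be checked with a small software model. The following is a minimal sketch, assuming unsigned 8-bit inputs and weights; the function name `bit_plane_mac` and the sample values are illustrative and do not come from the slides.

```python
def bit_plane_mac(inputs, weights, bits=8):
    """Dot product of unsigned integers, computed one weight bit-plane at a time."""
    total = 0
    for j in range(bits):
        # Mux stage: R_i[j] is A_i when bit j of B_i is set, otherwise 0.
        selected = [a if (b >> j) & 1 else 0 for a, b in zip(inputs, weights)]
        # Adder stage (Sum[j]) followed by the shift by j bits.
        total += sum(selected) << j
    return total


# Illustrative data: 16 input/weight pairs, each value fits in 8 bits.
inputs = [3, 7, 255, 0, 12, 9, 100, 1, 2, 4, 8, 16, 32, 64, 128, 5]
weights = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 255]
reference = sum(a * b for a, b in zip(inputs, weights))
result = bit_plane_mac(inputs, weights)
```

Each loop iteration corresponds to one Sum[j] line: 16 muxes select the inputs, the selected values are added, and the sum is shifted by j bits before accumulation.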

4 Approximate computing
Energy-harvesting system: compute the MSBs first, then the LSBs, to increase system efficiency. E.g. one 16×16 multiplication becomes four 16×4 multiplications. Dropping 1 LSB saves 16 muxes, 15 adders and 1 shifter.
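The MSB-first idea can be illustrated in software. This is a minimal sketch, assuming unsigned operands and a 16-bit multiplier split into four 4-bit chunks; `msb_first_multiply` and its parameters are hypothetical names, not part of the original project.

```python
def msb_first_multiply(a, b, chunks_used=4, chunk_bits=4, width=16):
    """Multiply a by a `width`-bit b using 16x4 partial products, MSB chunk first.

    Stopping early (chunks_used < width // chunk_bits) gives an approximation
    that ignores the least significant chunks of b.
    """
    n_chunks = width // chunk_bits
    result = 0
    for k in range(chunks_used):
        shift = (n_chunks - 1 - k) * chunk_bits      # position of the k-th chunk from the top
        chunk = (b >> shift) & ((1 << chunk_bits) - 1)
        result += (a * chunk) << shift               # one 16x4 partial product
    return result


a, b = 1234, 0xABCD
exact = msb_first_multiply(a, b)                     # all four chunks
approx = msb_first_multiply(a, b, chunks_used=2)     # only the two most significant chunks
```

If the computation is interrupted after two chunks, the partial result already equals `a * 0xAB00`, so the error is bounded by the weight of the chunks that were skipped.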

5 Connection Reduction Verilog form: design a specific flag value that represents a disconnected weight, so that the memory storing process can be skipped for it.
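The flag idea can be sketched in software as follows. This is only an illustration under stated assumptions: the reserved code `DISCONNECTED = -128` and the helper names `pack_weights` and `sparse_mac` are hypothetical, and the slides do not say which flag value or storage layout was used.

```python
DISCONNECTED = -128  # hypothetical reserved code marking a removed connection


def pack_weights(weights):
    """Keep only (index, weight) pairs for connected weights; flagged ones are not stored."""
    return [(i, w) for i, w in enumerate(weights) if w != DISCONNECTED]


def sparse_mac(inputs, packed):
    """Multiply-accumulate over the stored connections only."""
    return sum(inputs[i] * w for i, w in packed)


weights = [5, DISCONNECTED, -3, DISCONNECTED]
inputs = [1, 2, 3, 4]
packed = pack_weights(weights)
output = sparse_mac(inputs, packed)
```

Only two of the four weights reach memory here, and the MAC loop touches only those two entries.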

6 Experiment Implementation
Existing program: Kaggle's Digit Recognizer. Function: handwritten digit classification (0 to 9). Base CNN architecture: TensorFlow, two hidden layers, two FC layers. Training and testing dataset: MNIST, 28×28 grayscale images.

7 Original source code: training and testing results. Trained 20 times to obtain the upper-bound accuracy.
10 times: times: times:

8 Approximate computing Result
Lose 1 decimal place: keep 7 decimal places.
Lose 2 decimal places: keep 6 decimal places.
Convert to integer for the hidden layer: keep only the integer part.
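The three precision settings can be reproduced with a short helper. This is a minimal sketch, assuming digits are dropped by truncation rather than rounding (the slides do not say which); `truncate` and the sample weights are illustrative.

```python
import math


def truncate(value, places):
    """Drop every digit after `places` decimal places, without rounding."""
    scale = 10 ** places
    return math.trunc(value * scale) / scale


# Illustrative weights with 8 decimal places.
weights = [0.12345678, -0.98765432, 3.99]
keep7 = [truncate(w, 7) for w in weights]         # lose 1 decimal place
keep6 = [truncate(w, 6) for w in weights]         # lose 2 decimal places
integer_only = [truncate(w, 0) for w in weights]  # keep only the integer part
```

Because the helper works on binary floating-point values, results for numbers that sit exactly on a decimal boundary may differ in the last digit; a fixed-point representation would avoid that.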

9 Connection Reduction
First training: connection rate 25 %. Reinitialization (global initialization), disconnection, then training: connection rate 23 %. Future work: locate totally disconnected nodes to save component area; find a better method to reduce connections while maintaining high accuracy.
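The connection rate and the future-work item on totally disconnected nodes can both be expressed over a 0/1 connection mask. This is a minimal sketch, assuming a mask with one row per input node and one column per output node; the helper names and the 4×4 mask are hypothetical and are not the project's data.

```python
def connection_rate(mask):
    """Fraction of connections that are still present in a 0/1 mask."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return kept / total


def disconnected_outputs(mask):
    """Indices of output nodes (columns) that have no remaining input connection."""
    n_out = len(mask[0])
    return [j for j in range(n_out) if not any(row[j] for row in mask)]


# Illustrative mask: rows are input nodes, columns are output nodes.
mask = [
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]
rate = connection_rate(mask)
dead = disconnected_outputs(mask)
```

In this example 4 of 16 connections remain, and output nodes 1 and 2 receive no input at all, so the hardware for them could in principle be removed.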

10 Lessons Learned: Learned several system design and optimization algorithms and methods. Learned well-known technologies from different fields, such as communications and image processing. Understood the operating process of a CNN-based system. Learned how to apply online packages to improve the project. Learned to analyze a problem instead of only judging whether it is right or wrong.

