Slide 1: Characterization Presentation - Neural Network Implementation on FPGA
Supervisor: Chen Koren
Maria Nemets 309326767
Maxim Zavodchik 310623772
Slide 2: Project Objectives
- Implementing a neural network on FPGA
- Creating a modular design
- Implementing in software (Matlab)
- Creating a PC interface
- Performance analysis:
  - Area on chip
  - Interconnections
  - Speed vs. software implementation
  - Frequency
  - Cost
Slide 3: System Modeling
- The input image size is up to 1024 binary pixels.
- The number of neurons in the hidden layer varies from 2 to 10.
- The number of sub-networks varies, up to 31, one per object class.
- Our application: input length of 64 pixels (8x8); the objects to classify are digits.
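As a point of reference for the structure above, here is a minimal software model of one sub-network (one hidden layer followed by a single output neuron), in the spirit of the Matlab implementation mentioned in the objectives. The function and parameter names are illustrative, not taken from the project sources.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation phi(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def sub_network(pixels, w_hidden, b_hidden, w_out, b_out):
    """Floating-point reference of one sub-network.

    pixels   -- binary input vector, up to 1024 entries (64 in our application)
    w_hidden -- hidden-layer weights, shape (H, len(pixels)) with H in 2..10
    b_hidden -- hidden-layer biases, shape (H,)
    w_out    -- output-neuron weights, shape (H,)
    b_out    -- output-neuron bias (scalar)
    Returns the sub-network output in [0, 1].
    """
    hidden = sigmoid(w_hidden @ pixels + b_hidden)   # hidden layer
    return float(sigmoid(w_out @ hidden + b_out))    # single output neuron

# Illustrative run: an 8x8 binary "image" with 10 hidden neurons and random weights.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=64)
score = sub_network(x, rng.normal(size=(10, 64)), rng.normal(size=10),
                    rng.normal(size=10), rng.normal())
```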
Slide 4: System Parallelizing
- The complex fully-connected network is divided into sub-networks.
- 10 sub-networks run concurrently.
- Up to 10 neurons run concurrently in each sub-network.
- Up to 5 inputs are calculated together, depending on the number of neurons in the hidden layer.
- Parallel calculation of the output layer.
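The slides do not give an explicit scheduling formula, so the sketch below only illustrates one consistent reading: the ten multipliers of a net block are shared between the hidden neurons, so with H neurons roughly min(5, 10 // H) pixels can be processed per cycle. The helper name hidden_layer_schedule is hypothetical.

```python
def hidden_layer_schedule(hidden_neurons, input_size, multipliers=10):
    """Rough per-net-block cycle estimate for the hidden layer.

    Assumption: with H hidden neurons, min(5, multipliers // H) pixels are
    consumed per cycle, so that all products of a cycle fit into the ten
    hardcore multipliers of the block.
    """
    pixels_per_cycle = min(5, multipliers // hidden_neurons)
    cycles = -(-input_size // pixels_per_cycle)   # ceiling division
    return pixels_per_cycle, cycles

print(hidden_layer_schedule(10, 64))   # (1, 64): one pixel per cycle
print(hidden_layer_schedule(2, 64))    # (5, 13): five pixels per cycle
```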
Slide 5: System Interface
Inputs:
- Binary image (up to 1024 pixels)
- Weights: 13 bits wide
- Sigmoid function values: 10 bits wide
Outputs:
- A 5-bit vector coding the recognized object or indicating a failure.
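The slides fix only the bit widths (13-bit weights, 10-bit sigmoid values, 5-bit result vector), not the binary-point positions. The sketch below assumes a signed format with 8 fraction bits for weights, chosen to match the [-16, 16] sigmoid input range quoted later on the memories slide; the real design may place the binary point elsewhere.

```python
def quantize_weight(w, frac_bits=8):
    """Map a real weight to a 13-bit signed fixed-point code.

    Assumed format: 1 sign + 4 integer + 8 fraction bits (range [-16, 16),
    resolution 1/256). This is an assumption, not a documented format.
    """
    code = round(w * (1 << frac_bits))
    return max(-(1 << 12), min((1 << 12) - 1, code))   # clamp to 13-bit signed range

def quantize_sigmoid(y, bits=10):
    """Map a sigmoid value in [0, 1] to a 10-bit unsigned code."""
    return max(0, min((1 << bits) - 1, round(y * ((1 << bits) - 1))))

print(quantize_weight(-1.5))     # -384
print(quantize_sigmoid(0.73))    # 747
```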
Slide 6: Hardware Description
Xilinx ML310 Development Board
- RS232 standard, FPGA UART; transmission rate of 115,200 bits/sec at best
- Virtex-II Pro XC2VP30 FPGA:
  - 2 PowerPC 405 cores, 300+ MHz
  - 2,448 Kbits of BRAM
  - 136 18x18-bit multipliers
  - 30,816 logic cells
  - Up to 111,232 internal registers
  - Up to 111,232 LUTs
- 256 MB DDR DIMM
Slide 7: System Description
Main blocks: PowerPC, weights memory, net blocks, UART, input memory, sigmoid memory.
Slide 8: Control Flow
IDLE
→ Get Sigmoid → Get Weights → Get NN Description → Get Input Image → Load Biases
→ Calculate φ(.) (Hidden Layer)
→ Calculate φ(.) (Output Layer)
→ Have all sub-networks done? If no, keep calculating; if yes:
→ Calculate the maximal output and generate the output vector
→ Send the result to the user
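A minimal sequencer corresponding to the states above might look as follows. The wait behaviour of the "Have all sub-networks done?" branch is our reading of the flow chart, since the ten net blocks run concurrently in hardware; the state names merely mirror the boxes on the slide.

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    GET_SIGMOID = auto()
    GET_WEIGHTS = auto()
    GET_NN_DESCRIPTION = auto()
    GET_INPUT_IMAGE = auto()
    LOAD_BIASES = auto()
    CALC_HIDDEN_LAYER = auto()    # Calculate phi(.) (Hidden Layer)
    CALC_OUTPUT_LAYER = auto()    # Calculate phi(.) (Output Layer)
    CALC_MAX_OUTPUT = auto()      # Calculate maximal output, generate output vector
    SEND_RESULT = auto()

def next_state(state, all_subnets_done=False):
    """One step of the sequencer.

    While not all sub-networks are done, the controller waits in the
    output-layer state; after sending the result it returns to IDLE.
    """
    if state is State.CALC_OUTPUT_LAYER and not all_subnets_done:
        return state                       # "No" branch: keep waiting
    if state is State.SEND_RESULT:
        return State.IDLE
    order = list(State)
    return order[order.index(state) + 1]   # otherwise advance in order
```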
Slide 9: Architecture – Complete Network
- 10 net blocks working concurrently; each net block calculates one sub-network.
- Parallel input into each block from the input memories.
- Maximum unit producing the output vector coding the recognized digit.
(Diagram: Net block 1 ... Net block 10 feed the Max unit; the X_vector and W_vector buses deliver 5 pixels and 5 weights at a time; the result vector is 5 bits wide.)
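The slide says only that the 5-bit result codes the recognized object or indicates a failure; the threshold test and the all-ones failure code in this sketch of the maximum unit are assumptions for illustration.

```python
def encode_max(outputs, threshold=0.5, failure_code=0b11111):
    """Encode the index of the strongest sub-network output in 5 bits.

    'outputs' holds the ten sub-network outputs. The threshold test and
    the all-ones failure code are illustrative assumptions only.
    """
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return best if outputs[best] >= threshold else failure_code

print(encode_max([0.1, 0.2, 0.9, 0.3, 0.1, 0.0, 0.2, 0.1, 0.4, 0.2]))  # 2
```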
Slide 10: Architecture – The Net Block
- 10 hardcore multipliers and 10 accumulators
- 10 hardcore multipliers and 1 adder
- 1 sigmoid memory
(Diagram: mult1..mult10 feed acc1..acc10 for the hidden layer; Mult'1..Mult'10 feed the adder and then φ(.) for the output layer.)
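A behavioural sketch of the datapath named on this slide, assuming the 10-hidden-neuron configuration: mult1..mult10 and acc1..acc10 become a per-neuron multiply-accumulate, Mult'1..Mult'10 and the adder become the output-layer dot product, and phi stands for the sigmoid memory lookup. Fixed-point effects are ignored here.

```python
import math

def net_block(pixels, w_hidden, b_hidden, w_out, b_out,
              phi=lambda v: 1.0 / (1.0 + math.exp(-v))):
    """Behavioural model of one net block with 10 hidden neurons.

    acc holds the accumulators (preloaded with the biases); each loop
    iteration corresponds to one cycle in which mult1..mult10 multiply
    the current pixel by the ten hidden-layer weights in parallel.
    """
    acc = list(b_hidden)                              # Load Biases
    for j, x in enumerate(pixels):                    # one pixel per cycle
        for n in range(len(acc)):                     # mult1..mult10 -> acc1..acc10
            acc[n] += x * w_hidden[n][j]
    hidden = [phi(a) for a in acc]                    # sigmoid memory, hidden layer
    net = sum(h * w for h, w in zip(hidden, w_out)) + b_out   # Mult' products + adder
    return phi(net)                                   # sigmoid memory, output layer
```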
Slide 11: Architecture – The Net Block (10 neurons in the hidden layer)
- In parallel, all 10 neurons receive the same input pixel each cycle.
(Diagram: input x1 is multiplied by W1_1..W1_10 in mult1..mult10 and accumulated in acc1..acc10; W'1..W'10 feed Mult'1..Mult'10, the adder, and φ(.).)
Slide 12: Architecture – The Net Block (2 neurons in the hidden layer)
- In parallel, the 2 neurons receive 5 input pixels per cycle.
(Diagram: inputs x1..x5 are multiplied by weights W1_1..W5_1 and W1_2..W5_2 in mult1..mult10 and accumulated; W'1 and W'2 feed the output layer, and the remaining output-layer inputs are tied to "0".)
Slide 13: Architecture - Memories
Input memories (about 1 Kbyte in total):
- Up to 1024 pixels, stored as an array of 5 banks of 205 bits each.
Weights memories (about 167 Kbytes in total):
- Hidden layer: 10 memory blocks for every net block, arrayed in 10 banks of 1024 x 13 bits = 1,664 bytes each.
- Output layer: array of 10 registers, each 13 bits wide.
- Bias weights: 10 blocks of 10 registers for the hidden layer plus 1 register for the output layer, each 13 bits wide.
Sigmoid memory (about 100 Kbytes in total):
- 10 memory blocks of quantized sigmoid values:
  - 13-bit input representing values in [-16, 16]
  - 10-bit output representing values in [0, 1]
(The arithmetic reproducing these totals is sketched below.)
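The totals annotated on this slide follow from the stated widths and depths; the arithmetic below reproduces them, and the table builder shows one way to fill the sigmoid memory. The 2^13-entry depth is inferred from the 13-bit input width, and the per-net-block copy of the input is inferred from the ~1 Kbyte annotation; neither is stated explicitly on the slide.

```python
import math

# Hidden-layer weights: 10 net blocks x 10 banks x 1024 entries x 13 bits
hidden_weight_bytes = 10 * 10 * 1024 * 13 // 8          # 166,400 bytes ~ 167 Kbytes

# Sigmoid memories: 10 blocks x 2**13 entries x 10 bits (depth inferred from 13-bit input)
sigmoid_bytes = 10 * (1 << 13) * 10 // 8                 # 102,400 bytes ~ 100 Kbytes

# Input memories: 1024 binary pixels, assuming one copy per net block
input_bytes = 10 * 1024 // 8                             # 1,280 bytes ~ 1 Kbyte

def build_sigmoid_lut(addr_bits=13, data_bits=10, lo=-16.0, hi=16.0):
    """Illustrative sigmoid table: addresses span [-16, 16], data spans [0, 1]."""
    n, scale = 1 << addr_bits, (1 << data_bits) - 1
    return [round(scale / (1.0 + math.exp(-(lo + (hi - lo) * i / (n - 1)))))
            for i in range(n)]

print(hidden_weight_bytes, sigmoid_bytes, input_bytes)   # 166400 102400 1280
print(build_sigmoid_lut()[0], build_sigmoid_lut()[-1])   # 0 and 1023
```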
Slide 14: Schedule
Date - Assignment
27.12 - Midterm presentation
19.12-25.12 - Continue planning the FSM controller
26.12-21.01 - Implementing the NN in VHDL; integration of the NN, PowerPC and memories; simulating the NN; synthesis
23.01-28.01 - Analyzing performance; writing the project book
21.02 - Final presentation