Slide 1: Characterization Presentation - Neural Network Implementation on FPGA
Supervisor: Chen Koren
Maria Nemets 309326767
Maxim Zavodchik 310623772
Slide 2: Project Objectives
- Implementing a neural network on FPGA
- Creating a modular design
- Implementing in software (Matlab)
- Creating a PC interface
- Performance analysis:
  - Area on chip
  - Interconnections
  - Speed vs. software implementation
  - Frequency
  - Cost
Slide 3: System Modeling
- The input image size is up to 1024 binary pixels.
- The number of neurons in the hidden layer varies from 2 to 10.
- The number of sub-networks varies, up to 31, one per object class.
- Our application: input length of 64 pixels (8x8); the objects to classify are digits.
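As a point of reference for the structure above, here is a minimal software model of one sub-network (one hidden layer followed by a single output neuron), in the spirit of the Matlab implementation mentioned in the objectives. The function and parameter names are illustrative, not taken from the project sources.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation phi(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def sub_network(pixels, w_hidden, b_hidden, w_out, b_out):
    """Floating-point reference of one sub-network.

    pixels   -- binary input vector, up to 1024 entries (64 in our application)
    w_hidden -- hidden-layer weights, shape (H, len(pixels)) with H in 2..10
    b_hidden -- hidden-layer biases, shape (H,)
    w_out    -- output-neuron weights, shape (H,)
    b_out    -- output-neuron bias (scalar)
    Returns the sub-network output in [0, 1].
    """
    hidden = sigmoid(w_hidden @ pixels + b_hidden)   # hidden layer
    return float(sigmoid(w_out @ hidden + b_out))    # single output neuron

# Illustrative run: an 8x8 binary "image" with 10 hidden neurons and random weights.
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=64)
score = sub_network(x, rng.normal(size=(10, 64)), rng.normal(size=10),
                    rng.normal(size=10), rng.normal())
```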
Slide 4: System Parallelizing
- The complex fully-connected network is divided into sub-networks.
- 10 sub-networks run concurrently.
- Up to 10 neurons run concurrently in each sub-network.
- Up to 5 inputs are calculated together, depending on the number of neurons in the hidden layer.
- Parallel calculation of the output layer.
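The slides do not give an explicit scheduling formula, so the sketch below only illustrates one consistent reading: the ten multipliers of a net block are shared between the hidden neurons, so with H neurons roughly min(5, 10 // H) pixels can be processed per cycle. The helper name hidden_layer_schedule is hypothetical.

```python
def hidden_layer_schedule(hidden_neurons, input_size, multipliers=10):
    """Rough per-net-block cycle estimate for the hidden layer.

    Assumption: with H hidden neurons, min(5, multipliers // H) pixels are
    consumed per cycle, so that all products of a cycle fit into the ten
    hardcore multipliers of the block.
    """
    pixels_per_cycle = min(5, multipliers // hidden_neurons)
    cycles = -(-input_size // pixels_per_cycle)   # ceiling division
    return pixels_per_cycle, cycles

print(hidden_layer_schedule(10, 64))   # (1, 64): one pixel per cycle
print(hidden_layer_schedule(2, 64))    # (5, 13): five pixels per cycle
```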
Slide 5: System Interface
Inputs:
- Binary image (up to 1024 pixels)
- Weights: 13 bits wide
- Sigmoid function values: 10 bits wide
Outputs:
- A 5-bit vector coding the recognized object or indicating a failure.
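The slides fix only the bit widths (13-bit weights, 10-bit sigmoid values, 5-bit result vector), not the binary-point positions. The sketch below assumes a signed format with 8 fraction bits for weights, chosen to match the [-16, 16] sigmoid input range quoted later on the memories slide; the real design may place the binary point elsewhere.

```python
def quantize_weight(w, frac_bits=8):
    """Map a real weight to a 13-bit signed fixed-point code.

    Assumed format: 1 sign + 4 integer + 8 fraction bits (range [-16, 16),
    resolution 1/256). This is an assumption, not a documented format.
    """
    code = round(w * (1 << frac_bits))
    return max(-(1 << 12), min((1 << 12) - 1, code))   # clamp to 13-bit signed range

def quantize_sigmoid(y, bits=10):
    """Map a sigmoid value in [0, 1] to a 10-bit unsigned code."""
    return max(0, min((1 << bits) - 1, round(y * ((1 << bits) - 1))))

print(quantize_weight(-1.5))     # -384
print(quantize_sigmoid(0.73))    # 747
```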
Slide 6: Hardware Description
Xilinx ML310 Development Board
- RS232 standard, FPGA UART; transmission rate of 115,200 bits/sec at best
- Virtex-II Pro XC2VP30 FPGA:
  - 2 PowerPC 405 cores, 300+ MHz
  - 2,448 Kbits of BRAM
  - 136 18x18-bit multipliers
  - 30,816 logic cells
  - Up to 111,232 internal registers
  - Up to 111,232 LUTs
- 256 MB DDR DIMM
Slide 7: System Description
Main blocks: PowerPC, weights memory, net blocks, UART, input memory, sigmoid memory.
Slide 8: Control Flow
IDLE
→ Get Sigmoid → Get Weights → Get NN Description → Get Input Image → Load Biases
→ Calculate φ(.) (Hidden Layer)
→ Calculate φ(.) (Output Layer)
→ Have all sub-networks done? If no, keep calculating; if yes:
→ Calculate the maximal output and generate the output vector
→ Send the result to the user
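A minimal sequencer corresponding to the states above might look as follows. The wait behaviour of the "Have all sub-networks done?" branch is our reading of the flow chart, since the ten net blocks run concurrently in hardware; the state names merely mirror the boxes on the slide.

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    GET_SIGMOID = auto()
    GET_WEIGHTS = auto()
    GET_NN_DESCRIPTION = auto()
    GET_INPUT_IMAGE = auto()
    LOAD_BIASES = auto()
    CALC_HIDDEN_LAYER = auto()    # Calculate phi(.) (Hidden Layer)
    CALC_OUTPUT_LAYER = auto()    # Calculate phi(.) (Output Layer)
    CALC_MAX_OUTPUT = auto()      # Calculate maximal output, generate output vector
    SEND_RESULT = auto()

def next_state(state, all_subnets_done=False):
    """One step of the sequencer.

    While not all sub-networks are done, the controller waits in the
    output-layer state; after sending the result it returns to IDLE.
    """
    if state is State.CALC_OUTPUT_LAYER and not all_subnets_done:
        return state                       # "No" branch: keep waiting
    if state is State.SEND_RESULT:
        return State.IDLE
    order = list(State)
    return order[order.index(state) + 1]   # otherwise advance in order
```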
Slide 9: Architecture – Complete Network
- 10 net blocks working concurrently; each net block calculates one sub-network.
- Parallel input into each block from the input memories.
- Maximum unit producing the output vector coding the recognized digit.
(Diagram: Net block 1 ... Net block 10 feed the Max unit; the X_vector and W_vector buses deliver 5 pixels and 5 weights at a time; the result vector is 5 bits wide.)
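The slide says only that the 5-bit result codes the recognized object or indicates a failure; the threshold test and the all-ones failure code in this sketch of the maximum unit are assumptions for illustration.

```python
def encode_max(outputs, threshold=0.5, failure_code=0b11111):
    """Encode the index of the strongest sub-network output in 5 bits.

    'outputs' holds the ten sub-network outputs. The threshold test and
    the all-ones failure code are illustrative assumptions only.
    """
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return best if outputs[best] >= threshold else failure_code

print(encode_max([0.1, 0.2, 0.9, 0.3, 0.1, 0.0, 0.2, 0.1, 0.4, 0.2]))  # 2
```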
Slide 10: Architecture – The Net Block
- 10 hardcore multipliers and 10 accumulators
- 10 hardcore multipliers and 1 adder
- 1 sigmoid memory
(Diagram: mult1..mult10 feed acc1..acc10 for the hidden layer; Mult'1..Mult'10 feed the adder and then φ(.) for the output layer.)
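A behavioural sketch of the datapath named on this slide, assuming the 10-hidden-neuron configuration: mult1..mult10 and acc1..acc10 become a per-neuron multiply-accumulate, Mult'1..Mult'10 and the adder become the output-layer dot product, and phi stands for the sigmoid memory lookup. Fixed-point effects are ignored here.

```python
import math

def net_block(pixels, w_hidden, b_hidden, w_out, b_out,
              phi=lambda v: 1.0 / (1.0 + math.exp(-v))):
    """Behavioural model of one net block with 10 hidden neurons.

    acc holds the accumulators (preloaded with the biases); each loop
    iteration corresponds to one cycle in which mult1..mult10 multiply
    the current pixel by the ten hidden-layer weights in parallel.
    """
    acc = list(b_hidden)                              # Load Biases
    for j, x in enumerate(pixels):                    # one pixel per cycle
        for n in range(len(acc)):                     # mult1..mult10 -> acc1..acc10
            acc[n] += x * w_hidden[n][j]
    hidden = [phi(a) for a in acc]                    # sigmoid memory, hidden layer
    net = sum(h * w for h, w in zip(hidden, w_out)) + b_out   # Mult' products + adder
    return phi(net)                                   # sigmoid memory, output layer
```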
Slide 11: Architecture – The Net Block (10 neurons in the hidden layer)
- In parallel, all 10 neurons receive the same input pixel each cycle.
(Diagram: input x1 is multiplied by W1_1..W1_10 in mult1..mult10 and accumulated in acc1..acc10; W'1..W'10 feed Mult'1..Mult'10, the adder, and φ(.).)
Slide 12: Architecture – The Net Block (2 neurons in the hidden layer)
- In parallel, the 2 neurons receive 5 input pixels per cycle.
(Diagram: inputs x1..x5 are multiplied by weights W1_1..W5_1 and W1_2..W5_2 in mult1..mult10 and accumulated; W'1 and W'2 feed the output layer, and the remaining output-layer inputs are tied to "0".)
Slide 13: Architecture - Memories
Input memories (about 1 Kbyte in total):
- Up to 1024 pixels, stored as an array of 5 banks of 205 bits each.
Weights memories (about 167 Kbytes in total):
- Hidden layer: 10 memory blocks for every net block, arrayed in 10 banks of 1024 x 13 bits = 1,664 bytes each.
- Output layer: array of 10 registers, each 13 bits wide.
- Bias weights: 10 blocks of 10 registers for the hidden layer plus 1 register for the output layer, each 13 bits wide.
Sigmoid memory (about 100 Kbytes in total):
- 10 memory blocks of quantized sigmoid values:
  - 13-bit input representing values in [-16, 16]
  - 10-bit output representing values in [0, 1]
(The arithmetic reproducing these totals is sketched below.)
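The totals annotated on this slide follow from the stated widths and depths; the arithmetic below reproduces them, and the table builder shows one way to fill the sigmoid memory. The 2^13-entry depth is inferred from the 13-bit input width, and the per-net-block copy of the input is inferred from the ~1 Kbyte annotation; neither is stated explicitly on the slide.

```python
import math

# Hidden-layer weights: 10 net blocks x 10 banks x 1024 entries x 13 bits
hidden_weight_bytes = 10 * 10 * 1024 * 13 // 8          # 166,400 bytes ~ 167 Kbytes

# Sigmoid memories: 10 blocks x 2**13 entries x 10 bits (depth inferred from 13-bit input)
sigmoid_bytes = 10 * (1 << 13) * 10 // 8                 # 102,400 bytes ~ 100 Kbytes

# Input memories: 1024 binary pixels, assuming one copy per net block
input_bytes = 10 * 1024 // 8                             # 1,280 bytes ~ 1 Kbyte

def build_sigmoid_lut(addr_bits=13, data_bits=10, lo=-16.0, hi=16.0):
    """Illustrative sigmoid table: addresses span [-16, 16], data spans [0, 1]."""
    n, scale = 1 << addr_bits, (1 << data_bits) - 1
    return [round(scale / (1.0 + math.exp(-(lo + (hi - lo) * i / (n - 1)))))
            for i in range(n)]

print(hidden_weight_bytes, sigmoid_bytes, input_bytes)   # 166400 102400 1280
print(build_sigmoid_lut()[0], build_sigmoid_lut()[-1])   # 0 and 1023
```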
Slide 14: Schedule
Date - Assignment
27.12 - Midterm presentation
19.12-25.12 - Continue planning the FSM controller
26.12-21.01 - Implementing the NN in VHDL; integration of the NN, PowerPC and memories; simulating the NN; synthesis
23.01-28.01 - Analyzing performance; writing the project book
21.02 - Final presentation