1
Final Presentation
Neural Network Implementation on FPGA
Supervisor: Chen Koren
Maria Nemets 309326767
Maxim Zavodchik 310623772
2
Project Objectives
- Implementing a neural network on FPGA:
  - Creating a modular design
  - Implementing in software (Matlab)
  - Creating a PC interface
- Performance analysis:
  - Area on chip
  - Interconnections
  - Speed vs. software implementation
  - Frequency
  - Cost
3
Project’s Part A Objectives
- Implementing a single neuron in VHDL.
- Researching and integrating into the EDK environment and running the design on the FPGA.
- Implementing the feed-forward calculation.
- Implementing the learning in Matlab.
- Building a Graphical User Interface for friendly communication with the system.
4
Testing Application
- A single neuron can separate two regions with a straight line.
- A multi-layered network is needed to recognize an image.
- Implementing the AND/OR functions:
  (Figure: the input points (0,0), (0,1), (1,0), (1,1) plotted for the AND function and for the OR function.)
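The linear-separation claim above can be checked with a minimal software sketch. The weights and biases below are hand-picked for illustration (they are not the trained values from the project), and the 0.5 threshold is an assumption; the names logsig, neuron and classify are hypothetical helpers.

```python
import math

def logsig(v):
    """Logistic sigmoid, 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through the sigmoid."""
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return logsig(v)

# Hand-picked (not trained) parameters; each pair defines one separating line.
AND_PARAMS = ([4.0, 4.0], -6.0)   # line x1 + x2 = 1.5
OR_PARAMS = ([4.0, 4.0], -2.0)    # line x1 + x2 = 0.5

def classify(inputs, params):
    """Threshold the neuron output at 0.5 (an assumed decision rule)."""
    weights, bias = params
    return 1 if neuron(inputs, weights, bias) > 0.5 else 0

if __name__ == "__main__":
    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x, "AND:", classify(x, AND_PARAMS), "OR:", classify(x, OR_PARAMS))
```

Both functions use the same weights and differ only in the bias, which shifts the separating line between the corner (1,1) and the other three points.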
5
Learning in Matlab
- Implementing a NN using the logsig() activation function and the ‘traingdx’ training algorithm.
- Providing a truth table for the binary functions AND/OR as a training set.

    % Build the NN
    temp = size(inputs_vec);
    in_range = zeros(temp(1),2);   % one [min max] row per input
    in_range(:,2) = 1;             % every input lies in [0,1]
    net = newff(in_range,[1],{'logsig'},'traingdx');
    % Train the NN
    net.TrainParam.epochs = epochs;
    net.TrainParam.goal = error;
    net = train(net,inputs_vec,target_vec);

Sigmoid function: logsig(v) = 1 / (1 + exp(-v))
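For readers without the Matlab toolbox, the training step can be approximated in a few lines. This is only a sketch: it uses plain batch gradient descent on the mean squared error, not 'traingdx' (which adds momentum and an adaptive learning rate), and the learning rate, epoch limit and function names are assumptions.

```python
import math

def logsig(v):
    return 1.0 / (1.0 + math.exp(-v))

def predict(x, weights, bias):
    """Output of a single logsig neuron for input vector x."""
    return logsig(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_neuron(samples, targets, epochs=5000, goal=0.01, lr=0.5):
    """Plain batch gradient descent; stops once the MSE reaches the goal."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    mse = float("inf")
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        mse = 0.0
        for x, t in zip(samples, targets):
            y = predict(x, weights, bias)
            err = y - t
            mse += err * err / len(samples)
            delta = err * y * (1.0 - y)      # error times sigmoid derivative
            for i, xi in enumerate(x):
                grad_w[i] += delta * xi
            grad_b += delta
        if mse <= goal:
            break
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias, mse

if __name__ == "__main__":
    truth_table = [(0, 0), (0, 1), (1, 0), (1, 1)]
    and_targets = [0, 0, 0, 1]
    print(train_neuron(truth_table, and_targets))
```

The epochs and goal arguments play the same role as net.TrainParam.epochs and net.TrainParam.goal in the Matlab snippet.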
6
Hardware Description
- Xilinx ML310 Development Board
- RS232 standard, FPGA UART: transmission rate is 115,200 bits/sec optimally
- Virtex-II Pro XC2VP30 FPGA:
  - 2 PowerPC 405 cores, 300+ MHz
  - 2,448 Kbits of BRAM memories
  - 136 18x18-bit multipliers
  - 30,816 logic cells
  - Up to 111,232 internal registers
  - Up to 111,232 LUTs
- 256 MB DDR DIMM
7
System Interface
- Inputs:
  - Binary number (up to 1024 bits)
  - Weights: 13 bits wide, in fixed-point representation (1 sign bit, 4 integer bits, 8 fraction bits)
  - Sigmoid function values: 8 bits wide
- Outputs:
  - Two bits: the neuron’s binary result for the input number, or failure detection.
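The 13-bit weight format can be illustrated with a small conversion sketch. The slide gives only the field widths; the two's-complement encoding, the saturation on overflow and the function names are assumptions made for this example.

```python
FRAC_BITS = 8
TOTAL_BITS = 13   # 1 sign + 4 integer + 8 fraction bits

def to_fixed(value):
    """Quantize a real weight to a 13-bit word (two's complement assumed)."""
    scaled = round(value * (1 << FRAC_BITS))
    lo = -(1 << (TOTAL_BITS - 1))
    hi = (1 << (TOTAL_BITS - 1)) - 1
    scaled = max(lo, min(hi, scaled))            # saturate instead of wrapping
    return scaled & ((1 << TOTAL_BITS) - 1)      # raw 13-bit pattern

def from_fixed(word):
    """Convert a raw 13-bit pattern back to a real number."""
    if word & (1 << (TOTAL_BITS - 1)):           # sign bit set
        word -= 1 << TOTAL_BITS
    return word / (1 << FRAC_BITS)

if __name__ == "__main__":
    for w in (1.0, -2.5, 0.3789, 100.0):
        print(w, hex(to_fixed(w)), from_fixed(to_fixed(w)))
```

Under these assumptions the representable range is [-16, 16 - 1/256] with a resolution of 1/256.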
8
System Description
(Block diagram: PowerPC, weights memory, input memory, sigmoid memory, Single Neuron, UART, PLB bus, OPB bus, PLB2OPB bridge.)
9
EDK Integration
- The PPC writes the BRAMs and controls the Single Neuron through the PLB.
- The Single Neuron is connected to the PLB as a User Core IPIF.
- Memories:
  - PORT1: connected to the PLB as IPIF
  - PORT2: connected to the Single Neuron directly
- The UART (serial port) is connected to the OPB.
10
Control Flow
(State diagram with the states: Get Weights, Get Sigmoid, Load Input number, Load Bias, Calculate φ(.), Calculate output bits, Send the result to user, IDLE, Load decision values, Get Inputs, Wait for loading Bias.)
11
Architecture – Single Neuron
- Multiplier: 1x13 bits
- Accumulator: 13 bits wide
- FSM controller
- Bias/Min/Max/Inputs_num registers
- Comparator
(Block diagram labels: X[i], W[i], MULT, Accumulator, REG, Bias Weight, v, logsig(v), Y, Comparator, Min Decision, Max Decision, bias/max/min/inputs_num.)
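The datapath above can be mimicked in software as a rough behavioral sketch. Because each input is a single bit, the 1x13 multiplier reduces to selecting the weight or zero. The slide's comparator formula is not available, so the three-way rule below (including the use of >= and <=) is an assumption, accumulator overflow is not modeled, and the function names are hypothetical.

```python
def accumulate(input_bits, weights_fixed, bias_fixed):
    """Add the bias and every weight whose input bit is 1.

    Weights and bias are integers in the fixed-point scale (value * 2**8).
    """
    acc = bias_fixed
    for x, w in zip(input_bits, weights_fixed):
        acc += w if x else 0      # a 1-bit by 13-bit multiply
    return acc

def decide(y, min_decision, max_decision):
    """Assumed comparator: 1 above max, 0 below min, None in between."""
    if y >= max_decision:
        return 1
    if y <= min_decision:
        return 0
    return None                   # failure: output falls between the thresholds

if __name__ == "__main__":
    v = accumulate([1, 0, 1, 1], [256, 512, -128, 64], -32)
    print("v =", v / 256)
    print(decide(0.70, 0.3789, 0.6211), decide(0.50, 0.3789, 0.6211))
```

The gap between the minimum and maximum decision values is what allows the two output bits to report a failure instead of a forced 0 or 1.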
12
Architecture – Memories (1)
- Dual-port BRAMs with separate clocks.
- Custom-sized BRAMs generated by the Xilinx Core Generator.
- Wrapped by a VHDL SRAM controller.
- Inputs memory:
  - Up to 1024 binary bits
  - 1 Kbyte
13
Architecture – Memories (2)
- Weights memory:
  - 1024 * 13 bits = 13,312 bits = 1,664 bytes (~1.6 Kbyte)
- Bias weight:
  - 1 register for the output layer (13 bits wide)
- Sigmoid memory:
  - Values out of the range [-4,4] are mapped to 0 or 1.
  - Memory block quantizing sigmoid values:
    - 11 bits input representing values in [-4,4]
    - 8 bits output representing values in [0,1]
  - 2 Kbyte
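The sigmoid memory can be modeled as a lookup table built offline. The sketch below assumes a linear address mapping over [-4,4), rounding to the nearest 8-bit code, and saturation to the codes 0 and 255 outside the range; none of these details are stated on the slide, and the function names are hypothetical.

```python
import math

ADDR_BITS = 11            # 2**11 = 2048 entries
OUT_BITS = 8              # each entry is one byte
V_MIN, V_MAX = -4.0, 4.0

def build_sigmoid_lut():
    """Tabulate the sigmoid over [-4, 4) as 8-bit codes (value * 256)."""
    size = 1 << ADDR_BITS
    step = (V_MAX - V_MIN) / size            # 8 / 2048 = 1/256
    max_code = (1 << OUT_BITS) - 1
    lut = []
    for addr in range(size):
        v = V_MIN + addr * step
        y = 1.0 / (1.0 + math.exp(-v))
        lut.append(min(max_code, round(y * (1 << OUT_BITS))))
    return lut

def sigmoid_lookup(lut, v):
    """Look up v; inputs outside [-4, 4) saturate to the codes for 0 and 1."""
    if v < V_MIN:
        return 0
    if v >= V_MAX:
        return (1 << OUT_BITS) - 1
    addr = int((v - V_MIN) * len(lut) / (V_MAX - V_MIN))
    return lut[addr]

if __name__ == "__main__":
    lut = build_sigmoid_lut()
    print(len(lut), "bytes;", "code at v = 0:", sigmoid_lookup(lut, 0.0))
```

With 2048 one-byte entries the table occupies 2 Kbyte, and the address step of 1/256 matches the 8 fraction bits of the weight format.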
14
Simulation (1)
- Single Neuron VHDL simulation
- Application: AND function with 4 inputs
  - Minimum decision value: 0.3789
  - Maximum decision value: 0.6211
- 3 pipeline stages: Memories, Mult, Accumulator
15
Simulation (2)
- Result:
  - Sigmoid answer: 9F (hex) = 10011111 (binary) = 0.6211
  - The “ready” signal is asserted when done.
  - Latency: 14 + |Inputs| - 1 [clocks]
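The numbers on this slide can be reproduced with a short calculation, assuming the 8-bit sigmoid output is an unsigned fraction with a scale of 1/256 (which is consistent with 9F mapping to 0.6211). The function names are hypothetical.

```python
def decode_sigmoid(code):
    """Interpret an 8-bit sigmoid output as code / 256 (assumed scale)."""
    return code / 256.0

def latency_clocks(num_inputs):
    """Latency formula from the slide: 14 + |Inputs| - 1 clocks."""
    return 14 + num_inputs - 1

if __name__ == "__main__":
    print(format(0x9F, "08b"), round(decode_sigmoid(0x9F), 4))
    print("clocks for 4 inputs:", latency_clocks(4))
```

Under the same scale, the minimum decision value 0.3789 corresponds to the code 0x61.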
16
Software
- The PPC’s program controls the whole flow.
- The PPC writes control words and reads result words on the PLB as 64 bits of data.
- Control/Result Words Structure:

  Single Neuron, from CPU:
    bit 0        load_w0
    bit 1        rst
    bit 2        start
    bit 3        w0_ready
    bit 4        load_min_val
    bit 5        load_max_val
    bit 6        load_inputs_num
    bits 7-19    w0/min_val/max_val/inputs_number
    bits 20-63   "0"

  Single Neuron, to CPU:
    bits 0-1     y
    bit 2        ready
    bit 3        w0_rd
    bits 4-63    "0"

  Memories:
    field          Sigmoid       W/X
    USER_wr_a      bit 0         bit 0
    USER_addr_a    bits 1-11     bits 1-10
    USER_dout_a    bits 12-19    bits 11-24 / bit 11
    "0"            bits 20-63    bits 25-63 / bits 12-63
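Packing and unpacking these words can be sketched as follows. The field positions come from the slide; the bit ordering is an assumption (bit 0 is treated as the most significant bit, the convention commonly used on PowerPC buses), and the function names are hypothetical.

```python
WORD_BITS = 64

def set_field(word, first_bit, last_bit, value):
    """Place value into bits first_bit..last_bit, counting bit 0 as the MSB."""
    width = last_bit - first_bit + 1
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the field")
    shift = WORD_BITS - 1 - last_bit
    return word | (value << shift)

def neuron_control_word(load_w0=0, rst=0, start=0, w0_ready=0,
                        load_min_val=0, load_max_val=0,
                        load_inputs_num=0, value=0):
    """Build the 64-bit control word sent from the CPU to the neuron."""
    flags = [load_w0, rst, start, w0_ready,
             load_min_val, load_max_val, load_inputs_num]
    word = 0
    for bit, flag in enumerate(flags):       # flags occupy bits 0..6
        word = set_field(word, bit, bit, flag)
    return set_field(word, 7, 19, value)     # 13-bit payload in bits 7..19

def parse_result_word(word):
    """Split the result word into (y, ready, w0_rd)."""
    y = (word >> (WORD_BITS - 2)) & 0b11     # bits 0..1
    ready = (word >> (WORD_BITS - 3)) & 1    # bit 2
    w0_rd = (word >> (WORD_BITS - 4)) & 1    # bit 3
    return y, ready, w0_rd

if __name__ == "__main__":
    print(hex(neuron_control_word(load_w0=1, value=0x100)))
    print(parse_result_word((0b10 << 62) | (1 << 61)))
```

If the hardware numbers bit 0 as the least significant bit instead, only the shift computation in set_field and parse_result_word would change.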
17
Graphical User Interface
- Building a Graphical User Interface for friendly communication between the user and the system.
- Implemented in Matlab 6.1.
- The GUI enables:
  - Choosing a function to be implemented.
  - Defining the maximum error, the number of epochs, and the decision values.
  - Choosing the length of the binary input vector.
  - Simulating the neuron for an input vector.
18
Project’s Part B Objectives
- Creating a multi-layered network to classify a digit.
- Implementing a modular system:
  - Number of neurons in the hidden layer varies from 2 to 10.
  - Number of sub-networks.
19
Project’s Part B Objectives (Cont.)
- Implementing a parallel system:
  - Dividing a complex fully-connected network into sub-networks.
  - 10 sub-networks running concurrently.
  - Up to 10 neurons run concurrently in each sub-network.
  - Up to 5 inputs are calculated together, depending on the number of neurons in the hidden layer.
  - Parallel calculations of the output layer.