Characterization Presentation: Neural Network Implementation on FPGA
Supervisor: Chen Koren
Students: Maria Nemets (309326767), Maxim Zavodchik (310623772)



Project Objectives
Implementing a neural network on FPGA:
- Creating a modular design
- Implementing in software (Matlab)
- Creating a PC interface
Performance analysis:
- Area on chip
- Interconnections
- Speed vs. software implementation
- Frequency
- Cost

System modulation
- Input image size is up to 1024 binary pixels.
- Number of neurons in the hidden layer varies from 2 to 10.
- Number of sub-networks varies, up to 31, for any kind of objects.
Our application:
- Input length: 64 pixels (8x8)
- Objects to classify are digits.

System Parallelizing
- Dividing the complex fully-connected network into sub-networks.
- 10 sub-networks running concurrently.
- Up to 10 neurons run concurrently in each sub-network.
- Up to 5 inputs are calculated together, depending on the number of neurons in the hidden layer.
- Parallel calculations of the output layer.
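The dependence of the input parallelism on the hidden-layer size can be sketched as follows. This is only an illustration under an assumed rule: the 10 hidden-layer multipliers of a net block are shared evenly among the neurons, capped at 5 inputs per step. The rule agrees with the two cases shown on the Net Block slides (10 neurons with 1 input, 2 neurons with 5 pixels), but the function names are hypothetical and the actual controller logic is not given in the slides.

```python
MULTIPLIERS_PER_NET_BLOCK = 10  # hidden-layer multipliers per net block (from the slides)
MAX_PARALLEL_INPUTS = 5         # "up to 5 inputs are calculated together"


def inputs_per_step(hidden_neurons: int) -> int:
    """Assumed rule: split the 10 multipliers among the neurons, capped at 5 inputs."""
    if not 2 <= hidden_neurons <= MULTIPLIERS_PER_NET_BLOCK:
        raise ValueError("hidden layer size must be between 2 and 10")
    return min(MAX_PARALLEL_INPUTS, MULTIPLIERS_PER_NET_BLOCK // hidden_neurons)


def steps_for_hidden_layer(num_pixels: int, hidden_neurons: int) -> int:
    """Number of multiply-accumulate steps needed to consume all pixels."""
    per_step = inputs_per_step(hidden_neurons)
    return -(-num_pixels // per_step)  # ceiling division


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, inputs_per_step(n), steps_for_hidden_layer(64, n))
```

Under this assumption, a 1024-pixel image with 2 hidden neurons needs 205 steps, which matches the 205-bit depth of the five input banks listed on the Memories slide.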

System Interface
Inputs:
- Binary image (up to 1024 pixels)
- Weights: 13 bits wide
- Sigmoid function values: 10 bits wide
Outputs:
- A 5-bit vector coding the recognized object or indicating a failure.
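Since the sigmoid values are an input to the system, they have to be generated on the PC side. A minimal sketch of such a table is shown here, assuming the 13-bit address range [-16, 16] and the 10-bit output range [0, 1] given on the Memories slide. The uniform address spacing, the rounding, and the function names are assumptions for illustration, not the authors' format.

```python
import math

ADDR_BITS = 13              # table address width (from the slides)
OUT_BITS = 10               # table output width (from the slides)
X_MIN, X_MAX = -16.0, 16.0  # input range represented by the address


def build_sigmoid_lut():
    """Return 2**13 entries, each a 10-bit integer approximating sigmoid(x)."""
    entries = 1 << ADDR_BITS
    out_max = (1 << OUT_BITS) - 1
    step = (X_MAX - X_MIN) / entries  # assumed uniform spacing
    lut = []
    for addr in range(entries):
        x = X_MIN + addr * step
        y = 1.0 / (1.0 + math.exp(-x))
        lut.append(round(y * out_max))
    return lut


def lookup(lut, x):
    """Clamp x to the table range and return the quantized sigmoid value."""
    x = min(max(x, X_MIN), X_MAX)
    addr = int((x - X_MIN) / (X_MAX - X_MIN) * len(lut))
    return lut[min(addr, len(lut) - 1)]


if __name__ == "__main__":
    table = build_sigmoid_lut()
    print(len(table), lookup(table, -16.0), lookup(table, 16.0))
```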

Hardware Description
Xilinx ML310 Development Board:
- RS232 standard: FPGA UART
- Transmission rate is 115,200 bits/sec optimally
- Virtex-II Pro XC2VP30 FPGA
- 2 PowerPC 405 cores, [...] MHz
- 2,448 Kbytes of BRAM memories
- [...] x18-bit multipliers
- 30,816 logic cells
- Up to 111,232 internal registers
- Up to 111,232 LUTs
- 256 MB DDR DIMM
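The UART rate bounds how fast the image, weights and sigmoid table can be loaded from the PC. The estimate below is a rough sketch, assuming 8N1 framing (10 line bits per payload byte) and no protocol overhead. Neither assumption is stated in the slides, and the function name is hypothetical.

```python
BAUD_RATE = 115_200   # bits/sec (from the slides)
BITS_PER_FRAME = 10   # assumption: 1 start + 8 data + 1 stop bit (8N1)


def uart_transfer_seconds(payload_bytes: int,
                          baud: int = BAUD_RATE,
                          bits_per_frame: int = BITS_PER_FRAME) -> float:
    """Lower bound on the time needed to send payload_bytes over the UART."""
    return payload_bytes * bits_per_frame / baud


if __name__ == "__main__":
    # A 1024-pixel binary image packed 8 pixels per byte is 128 bytes.
    print(f"{uart_transfer_seconds(128) * 1000:.2f} ms")
```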

System Description
[Block diagram: PowerPC, UART, Input memory, Weights memory, Sigmoid memory, Net blocks]

Control Flow
[Flowchart; step labels: IDLE, Get Sigmoid, Get Weights, Get NN Description, Get Input Image, Load Biases, Calculate (Hidden Layer), Calculate φ(.) (Hidden Layer), Calculate (Output Layer), Calculate φ(.) (Output Layer), "Have all sub-networks done?" (Yes / No), Calculate maximal output and generate output vector, Send the result to user]

Architecture – Complete Network
- 10 Net-Blocks working concurrently.
- Each Net-Block calculates one sub-network.
- Parallel input into each block from the Input Memories.
- Maximum Unit: produces the output vector coding the recognized digit.
[Diagram: Net block 1 ... Net block 10 feeding the Max unit; inputs X_vector (5 pixels) and W_vector (5 pixels); bus width labels: 8 bits, 5 bit, 5 bits]
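The role of the Maximum Unit can be sketched in software as an argmax over the sub-network outputs, encoded as a 5-bit value. The slides do not specify how a failure is detected or encoded, so the threshold test and the all-ones failure code below are purely hypothetical choices, as is the function name.

```python
FAILURE_CODE = 0b11111  # assumption: all-ones 5-bit code signals a failure


def max_unit(outputs, threshold=0):
    """Return the index of the largest sub-network output as a 5-bit code.

    outputs: one value per sub-network (at most 31).
    threshold: hypothetical minimum value for a valid recognition.
    """
    if not outputs or len(outputs) > 31:
        raise ValueError("expected between 1 and 31 sub-network outputs")
    best = max(range(len(outputs)), key=outputs.__getitem__)
    if outputs[best] < threshold:
        return FAILURE_CODE
    return best


if __name__ == "__main__":
    digit_scores = [12, 40, 7, 900, 3, 0, 55, 21, 8, 1]
    print(format(max_unit(digit_scores, threshold=512), "05b"))
```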

Architecture – The Net Block
- Hidden layer: 10 hardcore multipliers and 10 accumulators
- Output layer: 10 hardcore multipliers and 1 adder
- 1 Sigmoid Memory
[Diagram: mult1 ... mult10, acc1 ... acc10, φ(.), Mult'1 ... Mult'10, adder, φ(.)]

Architecture – The Net Block
10 neurons in the hidden layer (in parallel: 10 neurons getting one input)
[Diagram: input x1 into mult1 ... mult10 with weights W1_1, W1_2 ... W1_10; acc1 ... acc10; φ(.); Mult'1 ... Mult'10 with weights W'1, W'2 ... W'10; adder; φ(.)]

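Architecture – The Net Block
2 neurons in the hidden layer (in parallel: 2 neurons getting input, 5 pixels)
[Diagram: inputs x1 ... x5 into mult1 ... mult10 with weights W1_1 ... W5_1 and W1_2 ... W5_2; acc1, acc2 ... acc10; φ(.); Mult'1, Mult'2 ... Mult'10 with weights W'1, W'2 and "0" on the remaining inputs; adder; φ(.)]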
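The data path of a single net block can be modeled in software as below. This is a behavioral sketch only: it uses floating-point arithmetic and an exact sigmoid instead of the 13-bit weights and the sigmoid memory, and the grouping rule (10 multipliers shared among the neurons, at most 5 pixels per step) is an assumption inferred from the two cases shown on the slides. All names are hypothetical.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def net_block(pixels, hidden_w, hidden_bias, out_w, out_bias):
    """Compute one sub-network output.

    pixels:      list of 0/1 input values
    hidden_w:    hidden_w[n][i] is the weight from pixel i to hidden neuron n
    hidden_bias: one bias per hidden neuron
    out_w:       one output-layer weight per hidden neuron
    out_bias:    output-layer bias
    """
    n_neurons = len(hidden_w)
    if not 2 <= n_neurons <= 10:
        raise ValueError("hidden layer size must be between 2 and 10")
    group = min(5, 10 // n_neurons)  # assumed pixels handled per step

    acc = list(hidden_bias)  # accumulators start from the loaded biases
    for start in range(0, len(pixels), group):
        # In hardware the products of one step are formed in parallel.
        for n in range(n_neurons):
            for i in range(start, min(start + group, len(pixels))):
                acc[n] += hidden_w[n][i] * pixels[i]

    hidden = [sigmoid(a) for a in acc]  # hidden-layer activation
    # Output layer: one multiplier per hidden neuron, then a single adder.
    total = out_bias + sum(w * h for w, h in zip(out_w, hidden))
    return sigmoid(total)


if __name__ == "__main__":
    print(net_block([1, 1], [[1, 1], [-1, -1]], [0, 0], [1, 1], 0))
```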

Architecture - Memories
Input Memories:
- Up to 1024 pixels: array of 5 banks, each 205 bits
Weights Memories:
- Hidden layer: 10 memory blocks for every net-block, arranged in 10 banks of 1024*13 bits = 1,664 bytes each
- Output layer: array of 10 registers, each 13 bits wide
- Bias weights: 10 blocks of 10 registers for the hidden layer + 1 register for the output layer, each 13 bits wide
Sigmoid Memory:
- 10 memory blocks quantizing sigmoid values, each with:
  - 13-bit input representing values in [-16, 16]
  - 10-bit output representing values in [0, 1]
Size labels on the slide: 1 Kbyte, 100 Kbytes, ~167 Kbytes
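The memory sizes on this slide can be checked with a few lines of arithmetic. The sketch assumes that each sigmoid block stores one 10-bit entry per 13-bit address (2^13 entries) and that the hidden-layer weights occupy 10 banks in each of the 10 net blocks. Under those assumptions the totals come out at 166,400 bytes for the hidden-layer weights and 102,400 bytes for the sigmoid tables, which appear to correspond to the "~167 Kbytes" and "100 Kbytes" labels. The input banks hold 1,025 bits, about 129 bytes.

```python
WEIGHT_BITS = 13          # weight width (from the slides)
MAX_PIXELS = 1024         # maximum image size (from the slides)
NET_BLOCKS = 10
BANKS_PER_NET_BLOCK = 10
INPUT_BANKS = 5
INPUT_BANK_BITS = 205
SIGMOID_BLOCKS = 10
SIGMOID_ADDR_BITS = 13    # assumption: one table entry per 13-bit address
SIGMOID_OUT_BITS = 10


def memory_budget_bytes():
    """Return the size in bytes of each memory group described on the slide."""
    input_bits = INPUT_BANKS * INPUT_BANK_BITS
    bank_bytes = MAX_PIXELS * WEIGHT_BITS // 8
    hidden_bytes = NET_BLOCKS * BANKS_PER_NET_BLOCK * bank_bytes
    sigmoid_bits = SIGMOID_BLOCKS * (1 << SIGMOID_ADDR_BITS) * SIGMOID_OUT_BITS
    return {
        "input": (input_bits + 7) // 8,   # rounded up to whole bytes
        "weights_bank": bank_bytes,
        "hidden_weights": hidden_bytes,
        "sigmoid": sigmoid_bits // 8,
    }


if __name__ == "__main__":
    for name, size in memory_budget_bytes().items():
        print(f"{name}: {size} bytes")
```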

Schedule
Date    Assignment
27.12   Midterm presentation
        Continue planning the FSM controller
        Implementing the NN in VHDL
        Integration of the NN, PowerPC and memories
        Simulating the NN
        Synthesis
        Analyzing performance
        Writing project book
21.02   Final presentation