Hardware implement 涂正中. 2010:Convolutional Networks and Applications in Vision ( Overview ) Chip: first :1991-1992, Bell Labs’ ANNA chip. ConvNet chip.

Slides:



Advertisements
Similar presentations
TU/e Processor Design 5Z0321 Processor Design 5Z032 Computer Systems Overview Chapter 1 Henk Corporaal Eindhoven University of Technology 2011.
Advertisements

A reconfigurable system featuring dynamically extensible embedded microprocessor, FPGA, and customizable I/O Borgatti, M. Lertora, F. Foret, B. Cali, L.
1 Multi - Core fast Communication for SoPC Multi - Core fast Communication for SoPC Technion – Israel Institute of Technology Department of Electrical.
Smart EQ Digital Stereo Equalizer Dustin Demontigny David Bull.
CMOL overview ● CMOS / nanowire / MOLecular hybrids ● Uses combination of Micro – Nano – Nano implements regular blocks (ie memory) – CMOS used for logic,
1 FPGA Lab School of Electrical Engineering and Computer Science Ohio University, Athens, OH 45701, U.S.A. An Entropy-based Learning Hardware Organization.
1 Effect of Increasing Chip Density on the Evolution of Computer Architectures R. Nair IBM Journal of Research and Development Volume 46 Number 2/3 March/May.
Development and Implementation of a FPGA-based digital beamformer Supervisors: Nandita Bhattacharjee Dr. Andrew Paplinski Dr. Andrew Paplinski Dale Harders.
Final Presentation Neural Network Implementation On FPGA Supervisor: Chen Koren Maria Nemets Maxim Zavodchik
A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications From J. Fowers, G. Brown, P. Cooke, and G. Stitt, University.
Introduction to FPGA and DSPs Joe College, Chris Doyle, Ann Marie Rynning.
Spectrum Analyzer Ray Mathes, Nirav Patel,
Eye-RIS. Vision System sense – process - control autonomous mode Program stora.
Using Programmable Logic to Accelerate DSP Functions 1 Using Programmable Logic to Accelerate DSP Functions “An Overview“ Greg Goslin Digital Signal Processing.
1 DSP Implementation on FPGA Ahmed Elhossini ENGG*6090 : Reconfigurable Computing Systems Winter 2006.
CS2100 Computer Organisation Introduction (AY2015/6 Semester 1)
Sub-Nyquist Sampling DSP & SCD Modules Presented by: Omer Kiselov, Daniel Primor Supervised by: Ina Rivkin, Moshe Mishali Winter 2010High Speed Digital.
 Chasis / System cabinet  A plastic enclosure that contains most of the components of a computer (usually excluding the display, keyboard and mouse)
EKT303/4 PRINCIPLES OF PRINCIPLES OF COMPUTER ARCHITECTURE (PoCA)
How to Choose Frame Grabber …that’s right for your application Coreco Imaging.
Matt Waldersen T.J. Strzelecki Rick Schuman Krishna Jharjaria.
Paper Review: XiSystem - A Reconfigurable Processor and System
10/10/20151 DIF – Digital Imaging Fast Ali Nuhi and Everett Salley EEL4924 Senior Design Date: 02 March 2011.
A bit-streaming, pipelined multiuser detector for wireless communications Sridhar Rajagopal and Joseph R. Cavallaro Rice University
VLSI & ECAD LAB Introduction.
ASIP Architecture for Future Wireless Systems: Flexibility and Customization Joseph Cavallaro and Predrag Radosavljevic Rice University Center for Multimedia.
By Anthony Moody Rhys Porter Nicolas Wilson.  Background  Objectives  Design  Normal Operation  Logic  Failure  Issues and solutions  Improvements.
Lecture No. 1 Computer Logic Design. About the Course Title: –Computer Logic Design Pre-requisites: –None Required for future courses: –Computer Organization.
Implementation of Infomax ICA Algorithm with Low-Power Analog CMOS Circuits Ki-Seok Cho and Soo-Young Lee Brain Science Research Center and Department.
ELEC692/04 course_des 1 ELEC 692 Special Topic VLSI Signal Processing Architecture Fall 2004 Chi-ying Tsui Department of Electrical and Electronic Engineering.
R2D2 team R2D2 team Reconfigurable and Retargetable Digital Devices  Application domains Mobile telecommunications  WCDMA/UMTS (Wideband Code Division.
Outline  Over view  Design  Performance  Advantages and disadvantages  Examples  Conclusion  Bibliography.
A Reconfigurable Low-power High-Performance Matrix Multiplier Architecture With Borrow Parallel Counters Counters : Rong Lin SUNY at Geneseo
Background: VLSI Courses at Lafayette  ECE VLSI Circuit Design  Original form: “tall thin designer”  VLSI Processing  CMOS Transistor Characteristics.
EKT303/4 PRINCIPLES OF PRINCIPLES OF COMPUTER ARCHITECTURE (PoCA)
Design Constraint Analysis Team KANG Group 1. Sentry Gun Design and build a turret and armature structure with the ability to detect, track and fire upon.
EML 2023 – Motor Control Lecture 1 – Closed Loop Motor Control.
Prof. Dr. Martin Brooke Bortecene Terlemez
Jason Li Jeremy Fowers 1. Speedups and Energy Reductions From Mapping DSP Applications on an Embedded Reconfigurable System Michalis D. Galanis, Gregory.
Sean Mathews, Christopher Kiser, Haoxiang Chen. Processor Design Tradeoffs: Instruction Set Design Support useful functions while implementing as efficiently.
Connecting EPICS with Easily Reconfigurable I/O Hardware EPICS Collaboration Meeting Fall 2011.
CS 351/ IT 351 Modeling and Simulation Technologies HPC Architectures Dr. Jim Holten.
Introduction to Field Programmable Gate Arrays (FPGAs) EDL Spring 2016 Johns Hopkins University Electrical and Computer Engineering March 2, 2016.
Microprocessor Design Process
Implementation of Real Time Image Processing System with FPGA and DSP Presented by M V Ganeswara Rao Co- author Dr. P Rajesh Kumar Co- author Dr. A Mallikarjuna.
K-Nearest Neighbor Digit Recognition ApplicationDomainConstraintsKernels/Algorithms Voice Removal and Pitch ShiftingAudio ProcessingLatency (Real-Time)FFT,
Heterogeneous Processing KYLE ADAMSKI. Overview What is heterogeneous processing? Why it is necessary Issues with heterogeneity CPU’s vs. GPU’s Heterogeneous.
An Overview CS341 Digital Logic and Computer Organization F2003.
SMART CAMERAS AS EMBEDDED SYSTEM SMVEC. SMART CAMERA  See, think and act  Intelligent cameras  Embedding of image processing algorithms  Can be networked.
Physical Memory and Physical Addressing ( Chapter 10 ) by Polina Zapreyeva.
Computer Hardware – System Unit
Processor/Memory Chapter 3
System On Chip.
ECE 154A Introduction to Computer Architecture
Overcoming Resource Underutilization in Spatial CNN Accelerators
Session III Architecture of PLC
Introduction to Digital Signal Processors (DSPs)
Convolutional Neural Networks
Text Book Computer Organization and Architecture: Designing for Performance, 7th Ed., 2006, William Stallings, Prentice-Hall International, Inc.
EE 445S Real-Time Digital Signal Processing Lab Spring 2014
Sridhar Rajagopal and Joseph R. Cavallaro Rice University
Ghifar Parahyangan Catholic University August 22, 2011
Demonstration of STDP based Neural Networks on an FPGA
Chapter 1 Introduction.
Final Project presentation
Lecture: Deep Convolutional Neural Networks
Deep Neural Networks for Onboard Intelligence
1.Introduction to Advanced Digital Design (14 marks)
Chapter 0 Introduction Introduction Chapter 0.
Multiprocessor System Interconnects
Presentation transcript:

Hardware implement 涂正中

2010:Convolutional Networks and Applications in Vision ( Overview ) Chip: first : , Bell Labs’ ANNA chip. ConvNet chip from Canon team 2009:CAVIAR ; Addressed-Event Representation (AER) convolvers FPGA: Vip:An fpga-based processor for image processing and neural networks(1996) CNP:(2009)

ANNA chip(analog neural network arithmetic and logic unit)(1992) in a 0.9 um CMOS technology contains transistors on a 4.5 x 7 mm2 die Although the chip uses analog computing internally, all input/output is digital. (mixed analog/digital) combines the advantages of high synaptic density, high speed, low power of analog, and easy interfacing to a digital system such as a digital signal processor (DSP).

ANNA chip Architechture

ANNA chip application For Handwritten digits implement Error rate :about 5% maximum of 10 9 connections per second (10 GC/s), actually much lower

CAVIAR (Convolution AER Vision Architecture for Real-Time) (Addressed-Event Representation,AER) 卷积器不需 要乘法器计算卷积 4 mixed-signal AER chips 论文太长太难(主要用 AER ,也不好找资料)

CAVIAR

CNP(ConvNet Processor)(2009) A single FPGA with an external memory module

CNP Architechture

2D Conv Parallel pipelined 逐行扫描

APP to face detection (Training Architecture)

Network Architecture Built and trained on a computer using the Lush language,and compiled to the CNP using the automatic ConvNet compiler. Trained on images, for testing. Roughly 3% equal error rate after 5 epochs’ training.

Speed 4 billion connections per second on average Time spent on pre/post processing and data fetcing Process 512 X 384 greyscale image(LeNet) frames per second

2010/Hardware Accelerated Convolutional Neural Networks

2010

2013_ The Multi 2D Systolic Design

Architecture

Reference ISCAS2010_Convolutional_Networks_and_Applications_in_Vision IEEE1992_Application of the ANNA Neural Network IEEE2009_CAVIAR_Canon corparationCAVIAR 2009_CNP_AN_FPGA_BASED_PROCESSOR_FOR_CONVOLUTIONAL_NET WORKS ISCAS2010_Hardware_Accelerated_Convolutional_Neural ICECS2013The_Multi_2D_Systolic_Design_and_Implementation_of_Con volutional_Neural_Networks