Janusz Starzyk, Yongtao Guo and Zhineng Zhu, Ohio University, Athens, OH 45701, U.S.A. April 27th, 2003.

Presentation transcript:

Janusz Starzyk, Yongtao Guo and Zhineng Zhu, Ohio University, Athens, OH 45701, U.S.A. April 27th, 2003

OUTLINE
– INTRODUCTION: Biological Neural Networks; Traditional Hardware Implementation
– BACKGROUND: Advantages & Algorithm; Self-Organizing Principle, Process and Simulation; Hardware Architecture
– NEURON ARCHITECTURE
– ROUTING: Scheme; Structure
– SIMULATION
– 3D SOLAR
– SUMMARY

Biological Neural Networks
Figure labels: cell body (from IFC’s webpage; Dowling, 1998, p. 17).
Introduction | Background | Neuron Architecture | Routing | Summary | Simulation | 3D SOLAR

Traditional ANN Hardware
– The quadratic relationship between routing and the number of neurons makes traditional ANNs wire-dominated.
– Limited hardware resources, especially routing resources, limit the growth of VLSI NN chip complexity.

Self-Organizing Learning ARray (SOLAR)
– Advantages: self-organizing; reconfigurable; expandable to multiple chips; linear relationship between routing and the number of neurons
– Algorithm: entropy-based neural network learning algorithm
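
The contrast between quadratic and linear routing growth can be illustrated with a small count. This is only a sketch: it assumes a fully connected traditional network and a fixed number of inputs per neuron for the linear case. The names `full_connections`, `local_connections` and the value `fan_in=4` are hypothetical and do not come from the slides.

```python
def full_connections(n: int) -> int:
    """Directed links needed when every neuron feeds every other neuron."""
    return n * (n - 1)


def local_connections(n: int, fan_in: int) -> int:
    """Links needed when each neuron keeps a fixed number of inputs.

    fan_in is an assumed constant, not a figure from the presentation.
    """
    return n * fan_in


# Print neuron count, quadratic link count, linear link count.
for n in (10, 100, 1000):
    print(n, full_connections(n), local_connections(n, fan_in=4))
```

Growing the network tenfold multiplies the fully connected count by roughly a hundred, while the fixed fan-in count grows only tenfold.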

Self-organizing Principle
The information index and subspace information deficiencies are related as (equation not preserved in this transcript). With subsequent divisions, the accumulated information deficiency is expressed as a product of the information deficiencies in each subsequent partition.
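
The product rule for the accumulated information deficiency can be sketched as follows. The function name and the example values are hypothetical; the slide gives no numbers, and the relation to the information index is not reproduced here.

```python
import math


def accumulated_deficiency(partition_deficiencies):
    """Multiply the information deficiencies of successive partitions."""
    return math.prod(partition_deficiencies)


# Hypothetical deficiencies for three successive partitions.
example = [0.8, 0.5, 0.9]
print(accumulated_deficiency(example))
```

With these illustrative values the accumulated deficiency is about 0.36, smaller than any single factor, so each further partition reduces the deficiency passed on.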

Neuron’s Simulation Structure
Neuron inputs:
– System clock
– Data input
– Threshold control input (TCI)
– Input information deficiency (ID)
Figure labels: this neuron; other neurons (nearest neighbor neuron, remote neurons); system clock; TCI; ID.


Single Neuron’s Structure
– PicoBlaze 8-bit microcontroller from Xilinx
– Dual-port 256x16-bit memory used as instruction storage
– Dynamic reconfiguration
– The full-connection implementation mimics the dense connection scheme among neighboring neurons
Figure labels: neural controller (uses KCPSM) with ports instruction, in_port, clk, int, reset, address, out_port, port_id, read_strobe, write_strobe; 30 inputs; dual-port memory with address, addr_bus, data_bus, clk, enable, instruction.

Neuron Array Structure
Array neurons’ organization:
– Identical neuron architecture
– The programming contents can be dynamically updated via the configuration bus
– 3D expansion of these chips represents the sparse connections between remote neurons

Routing Structure
– Resources of a single sub-network are reused to implement multiple sub-networks sequentially
– Many resources are saved
– The cost of additional processing time is not a problem
– A new dynamically reconfigurable routing and addressing scheme is proposed

Cont’d
– Two basic elements: the routing channel and the neuron cell
– Shift register
– Data addressing and clock cycle counting
– Connection information is stored in individual neurons
– Column reordering

Routing Scheme
Figure caption: hardware reuse to implement 2 sub-networks.
– The network system is implemented by a single sub-network.
– Two groups of addresses: the first group of 4 addresses stores the number of clock cycles corresponding to the 4 inputs in the first sub-network, and the second group of 5 addresses stores the number of clock cycles corresponding to the 5 inputs in the second sub-network.
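
One way to picture the clock-cycle addressing is a neuron that counts cycles while data shifts past it in the routing channel and latches a value whenever the count matches a stored address. The sketch below is a simplified software model under that assumption; `capture_inputs`, the address values and the data streams are hypothetical and do not reproduce the actual hardware.

```python
def capture_inputs(stream, cycle_counts):
    """Return the values seen at the neuron's tap on the stored clock cycles.

    stream: values passing the tap, one per clock cycle (assumed model).
    cycle_counts: clock-cycle numbers stored in the neuron's address group.
    """
    wanted = set(cycle_counts)
    captured = {}
    for cycle, value in enumerate(stream):  # one shift per clock cycle
        if cycle in wanted:
            captured[cycle] = value
    return [captured[c] for c in cycle_counts]


# Two address groups: 4 inputs for the first sub-network, 5 for the second.
address_groups = [[0, 2, 3, 5], [1, 2, 4, 6, 7]]
streams = [list(range(10, 18)), list(range(20, 28))]  # illustrative data

# The same neuron hardware is reused for each sub-network in turn.
for group, stream in zip(address_groups, streams):
    print(capture_inputs(stream, group))
```

Because the connection information lives in the neuron's own address groups, switching to the second sub-network only means reading the next group; the channel itself stays the same.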

Routing Structure
Figure caption: routing channel with a microcontroller.
– PicoBlaze
– CLB-based
– 8 parallel shift registers corresponding to the PicoBlaze bus width

Self-organizing Process
Matlab simulation. Figure panels: initial interconnection; learning process.

Simulation Results
Training data: credit card data (ftp:cs.uci.edu)
SOLAR & other algorithms (classifier models)

Method   | Miss Detection Probability | Method   | Miss Detection Probability
CAL5     | .131                       | Naivebay | .151
DIPOL92  | .141                       | CASTLE   | .148
Logdisc  | .141                       | ALLOC    |
SMART    | .158                       | CART     | .145
C        |                            | NewID    | .181
IndCART  | .152                       | CN2      | .204
Bprop    | .154                       | LVQ      | .197
RBF      | .145                       | Quadisc  | .207
Baytree  | .171                       | Default  | .440
ITule    | .137                       | k-NN     | .181
AC2      | .181                       | SOLAR    | .135

Behavioral VHDL Model
– All functions and signal variables in the packages are shared, and program execution is functionally interleaved.
– The lower-level package describes system input and output, and updates the network memory.
– The higher-level packages encapsulate new system functions based on the functions of the lower-level packages.
– The highest-level design function implements the system organization and management.

SW/HW VHDL Cosimulation
Figure labels: software model in behavioural VHDL; hardware model in structural VHDL.
– A software process: written in behavioral VHDL, which is not synthesizable
– A hardware process: written in synthesizable RTL VHDL
– HW/SW communication: FSM and FIFOs

NN Example
– The initial connections are shown in solid lines. Every neuron adds its two inputs: for example, neuron 3 adds input 2 and the output of neuron 1; neuron 5 adds the outputs of neurons 2 and 3, etc.
– After dynamic reconfiguration, neuron 3 takes its inputs from the outputs of neurons 1 and 2, and neuron 5 takes its inputs from the outputs of neurons 3 and 4 (shown by the dotted lines).
– The results read out from the chip via the PCI bus are shown in the Matlab console. The “initial” values show the primary input values (6 and 2) and the neuron outputs for two rows of neurons; the “updated” values show the inputs and neuron outputs after the dynamic reconfiguration step.
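
The example can be mimicked in software with a small adder network. This is a sketch only: the slide states the wiring of neurons 3 and 5 and the primary inputs 6 and 2, while the wiring chosen here for neurons 1, 2 and 4 is a hypothetical assumption, so the printed numbers are illustrative and are not the values read back from the chip.

```python
def evaluate(inputs, wiring):
    """Compute each neuron's output; every neuron adds its two sources."""
    values = dict(inputs)
    for neuron in sorted(wiring):  # assumes sources have lower ids
        a, b = wiring[neuron]
        values[neuron] = values[a] + values[b]
    return {n: values[n] for n in sorted(wiring)}


inputs = {"in1": 6, "in2": 2}  # primary input values from the slide

initial = {
    1: ("in1", "in2"),  # assumed wiring
    2: ("in1", "in2"),  # assumed wiring
    3: ("in2", 1),      # from the slide: input 2 + output of neuron 1
    4: (1, 2),          # assumed wiring
    5: (2, 3),          # from the slide: outputs of neurons 2 and 3
}

# Dynamic reconfiguration: only neurons 3 and 5 change their sources.
updated = dict(initial)
updated[3] = (1, 2)
updated[5] = (3, 4)

print("initial:", evaluate(inputs, initial))
print("updated:", evaluate(inputs, updated))
```

Only the entries for neurons 3 and 5 are rewritten, which mirrors the point that reconfiguration changes connections without touching the neuron hardware itself.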

NN Example
– “Enable_bus” selects a neuron to be configured. Once the configuration process for all neurons is over, the neuron outputs are ready to be read out.
– Updating any neuron’s configuration does not affect the other neurons.
– In this example, we update the connections of neurons 3 and 4, represented by “Enable_bus” contents 4 and 5. The simulation results after updating correspond to the values read back from hardware in the real experiment, shown as the “updated” values in the previous slide.
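
A minimal software model of per-neuron configuration is sketched below. It assumes that the “Enable_bus” value acts as a plain index selecting one neuron's instruction memory; the real encoding is not given in the slides, and the class and method names are hypothetical. The memory size follows the 256x16-bit instruction storage mentioned earlier.

```python
MEMORY_DEPTH = 256   # words per neuron (256x16-bit instruction memory)
WORD_MASK = 0xFFFF   # keep instruction words to 16 bits


class NeuronArray:
    """Array of neurons, each with its own instruction memory."""

    def __init__(self, neuron_count):
        self.memories = [[0] * MEMORY_DEPTH for _ in range(neuron_count)]

    def write_config(self, enable_bus, address, word):
        """Write one word into the neuron selected by enable_bus.

        Treating enable_bus as an index is an assumption of this sketch.
        """
        self.memories[enable_bus][address] = word & WORD_MASK


array = NeuronArray(neuron_count=6)
array.write_config(enable_bus=4, address=0, word=0x1234)
print(hex(array.memories[4][0]), hex(array.memories[3][0]))
```

Since every neuron owns a separate memory, a write selected by one “Enable_bus” value leaves the other neurons' contents unchanged.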

Neurons Prototyping
Problem: memory usage for every neuron needs to be optimized to avoid a resource utilization bottleneck.

SW/HW codesign of SOLAR
Figure labels: software running on PC; PCI bus; JTAG programming; hardware board (Virtex XCV800 FPGA); dynamic configuration.
– FPGAs make a scalable fabric
– FPGAs communicate and execute in parallel
– FPGAs operate at the bit and byte levels
– In 1990, an FPGA had […] LUTs
– In 2003, an FPGA has 100,000 4-LUTs
– Multiple megabits of memory

PCB Design
A single SOLAR PCB contains 2x2 Virtex XCV1000 chips.

Two PCB Boards
Figure labels: interface board; SOLAR board.

SOLAR Board
– Expandable
– Parallel
– Local interconnect
– Power management
– Broadcast configuration
– Dynamic self-reconfiguration

RACK and CUBE SOLAR
– 3D SOLAR system, containing 5x5 SOLAR racks with close to 400 million gates
– It will have computing cells (neurons) working in parallel to process the incoming data
– Can be used for other massively parallel computations (e.g. silicon wind tunnel)

SOLAR Development
– Single chip: 1 million gates
– SOLAR board + interface board (4 chips, 2x2) + (2 chips): 4 million gates
– Rack (4 boards, 1x4): 16 million gates
– System (25 cabinets, 5x5): 400 million gates
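
The gate counts at each level follow from multiplying up the hierarchy, as the short calculation below shows. The constant names are hypothetical, and the calculation assumes that only the 2x2 chips on the SOLAR board are counted, which is what the 4 million figure suggests.

```python
GATES_PER_CHIP = 1_000_000   # single chip
CHIPS_PER_BOARD = 2 * 2      # SOLAR board, 2x2 chips (interface board not counted)
BOARDS_PER_RACK = 1 * 4      # rack, 1x4 boards
RACKS_PER_SYSTEM = 5 * 5     # system, 5x5

board_gates = GATES_PER_CHIP * CHIPS_PER_BOARD
rack_gates = board_gates * BOARDS_PER_RACK
system_gates = rack_gates * RACKS_PER_SYSTEM

print(board_gates, rack_gates, system_gates)
```

The three results are 4 million, 16 million and 400 million gates, matching the figures on the slide.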

Conclusion
SOLAR is different from traditional neural networks …
– Learning and organization are based on local information
– Hardware-oriented expandable parallel architecture
– Dynamically reconfigurable hardware structure
– The number of interconnections grows linearly with the number of neurons
– Data-driven self-organizing learning hardware

Questions?