Janusz Starzyk and Yongtao Guo School of Electrical Engineering and Computer Science Ohio University, Athens, OH 45701, U.S.A. September, 2003

OUTLINE
1. Introduction
2. SOLAR Principle
3. Simulation Results
4. HW/SW Co-Simulation
5. Hardware Organization
6. Conclusion

Self-Organizing Learning Array (SOLAR)
New learning algorithm
– Multi-layer structure and on-line learning
– Local and sparse interconnections
– Entropy-based self-organized learning
Superior performance
– Parallel computing organization
– Low power dissipation
– Efficient communication
– High chip utilization rate
Potential to be a leading technology in machine learning
– Paves the way to machine intelligence application areas including pattern recognition, intelligent control, signal processing, robotics and biological research

DARPA: Cognitive Information Processing Technology
Wanted: a machine that
- Can reason, using substantial amounts of knowledge
- Can learn from its experiences so that its performance improves with knowledge and experience
- Can explain itself and can accept direction
- Is aware of its own behavior and reflects on its own capabilities
- Responds in a robust manner to a surprise

Self-Organizing Learning ARray (SOLAR)
Figure source: Dowling, 1998, p. 17

Self-organizing Principle
Here the three symbols denote the probability of each class, the attribute probability, and the joint probability, respectively.
Neuron self-organization includes:
- Selection of inputs
- Choosing the transformation function
- Setting the threshold
- Providing output probabilities
- Setting output control
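The threshold-setting step can be illustrated with a short sketch. The formula on the slide did not survive, so the measure below is an assumption: a normalized information gain built from the same three quantities the slide names (class, attribute and joint probabilities). The function names `entropy`, `information_index` and `best_threshold` are hypothetical and not taken from the SOLAR sources.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a list of positive counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_index(values, labels, threshold):
    """Assumed measure: 1 - H(class | split) / H(class) for the split value < threshold."""
    n = len(values)
    class_entropy = entropy(list(Counter(labels).values()))
    if class_entropy == 0.0:
        return 0.0  # a single class carries no uncertainty to remove
    conditional = 0.0
    for side in (True, False):
        subset = [lab for v, lab in zip(values, labels) if (v < threshold) == side]
        if subset:
            # Weight each side's class entropy by its attribute probability.
            conditional += len(subset) / n * entropy(list(Counter(subset).values()))
    return 1.0 - conditional / class_entropy


def best_threshold(values, labels):
    """Scan every observed value as a candidate threshold and keep the best one."""
    candidates = sorted(set(values))
    return max(candidates, key=lambda t: information_index(values, labels, t))


if __name__ == "__main__":
    values = [1, 2, 3, 4]
    labels = [0, 0, 1, 1]
    print(best_threshold(values, labels))
```

A neuron in this sketch would keep the threshold whose split removes the most class uncertainty in its subspace; an index of 1.0 means the split separates the classes completely.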

Self-Organizing Process and Neuron Structure

Self-organizing Process: Matlab Simulation
Figure panels: initial interconnection; learning process

Synthetic Data Classification

Credit Card Data Set
Table columns: Method, Error Rate
Methods, in the order listed on the slide: Cal, SOLAR, Itrule, Discrim, Logdisc, DIPOL, CART, RBF, CASTLE, NaiveBay, Backprop, C, SMART, Baytree, k-NN, NewID, LVQ, ALLOC, Quadisc, Default, Kohonen (failed)
Figure: SOLAR self-organizing structure

SW/HW Codesign of SOLAR
Diagram labels: software running on a PC; PCI bus; hardware board with a Virtex XCV800 FPGA; JTAG programming; dynamic configuration

Cosimulation - What and Why?
Cosimulation
– Simulation of heterogeneous systems whose hardware and software components are interacting
Benefits of cosimulation
– Verifying correct functionality of the target even before hardware is built
– Profiling the dynamic behavior
– Identifying the performance bottleneck
– Preventing problems such as over-design or under-design related to system integration
– Saving system development cost and shortening the development cycle

Traditional Cosimulation Environment
– A software process: written in a high-level language, such as C/C++
– A simulation process of the hardware model: written in a hardware description language, such as VHDL
– Inter-process communication (IPC) routines: connect the hardware process and the software process
Diagram labels: software model (C program) with IPC routines; hardware model (VHDL) with foreign IPC procedures; IPC link between the two simulators

Traditional Cosimulation
- To perform cosimulation, two simulators must be combined and complex IPC must be developed. These IPC routines are error-prone because they have to handle various data formats and processed signals.
- Especially when the focus is on the hardware part, we want the software part to be minimal and the HW/SW communication to be simple and reliable.

SOLAR Cosimulation
– A software process: written in behavioral VHDL, which is not synthesizable
– A hardware process: written in RTL VHDL, which is synthesizable
– HW/SW communication: FSM and FIFOs
Diagram labels: software model (behavioral VHDL); hardware model (RTL VHDL); FSM and FIFOs between them; one simulator

SOLAR Cosimulation
- SOLAR cosimulation uses a single VHDL simulator, so complex, error-prone IPC is avoided. Data formats and related problems can be handled easily.
- The HW/SW interface is implemented by several FIFOs controlled by an FSM, which is simple, reliable and easily modified.
- File I/O functions are used to simplify the design of the software part when the focus is on implementing the hardware part.
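The idea of an FSM controlling FIFOs between the two sides can be sketched in software. The slides implement this in VHDL; the Python model below is only an illustration, and its state names, the `FifoInterface` class and the `compute` callback (a stand-in for the hardware model) are all hypothetical.

```python
from collections import deque
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LOAD = auto()     # drain the input FIFO into on-chip memory
    COMPUTE = auto()  # stand-in for the hardware computation
    UNLOAD = auto()   # push results into the output FIFO


class FifoInterface:
    """Toy HW/SW interface: two FIFOs sequenced by a small state machine."""

    def __init__(self, compute):
        self.input_fifo = deque()
        self.output_fifo = deque()
        self.memory = []
        self.results = []
        self.state = State.IDLE
        self.compute = compute  # hypothetical stand-in for the hardware model

    def write(self, word):
        """Software side: queue one word for the hardware."""
        self.input_fifo.append(word)

    def read(self):
        """Software side: fetch one result, or None if nothing is ready."""
        return self.output_fifo.popleft() if self.output_fifo else None

    def step(self, start=False):
        """Advance the state machine by one clock tick."""
        if self.state is State.IDLE:
            if start:
                self.state = State.LOAD
        elif self.state is State.LOAD:
            if self.input_fifo:
                self.memory.append(self.input_fifo.popleft())
            else:
                self.state = State.COMPUTE
        elif self.state is State.COMPUTE:
            self.results = [self.compute(w) for w in self.memory]
            self.memory.clear()
            self.state = State.UNLOAD
        elif self.state is State.UNLOAD:
            if self.results:
                self.output_fifo.append(self.results.pop(0))
            else:
                self.state = State.IDLE


def run_transaction(iface, words):
    """Send words, run the FSM until it returns to IDLE, and collect the results."""
    for w in words:
        iface.write(w)
    iface.step(start=True)
    while iface.state is not State.IDLE:
        iface.step()
    out = []
    while (word := iface.read()) is not None:
        out.append(word)
    return out


if __name__ == "__main__":
    iface = FifoInterface(compute=lambda w: 2 * w)
    print(run_transaction(iface, [1, 2, 3]))
```

Because both sides only touch the FIFOs, either side can be swapped out (for example, replacing the `compute` stand-in) without changing the other, which is the property the slide describes as "easily modified".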

Co-simulation System Decomposition
Diagram, three parts:
- System architecture modelling (behavioral VHDL): Main, Initialization, File I/O, SOLAR Training, "Over?" decision (No/Yes)
- Interface modeling (RTL VHDL): input FIFO, output FIFO, FSM interface control
- Self-organizing learning architecture (structural VHDL): OP, EBE, REG, FIFO, MEM

SW Organization: VHDL Model
All functions and signal variables in the packages are shared, and program execution is functionally interleaved. The lower-level package describes system input and output, as well as the initialization and update of the memory elements in the network. The higher-level packages encapsulate new system functions built on the functions of the lower-level packages. The highest-level function, which represents the software part of the overall system, implements system organization and management.

Single Neuron’s Hardware Architecture
Figure 4: Single neuron’s learning architecture
Block labels: D, REG CTRL, R (x4), FIFO/DMA CTRL, MAIN CONTROLLER, OP, 1024x32 FIFO, INTERFACE (x2), M, ALU, M

Interface Process
Timing diagram between SW and HW. Labeled events along the time axis: configuration, send data, receive data, conf done, start, wait command, send command, over, read registers, DMA request, …

Interface Modeling
Figure 5: Interface modeling using FSM & FIFO
Diagram labels: software (behavioral VHDL) with class, training and other blocks; interface FIFOs; hardware (structural VHDL) with memory module, Ctrl and others

Interface Simulation Small Training Data Set

System Synchronized Work
Figure: software work, hardware work and interface work plotted against time

Incremental Prototyping
Overall system design can be accelerated by replacing each HW subcomponent with real hardware once it has been successfully simulated.
Diagram labels (over time t): HW function completely defined and prototyped; HW function VHDL-simulated (incremental part)

EBE Simulation
The main procedures are:
1. Sending data from software to chip memory
2. Triggering the start signal
3. ALU calculation for all data
4. Moving calculated results to intermediate memory
5. Threshold scanning and ID calculation
6. Updating the intermediate values
7. Data movement if the current ID is optimal
8. Repeating steps 3 to 6 until all functions are scanned
9. Sending data from chip to software
In this simulation waveform, the signals “Opt_Threshold” and “ID” represent the optimal threshold and the corresponding information index deficiency for this particular training neuron in its learning subspace.

EBE Prototyping
SOLAR training mapped onto Virtex (57.8% logic, 60.3% route)
Minimum period: ns (maximum frequency: MHz)
Minimum input arrival time before clock: ns
Maximum output required time after clock: ns

Run Time

Main operation                  CLK number per data
PC data to in-chip memory       38
ALU calculation                 16
Threshold scanning              4
ID calculation                  7
Memory data movement            1
In-chip FIFO to PC              1
(x7 functions)

For instance, suppose a particular neuron has 1024 data points in its subspace.
PC to chip: 38 x 1024 = 38912 CLKs
ALU calculation: 16 x 1024 = 16384 CLKs
Threshold scan & ID calculation (maximum): (4 x 1024 + 7) x 1024 = 4201472 CLKs
Data movement (maximum): 1 x 1024 = 1024 CLKs
Chip to PC: 1 x 1024 = 1024 CLKs
Other (starting sequence, wait, handshaking, etc.): 20 x 1024 = 20480 CLKs
Total: ( )x = CLKs
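The per-stage clock counts above follow directly from the per-data figures and can be reproduced with a few lines of arithmetic. This is a sketch only: the dictionary keys and function names are hypothetical, and the single-pass sum it prints is not the slide's "Total" line, whose terms and "x7 functions" factor did not survive in the transcript.

```python
SAMPLES = 1024  # subspace size used in the slide's example

# Clocks per data item for the stages that scale linearly with the data.
CLKS_PER_SAMPLE = {
    "pc_to_chip": 38,
    "alu_calculation": 16,
    "data_movement": 1,
    "chip_to_pc": 1,
    "other": 20,  # starting sequence, wait, handshaking, etc.
}


def stage_cycles(samples=SAMPLES):
    """Return the clock count of each stage for one pass over `samples` items."""
    cycles = {name: per_item * samples for name, per_item in CLKS_PER_SAMPLE.items()}
    # Worst case from the slide: each of the `samples` candidate thresholds costs
    # a 4-clock scan over all data plus a 7-clock ID calculation.
    cycles["threshold_scan_and_id"] = (4 * samples + 7) * samples
    return cycles


def single_pass_sum(samples=SAMPLES):
    """Sum of all stages for one pass (an assumption, not the slide's total)."""
    return sum(stage_cycles(samples).values())


if __name__ == "__main__":
    for name, clks in stage_cycles().items():
        print(f"{name}: {clks} CLKs")
    print(f"single pass: {single_pass_sum()} CLKs")
```

The threshold scan dominates because it grows quadratically with the subspace size, while every other stage grows linearly.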

Prototyping Board

Future Work - System SOLAR

SOLAR will grow

Level                            Size
Chip (Virtex XCV1000)            1 million gates
Board (6 chips, 2x3)             6 million gates
Rack (4 boards, 1x4)             24 million gates
System (16 cabinets, 4x4)        Half a billion gates

Questions