Frank Vahid, UCR 1 Building Fake Body Parts: Digital Mockups Frank Vahid Univ. of California, Riverside Support provided by NSF, SRC, and CareFusion.

Presentation transcript:

Frank Vahid, UCR 1 Building Fake Body Parts: Digital Mockups Frank Vahid Univ. of California, Riverside Support provided by NSF, SRC, and CareFusion

Frank Vahid, UCR 2 Building fake body parts: How do we test medical equipment software?

Frank Vahid, UCR 3 Simulation: Slow/Inaccurate
Accurate simulation is slow: 2-3 minutes to simulate one breath accurately.
Decrease accuracy for real-time.
Weibel lung complexity: 4 gen: 32 ODEs; 6 gen: 128 ODEs; 8 gen: 512 ODEs; 10 gen: 2048 ODEs.
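
The four data points on this slide are consistent with the ODE count doubling with each generation, that is, 2^(g+1) ODEs for g generations. The slide does not state this formula; it is only an observation that fits the listed numbers. The function name below is hypothetical.

```python
def weibel_ode_count(generations: int) -> int:
    """ODE count that fits the slide's four data points: 2 ** (generations + 1)."""
    return 2 ** (generations + 1)


# Reproduce the list from the slide.
for gen in (4, 6, 8, 10):
    print(f"{gen} gen: {weibel_ode_count(gen)} ODEs")
```

Each added pair of generations quadruples the number of equations, which is why an accurate model quickly becomes too slow for real-time simulation.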

Frank Vahid, UCR 4 Mockups
Physical mockup: the device (processing core, transducers) interacts with physical phenomena.
Digital mockup: physical phenomena are disconnected; intercepted transducer packets travel over digital communication to transducer models and an environment model.
How to run in real-time?
Video: be.com/watch?feature=player_embedded&v=rb0ik1HopBk

Frank Vahid, UCR 5 Physical models are inherently parallel
[Figure: ODE dependency graph with nodes V[1],F[1]; V[2],F[2]; V[7],F[7]]

Frank Vahid, UCR 6 GPUs: Tried, failed
–GPU research group also
–(results later)

Frank Vahid, UCR 7 FPGAs: Sw circuits (parallel)
C code for FIR filter: for (i=0; i < 128; i++) y += c[i] * x[i];
On a processor: 1000's of instructions, several thousand cycles.
As a circuit on an FPGA: ~7 cycles (though slower clock).
Speedup > 10x-100x
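
One plausible reading of the "~7 cycles" figure: a circuit can perform all 128 multiplications at once and then add the products pairwise in a tree, which needs log2(128) = 7 levels. The slide does not describe the circuit, so this structure is an assumption. The sketch below mimics it in software and counts the tree levels; `fir_tree_sum` is a hypothetical name.

```python
def fir_tree_sum(c, x):
    """Multiply all taps, then sum pairwise; returns (result, tree_levels)."""
    # In hardware, all of these multiplications could run in one parallel stage.
    values = [ci * xi for ci, xi in zip(c, x)]
    levels = 0
    while len(values) > 1:
        if len(values) % 2:  # pad odd-length levels with a zero
            values.append(0)
        # One adder-tree level: every adjacent pair is summed at the same time.
        values = [values[i] + values[i + 1] for i in range(0, len(values), 2)]
        levels += 1
    return values[0], levels


c = list(range(128))
x = [1] * 128
total, levels = fir_tree_sum(c, x)
print(total, levels)  # 8128 7
```

A sequential processor walks through the same 128 multiply-add operations one at a time, plus loop overhead, which is where the thousands of cycles come from.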

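Frank Vahid, UCR 8 FPGAs "101" (A Quick Intro)
[Figure: a lookup table (LUT) built from a 4x2 memory, with address inputs a1, a0 driven by a, b and data outputs d1, d0 producing F, G; a 2x2 switch matrix (SM) with ports w, x, y, z; an FPGA made of LUTs and SMs, with inputs a, b, c and outputs D, E]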
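
A lookup table can be pictured as a small memory: the logic inputs form the address, and the stored word holds the output bits. The sketch below models the 4x2 memory from the figure. The functions placed in F and G (AND and XOR) are chosen only for illustration and are not taken from the slide.

```python
def make_lut(f, g):
    """Build a 4x2 memory: one row per (a, b) address, two data bits (F, G) per row."""
    return [(f(a, b), g(a, b)) for a in (0, 1) for b in (0, 1)]


def lut_read(lut, a, b):
    """Address the memory with a1 = a and a0 = b; return the stored (F, G) pair."""
    return lut[(a << 1) | b]


# Hypothetical contents: F = a AND b, G = a XOR b.
lut = make_lut(lambda a, b: a & b, lambda a, b: a ^ b)
print(lut_read(lut, 1, 1))  # (1, 0)
```

Reprogramming the FPGA amounts to rewriting memories like this one, which is why the same chip can implement any circuit that fits.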

Frank Vahid, UCR 9 Differential Equation Processing Element (DEPE): General PE
Differential equations can't be solved exactly, so use iterative approximation (Euler, RK4).
Computes equation solutions at a given timestep (e.g., 0.1 ms timesteps).
[Figure: FPGA digital mockup containing the DEPE, connected through an interface to the device under test]
Huang, Vahid, Givargis. A Custom FPGA Processor for Physical Model Ordinary Differential Equation Solving. Embedded Systems Letters, Dec,
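
As a rough illustration of the two iterative methods named on this slide, the sketch below applies textbook Euler and RK4 steps to a simple test equation, y' = -y, whose exact solution is known. The test equation, step size, and function names are assumptions made for the example; they do not come from the lung model or the DEPE design.

```python
import math


def euler_step(f, t, y, h):
    """Forward Euler: one slope evaluation per timestep."""
    return y + h * f(t, y)


def rk4_step(f, t, y, h):
    """Classical fourth-order Runge-Kutta: four slope evaluations per timestep."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(step, f, y0, h, n):
    """Advance n timesteps of size h from t = 0 using the given step function."""
    t, y = 0.0, y0
    for _ in range(n):
        y = step(f, t, y, h)
        t += h
    return y


def decay(t, y):
    """Test ODE y' = -y, with exact solution y(t) = exp(-t)."""
    return -y


h, n = 0.1, 10
exact = math.exp(-1.0)
euler_err = abs(integrate(euler_step, decay, 1.0, h, n) - exact)
rk4_err = abs(integrate(rk4_step, decay, 1.0, h, n) - exact)
print(f"Euler error: {euler_err:.2e}, RK4 error: {rk4_err:.2e}")
```

RK4 costs four evaluations per step but is far more accurate at the same step size, so the choice trades per-step work against how small the timestep must be.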

Frank Vahid, UCR 10 Single DEPE
[Chart] Platforms compared: CPU(1),(4): Pentium IV, 3.0 GHz; DEPE: Xilinx Virtex6-240T; Microblaze: LUTs.

Frank Vahid, UCR 11 Homogeneous network of general PEs
Map ODEs to a homogeneous PE network: a synthesis tool takes the ODE dependency graph (V[1],F[1]; V[2],F[2]; V[7],F[7]), schedules the ODEs, and maps them onto PEs (PE1, PE2, PE3; 100s of PEs) in the FPGA digital mockup, which connects to the device through an interface.
Huang, Vahid, Givargis. Synthesis of networks of custom processing elements for real-time physical system emulation. Transactions on Design Automation of Electronic Systems (TODAES). *To appear (Dec-2012)

Frank Vahid, UCR 12 Homogeneous network of general PEs FPGA Digital mockup

Frank Vahid, UCR 13 Homogeneous network of general PEs
ODE mapping via simulated annealing
[Figures: mapping after 10K iterations and after 150K iterations]
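
To make the idea concrete, here is a minimal simulated-annealing sketch that places the nodes of a small dependency graph onto a grid of PEs. The cost function (total Manhattan distance between dependent nodes), the linear cooling schedule, the 3x3 grid, and the seven-node tree are all assumptions made for illustration; the slides do not specify the tool's actual cost function or parameters.

```python
import math
import random


def wire_cost(edges, place):
    """Sum of Manhattan distances between PEs that hold dependent ODEs."""
    return sum(abs(place[u][0] - place[v][0]) + abs(place[u][1] - place[v][1])
               for u, v in edges)


def anneal_mapping(nodes, edges, slots, iterations=10_000, t0=2.0, seed=0):
    """Return (best placement, best cost) found by swapping PE slot occupants."""
    rng = random.Random(seed)
    slots = list(slots)
    rng.shuffle(slots)
    # Pad with None so a swap can also move an ODE into an empty PE slot.
    occupant = list(nodes) + [None] * (len(slots) - len(nodes))

    def placement():
        return {n: s for n, s in zip(occupant, slots) if n is not None}

    cost = wire_cost(edges, placement())
    best, best_cost = placement(), cost
    for k in range(iterations):
        temp = t0 * (1 - k / iterations) + 1e-9  # linear cooling schedule
        i, j = rng.sample(range(len(slots)), 2)
        occupant[i], occupant[j] = occupant[j], occupant[i]
        new_cost = wire_cost(edges, placement())
        # Always accept improvements; accept worse moves with Metropolis probability.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = placement(), cost
        else:
            occupant[i], occupant[j] = occupant[j], occupant[i]  # undo the swap
    return best, best_cost


# Hypothetical example: a seven-node tree of ODE pairs on a 3x3 PE grid.
nodes = list(range(1, 8))
edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7)]
slots = [(r, c) for r in range(3) for c in range(3)]
best, best_cost = anneal_mapping(nodes, edges, slots)
print(best_cost, best)
```

Accepting occasional worse moves early on lets the search escape poor local arrangements, which is consistent with the mapping looking better after more iterations.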

Frank Vahid, UCR 14 Homogeneous network of general PEs

Frank Vahid, UCR 15 Homogeneous network of general PEs – FPGA Usage
150K LUTs available on Virtex6-240T
Demo: utube.com/watch?v=ThUKVhqoA3Q

Frank Vahid, UCR 16 Custom Processing Element
Custom PE: custom datapath to solve a specific type of equation; one custom PE for each ODE type.
Example equations: $V' = F_1 - F_2$ and $F' = P_1 - P_2 - (F \cdot C_R) \cdot C_L$
[Figure: custom PE datapath with controller, constant ROM, data RAM, input select, and SUB and MUL units, inside the FPGA digital mockup behind the interface]
Huang, Vahid, Givargis. Synthesis of networks of custom processing elements for real-time physical system emulation. Transactions on Design Automation of Electronic Systems (TODAES). *To appear (Dec-2012)
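
The sketch below shows what a datapath fixed to these two equations computes in one Euler timestep. The grouping of terms follows the slide text as extracted, which may have lost parentheses from the original figure. The function name, argument names, and sample values are hypothetical.

```python
def custom_pe_step(v, f, f1, f2, p1, p2, c_r, c_l, h):
    """One Euler step of V' = F1 - F2 and F' = P1 - P2 - (F * C_R) * C_L."""
    dv = f1 - f2                     # SUB unit
    df = p1 - p2 - (f * c_r) * c_l   # SUB and MUL units; constants would come from ROM
    return v + h * dv, f + h * df


# Hypothetical values, chosen only to exercise the datapath.
v_next, f_next = custom_pe_step(v=0.0, f=1.0, f1=2.0, f2=0.5,
                                p1=3.0, p2=1.0, c_r=0.5, c_l=2.0, h=0.1)
print(v_next, f_next)
```

Because the operation sequence never changes, a custom PE needs no instruction fetch or decode, which is one reason it can run faster than a general PE.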

Frank Vahid, UCR 17 Custom Processing Element

Frank Vahid, UCR 18 Custom Processing Element – FPGA Usage

Frank Vahid, UCR 19 Networks of Heterogeneous Processing Elements
General PE:
–Slow, flexible (can solve any type of ODE)
Custom PE:
–Fast, inflexible (solves only one type of ODE)
Multi-type PE:
–Combines multiple types of ODEs into a single custom PE
Huge solution space: How to choose the types of PEs? How many PEs to allocate? How to bind ODEs to PEs?
Huang, Miller, Vahid, Givargis. Synthesis of Heterogeneous Processing Elements for Physical System Emulation. CODES+ISSS 2012, Oct,

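Frank Vahid, UCR 20 Automatic allocation and binding
[Flowchart: simulated annealing. Start from an initial random allocation; the PE allocator produces a new PE allocation; the ODE-to-PE mapper binds ODEs to PEs and reports the cycles of each PE; a "Better solution" check (Y/N) feeds back into the PE allocator; the loop ends with the best solution]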
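
The loop can be sketched as an outer annealer that changes how many PEs of each type are allocated and an inner mapper that binds ODEs to those PEs and reports the busiest PE's cycle count. Everything numeric here is invented for illustration: the PE library, areas, cycle counts, and budget are hypothetical, the mapper is a simple greedy heuristic, and the search starts from one general PE rather than a random allocation.

```python
import math
import random

# Hypothetical PE library: area in LUTs and cycles per supported ODE type.
PE_LIBRARY = {
    "general": {"area": 300, "cycles": {"V": 20, "F": 30}},
    "custom_V": {"area": 100, "cycles": {"V": 4}},
    "custom_F": {"area": 150, "cycles": {"F": 6}},
}


def area(allocation):
    """Total LUT area of an allocation given as {pe_kind: count}."""
    return sum(PE_LIBRARY[kind]["area"] * n for kind, n in allocation.items())


def bind(odes, allocation):
    """Greedy ODE-to-PE mapper: cycles of the busiest PE, or inf if infeasible."""
    pes = [[kind, 0] for kind, n in allocation.items() for _ in range(n)]
    for ode in odes:
        options = [pe for pe in pes if ode in PE_LIBRARY[pe[0]]["cycles"]]
        if not options:
            return math.inf
        # Pick the PE that would finish this ODE earliest.
        pe = min(options, key=lambda p: p[1] + PE_LIBRARY[p[0]]["cycles"][ode])
        pe[1] += PE_LIBRARY[pe[0]]["cycles"][ode]
    return max((load for _, load in pes), default=0)


def anneal_allocation(odes, area_budget, iterations=2000, t0=50.0, seed=0):
    """Search PE counts under an area budget, minimizing the busiest PE's cycles."""
    rng = random.Random(seed)
    current = {kind: 0 for kind in PE_LIBRARY}
    current["general"] = 1  # simple feasible start (assumption; slide uses random)
    cost = bind(odes, current)
    best, best_cost = dict(current), cost
    for k in range(iterations):
        temp = t0 * (1 - k / iterations) + 1e-9
        candidate = dict(current)
        kind = rng.choice(list(PE_LIBRARY))
        candidate[kind] += rng.choice((-1, 1))  # add or remove one PE
        if candidate[kind] < 0 or area(candidate) > area_budget:
            continue
        new_cost = bind(odes, candidate)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            current, cost = candidate, new_cost
            if cost < best_cost:
                best, best_cost = dict(current), cost
    return best, best_cost


odes = ["V", "F"] * 8  # hypothetical workload: eight ODEs of each type
best, best_cost = anneal_allocation(odes, area_budget=1000)
print(best, best_cost)
```

Separating allocation from binding keeps each step simple: the annealer only proposes PE counts, and the mapper scores each proposal.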

Frank Vahid, UCR 21 Networks of Heterogeneous Processing Elements

Frank Vahid, UCR 22 Heterogeneous Networks – FPGA Usage

Frank Vahid, UCR 23 Network of PEs vs. GPU and PC
Speedup vs. real-time: PC(1): 0.76x; PC(4): 3.07x; GPU: 1.63x; HLS: 3.23x; General PE: 4.94x; Custom PE: 6.1x; Hetero PE: 34.5x

Frank Vahid, UCR 24 Network of general/custom/heterogeneous PEs vs. HLS (regularity extraction)
Heterogeneous PE compared with each alternative, as (speed, size): HLS (10x, 1.1x); general PE (7x, 0.85x); custom PE (6x, 1.35x)
[Chart axes] Performance (ms): time to emulate 1000 ms, using Euler with 0.01 ms step. Size (equivalent LUTs).

Frank Vahid, UCR 25 Speedup / dollar
CPU (I Intel X58 board): $480
GPU (GTX460 + I H55 board): $380
FPGA (Xilinx Virtex6 240T-2 board): $1800
Heterogeneous PEs: 3x better than PC(4), 4.5x better than GPU
FPGA: Easier to build custom interfaces
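
The 3x and 4.5x figures can be checked by dividing each platform's speedup (from the "Speedup vs real-time" slide) by its board price. The sketch assumes the $480 CPU price applies to the PC(4) configuration and the $1800 FPGA price to the heterogeneous PE network; the slide implies but does not state that pairing.

```python
# Speedups vs. real-time from the earlier slide; prices from this slide.
speedup = {"PC(4)": 3.07, "GPU": 1.63, "Hetero PE": 34.5}
price = {"PC(4)": 480, "GPU": 380, "Hetero PE": 1800}

per_dollar = {name: speedup[name] / price[name] for name in speedup}
vs_pc = per_dollar["Hetero PE"] / per_dollar["PC(4)"]
vs_gpu = per_dollar["Hetero PE"] / per_dollar["GPU"]
print(f"{vs_pc:.1f}x vs PC(4), {vs_gpu:.1f}x vs GPU")  # 3.0x vs PC(4), 4.5x vs GPU
```

So although the FPGA board costs several times more, its larger speedup still leaves it ahead per dollar.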

Frank Vahid, UCR 26 Current: Embedding-based placement of networks
Most physical models have a regular structure: meshes, trees, grids, etc. (examples shown: heart cells, lungs, neuron mesh).
We can apply theoretical graph embedding techniques to embed models into the FPGA with minimal network dilation.

Frank Vahid, UCR 27 Embedding-based placement of networks
Flow: physical model equations; map equations to virtual PEs (structured virtual PE graph); map virtual PEs to physical PEs via embedding; physical placement.
[Figure: three placements of the equation pairs EqP1/EqV1 through EqP7/EqV7: no placement strategy, simulated annealing placement, embedding placement]
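
In graph embedding, dilation is the largest distance in the host network between two nodes that are adjacent in the guest graph. The sketch below compares two placements of a seven-node binary tree on a 3x3 PE grid. The tree shape, the grid, and both placements are assumptions chosen for illustration; they are not read off the slide's figure.

```python
def dilation(edges, place):
    """Largest Manhattan distance on the PE grid between any two connected nodes."""
    return max(abs(place[u][0] - place[v][0]) + abs(place[u][1] - place[v][1])
               for u, v in edges)


# Seven-node binary tree, one node per equation pair (assumed structure).
edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7)]

# Hypothetical placements on a 3x3 grid, given as (row, column).
row_major = {k: ((k - 1) // 3, (k - 1) % 3) for k in range(1, 8)}
embedded = {1: (1, 1), 2: (1, 0), 3: (1, 2),
            4: (0, 0), 5: (2, 0), 6: (0, 2), 7: (2, 2)}

print(dilation(edges, row_major))  # 4
print(dilation(edges, embedded))   # 1
```

With dilation 1, every dependent pair sits on neighboring PEs, so no connection has to be routed across other PEs.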

Frank Vahid, UCR 28 Embedding-based placement of networks
Work submitted to FPGA'13 (Miller/Vahid/Givargis)
[Chart annotation: Not routable]

Frank Vahid, UCR 29 Other projects
Assistive monitoring
–..\Desktop\Fall montage.mp4
Web-based learning
–"Textbook is dead"
–pcpp.zyante.com (C++)
Embedded systems educ
–New prog. model, virtual lab
–Also riosscheduler.org
Drunk driving (DUI)
–..\Desktop\dui.MOV
–duicam.org

Frank Vahid, UCR 30 Contributors: Chen Huang (UC Riverside, now Amazon), Bailey Miller (UC Riverside), Prof. Tony Givargis (UC Irvine), Ting-Shuo Chou (UC Irvine), Others...
..\Desktop\Meti ER 2.mov
Fastest cost-effective execution of physical models
–Real-time (or faster) cyber-physical system testing
–Scientific research
–More apps

Frank Vahid, UCR 31 Key contributors Chen Huang (UC Riverside, now Amazon) Bailey Miller (UC Riverside) Prof. Tony Givargis (UC Irvine) Ting-Shuo Chou (UC Irvine) Others...