Frank Vahid, UCR 1 Building Fake Body Parts: Digital Mockups Frank Vahid Univ. of California, Riverside Support provided by NSF, SRC, and CareFusion
Building fake body parts
How do we test medical equipment software?
Simulation: Slow/Inaccurate
Accurate simulation is slow: 2-3 minutes to simulate one breath accurately. Running in real time means decreasing accuracy.
Weibel lung model complexity:
–4 generations: 32 ODEs
–6 generations: 128 ODEs
–8 generations: 512 ODEs
–10 generations: 2048 ODEs
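The four ODE counts listed for the Weibel lung model double with every added generation, which is consistent with 2^(g+1) ODEs for g generations. The sketch below only reproduces that arithmetic. The function name weibel_ode_count is hypothetical, and the formula is an assumption fitted to the slide's four data points rather than something the slides state.

```c
#include <assert.h>

/* Hypothetical helper: ODE count for a Weibel lung model with the given
 * number of airway generations. The rule 2^(generations + 1) is an
 * assumption fitted to the slide's data points (4, 6, 8, 10 generations). */
unsigned long weibel_ode_count(unsigned generations)
{
    assert(generations < 31);          /* keep the shift well defined */
    return 1UL << (generations + 1);
}
```

Under this assumed rule, each extra pair of generations quadruples the number of equations, which is why accurate real-time simulation gets expensive quickly.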
Mockups
Physical mockup: the device (processing core, transducers) is connected to physical phenomena.
Digital mockup: the physical phenomena are disconnected; the device (processing core, transducers) is linked by digital communication to transducer models and an environment model, using intercepted transducer packets.
How to run in real time?
Video: be.com/watch?feature=player_embedded&v=rb0ik1HopBk
Physical models are inherently parallel
Figure: ODE dependency graph, with nodes labeled V[1],F[1]; V[2],F[2]; V[7],F[7].
GPUs
Tried, failed
–A GPU research group tried as well
–(Results later)
FPGAs: Sw circuits (parallel)
C code for FIR filter: for (i=0; i < 128; i++) y += c[i] * x[i];
–On a processor: 1000s of instructions, several thousand cycles
–As a circuit for the FIR filter on an FPGA: ~7 cycles (though with a slower clock)
Speedup > 10x-100x
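To make the processor-versus-circuit contrast concrete, here is a minimal C sketch of the 128-tap multiply-accumulate in two forms: the sequential loop from the slide, and a software imitation of a circuit that forms all products and then sums them with a binary adder tree. The adder-tree structure is an assumption; it is one plausible reading of the "~7 cycles" figure (a tree over 128 values has log2(128) = 7 levels), not something the slide states. The names fir_sequential and fir_adder_tree are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define FIR_TAPS 128   /* tap count taken from the slide's loop bound */

/* Sequential form: one multiply-accumulate per loop iteration, as a
 * processor would execute it. */
double fir_sequential(const double c[FIR_TAPS], const double x[FIR_TAPS])
{
    assert(c != NULL && x != NULL);
    double y = 0.0;
    for (size_t i = 0; i < FIR_TAPS; i++)
        y += c[i] * x[i];
    return y;
}

/* Circuit-like form (assumed structure): every product is formed
 * independently, then a binary adder tree sums them level by level.
 * If levels is not NULL, it receives the number of adder levels. */
double fir_adder_tree(const double c[FIR_TAPS], const double x[FIR_TAPS],
                      unsigned *levels)
{
    assert(c != NULL && x != NULL);
    double partial[FIR_TAPS];
    for (size_t i = 0; i < FIR_TAPS; i++)      /* parallel multipliers in hardware */
        partial[i] = c[i] * x[i];

    unsigned depth = 0;
    for (size_t width = FIR_TAPS; width > 1; width /= 2) {
        for (size_t i = 0; i < width / 2; i++) /* one level of adders */
            partial[i] = partial[2 * i] + partial[2 * i + 1];
        depth++;
    }
    if (levels != NULL)
        *levels = depth;
    return partial[0];
}
```

In software both functions still run sequentially; the second one only exposes the dependency structure that a circuit could evaluate one tree level per cycle.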
FPGAs “101” (A Quick Intro)
Figure labels: a 4x2 memory used as a lookup table (LUT) with inputs a, b and outputs F, G; a 2x2 switch matrix (SM); an FPGA built from LUTs and a switch matrix, with inputs a, b, c and outputs D, E.
Differential Equation Processing Element (DEPE): a general PE
The differential equations can't be solved exactly, so the PE uses an iterative approximation (Euler, RK4).
It computes the equation solutions at a given timestep (e.g. 0.1 ms timesteps).
Figure: FPGA digital mockup containing the DEPE, connected through an interface to the device under test.
Huang, Vahid, Givargis. A Custom FPGA Processor for Physical Model Ordinary Differential Equation Solving. Embedded Systems Letters, Dec.
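The two iterative methods named on this slide can be sketched in a few lines of C. This is a generic software illustration for a scalar ODE y' = f(t, y), not the DEPE's actual datapath. The test equation y' = -y, the function names, and the use of seconds as the time unit (so that 0.1 ms is h = 1e-4) are all assumptions introduced for the example.

```c
#include <assert.h>

/* Right-hand side of a scalar ODE y' = f(t, y). */
typedef double (*ode_rhs)(double t, double y);

/* One forward-Euler step of size h. */
double euler_step(ode_rhs f, double t, double y, double h)
{
    return y + h * f(t, y);
}

/* One classical fourth-order Runge-Kutta (RK4) step of size h. */
double rk4_step(ode_rhs f, double t, double y, double h)
{
    double k1 = f(t, y);
    double k2 = f(t + 0.5 * h, y + 0.5 * h * k1);
    double k3 = f(t + 0.5 * h, y + 0.5 * h * k2);
    double k4 = f(t + h, y + h * k3);
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4);
}

/* Illustrative test equation (not from the slides): y' = -y,
 * whose exact solution is y(t) = y(0) * exp(-t). */
double decay_rhs(double t, double y)
{
    (void)t;   /* this right-hand side does not depend on t */
    return -y;
}

/* Advances from t = 0 over `steps` steps of size h with the given stepper. */
double integrate(double (*step)(ode_rhs, double, double, double),
                 ode_rhs f, double y0, double h, unsigned steps)
{
    assert(h > 0.0);
    double t = 0.0, y = y0;
    for (unsigned i = 0; i < steps; i++) {
        y = step(f, t, y, h);
        t += h;
    }
    return y;
}
```

For example, integrate(rk4_step, decay_rhs, 1.0, 1e-4, 10000) advances the test equation by one second in 0.1 ms steps. RK4 evaluates the right-hand side four times per step but is far more accurate than Euler at the same step size, which is the usual trade-off between the two.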
Single DEPE
CPU(1),(4): Pentium IV, 3.0 GHz
DEPE: Xilinx Virtex6-240T
Microblaze: LUTs.
Homogeneous network of general PEs
Map ODEs to a homogeneous PE network: a synthesis tool takes the ODE dependency graph (V[1],F[1]; V[2],F[2]; V[7],F[7]), schedules the ODEs, and maps them onto the PEs (PE1, PE2, PE3; 100s of PEs) of the FPGA digital mockup, which connects to the device through an interface.
Huang, Vahid, Givargis. Synthesis of networks of custom processing elements for real-time physical system emulation. Transactions on Design Automation of Electronic Systems (TODAES). To appear (Dec. 2012).
Homogeneous network of general PEs: FPGA digital mockup
Homogeneous network of general PEs
ODE mapping via simulated annealing
Figures: mapping after 10K iterations and after 150K iterations.
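As a rough illustration of mapping ODEs to identical PEs with simulated annealing, here is a minimal C sketch. It is not the authors' tool: the cost function (cycle count of the busiest PE), the single-ODE rebinding move, the geometric cooling schedule, the starting temperature, and all names are assumptions chosen to keep the example small. It also ignores communication between PEs, which a real placement would have to account for. A small built-in generator and a hand-rolled approximation of exp(-x) keep the block free of library dependencies; rand() and exp() from the standard library would work as well.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ODES 256
#define MAX_PES  64

/* Tiny linear congruential generator so that runs are reproducible. */
static unsigned long lcg_state = 12345UL;
static unsigned long lcg_next(void)
{
    lcg_state = lcg_state * 1103515245UL + 12345UL;
    return (lcg_state >> 16) & 0x7fffUL;          /* 15 pseudo-random bits */
}

/* Approximates exp(-x) for x >= 0 as (1 - x/256)^256. */
static double exp_neg(double x)
{
    if (x >= 256.0)
        return 0.0;
    double r = 1.0 - x / 256.0;
    for (int i = 0; i < 8; i++)                   /* 8 squarings: power 256 */
        r *= r;
    return r;
}

/* Cost of a mapping (assumed metric): cycle count of the busiest PE. */
unsigned max_pe_cycles(const unsigned *ode_cycles, const unsigned *pe_of_ode,
                       size_t num_odes, size_t num_pes)
{
    unsigned load[MAX_PES] = {0};
    unsigned worst = 0;
    assert(num_pes > 0 && num_pes <= MAX_PES);
    for (size_t i = 0; i < num_odes; i++) {
        assert(pe_of_ode[i] < num_pes);
        load[pe_of_ode[i]] += ode_cycles[i];
    }
    for (size_t p = 0; p < num_pes; p++)
        if (load[p] > worst)
            worst = load[p];
    return worst;
}

/* Anneals pe_of_ode in place; returns the cost of the best mapping found. */
unsigned anneal_mapping(const unsigned *ode_cycles, unsigned *pe_of_ode,
                        size_t num_odes, size_t num_pes, unsigned iterations)
{
    unsigned best[MAX_ODES];
    assert(num_odes > 0 && num_odes <= MAX_ODES);

    unsigned cur_cost = max_pe_cycles(ode_cycles, pe_of_ode, num_odes, num_pes);
    unsigned best_cost = cur_cost;
    for (size_t i = 0; i < num_odes; i++)
        best[i] = pe_of_ode[i];

    double temperature = (double)cur_cost + 1.0;  /* assumed starting point */
    for (unsigned it = 0; it < iterations; it++) {
        size_t ode = (size_t)(lcg_next() % num_odes);   /* move: rebind one ODE */
        unsigned old_pe = pe_of_ode[ode];
        pe_of_ode[ode] = (unsigned)(lcg_next() % num_pes);

        unsigned new_cost = max_pe_cycles(ode_cycles, pe_of_ode, num_odes, num_pes);
        double delta = (double)new_cost - (double)cur_cost;
        double u = (double)lcg_next() / 32768.0;        /* uniform in [0, 1) */
        if (delta <= 0.0 || u < exp_neg(delta / temperature)) {
            cur_cost = new_cost;                        /* accept the move */
            if (cur_cost < best_cost) {
                best_cost = cur_cost;
                for (size_t i = 0; i < num_odes; i++)
                    best[i] = pe_of_ode[i];
            }
        } else {
            pe_of_ode[ode] = old_pe;                    /* reject: undo the move */
        }
        temperature *= 0.999;                           /* geometric cooling */
        if (temperature < 1e-3)
            temperature = 1e-3;
    }
    for (size_t i = 0; i < num_odes; i++)               /* hand back the best mapping */
        pe_of_ode[i] = best[i];
    return best_cost;
}
```

A caller fills ode_cycles with the per-ODE cycle estimates, starts from any valid pe_of_ode assignment (for example, everything on PE 0), and reads the improved assignment back from the same array. Occasionally accepting worse mappings while the temperature is high is what lets the search escape poor local arrangements.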
Homogeneous network of general PEs
Homogeneous network of general PEs – FPGA Usage
150K LUTs available on Virtex6-240T
Demo: utube.com/watch?v=ThUKVhqoA3Q
Custom Processing Element (custom PE)
Custom datapath to solve a specific type of equation; one custom PE for each ODE type.
ODE types shown: $V' = F_1 - F_2$ and $F' = P_1 - P_2 - (F \cdot C_R) \cdot C_L$
Figure: custom PE with a controller, constant ROM (address), data RAM (address, write enable), input select, and SUB and MUL units between inputs and output, inside the FPGA digital mockup with its interface.
Huang, Vahid, Givargis. Synthesis of networks of custom processing elements for real-time physical system emulation. Transactions on Design Automation of Electronic Systems (TODAES). To appear (Dec. 2012).
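The two ODE types on this slide need only subtractions and multiplications, which is why a small SUB/MUL datapath suffices. The C sketch below applies one forward-Euler update to such a pair. It is a software illustration under assumptions: the slides do not define the symbols, so the struct fields simply mirror the slide's names, the F inside the second equation is taken to be the element's own F, and the names branch_state and branch_euler_step are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* State and constants for one V/F equation pair. Field names mirror the
 * slide's symbols; their physical meaning is not given in the slides. */
typedef struct {
    double V;    /* integrated with V' = F1 - F2 */
    double F;    /* integrated with F' = P1 - P2 - (F * C_R) * C_L */
    double C_R;  /* constant from the slide's equation */
    double C_L;  /* constant from the slide's equation */
} branch_state;

/* One forward-Euler step of size h, given the neighbouring values
 * F1, F2, P1, P2 for this timestep. */
void branch_euler_step(branch_state *s, double F1, double F2,
                       double P1, double P2, double h)
{
    assert(s != NULL && h > 0.0);
    double dV = F1 - F2;                              /* one SUB */
    double dF = P1 - P2 - (s->F * s->C_R) * s->C_L;   /* SUBs and MULs only */
    s->V += h * dV;
    s->F += h * dF;
}
```

Both derivatives are computed from the old state before either field is updated, which matches the explicit Euler method.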
Custom Processing Element
Custom Processing Element – FPGA Usage
Networks of Heterogeneous Processing Elements
General PE:
–Slow, flexible (can solve any type of ODE)
Custom PE:
–Fast, inflexible (solves only one type of ODE)
Multi-type PE:
–Combines multiple types of ODEs into a single custom PE
Huge solution space: Which types of PEs to choose? How many PEs to allocate? How to bind ODEs to PEs?
Huang, Miller, Vahid, Givargis. Synthesis of Heterogeneous Processing Elements for Physical System Emulation. CODES+ISSS 2012, Oct.
Automatic allocation and binding
Flowchart (simulated annealing): initial random allocation → PE allocator → new PE allocation → ODE-to-PE mapper → cycles of each PE → better solution? (Y/N) → best solution
Networks of Heterogeneous Processing Elements
Heterogeneous Networks – FPGA Usage
Network of PEs vs. GPU and PC
Speedup vs. real time:
–PC(1): 0.76x
–PC(4): 3.07x
–GPU: 1.63x
–HLS: 3.23x
–General PE: 4.94x
–Custom PE: 6.1x
–Hetero PE: 34.5x
Network of general/custom/heterogeneous PEs vs. HLS (regularity extraction)
Heterogeneous PE relative to each alternative, as (speed, size):
–vs. HLS: (10x, 1.1x)
–vs. general PE: (7x, 0.85x)
–vs. custom PE: (6x, 1.35x)
Performance (ms): time to emulate 1000 ms, using Euler with a 0.01 ms step. Size: equivalent LUTs.
Speedup / dollar
Platform costs:
–CPU (I Intel X58 board): $480
–GPU (GTX460 + I H55 board): $380
–FPGA (Xilinx Virtex6 240T-2 board): $1800
Heterogeneous PEs: 3x better than PC(4), 4.5x better than GPU
FPGA: easier to build custom interfaces
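The 3x and 4.5x figures can be checked with a few lines of arithmetic. The sketch below assumes the speedups are the ones reported on the "Network of PEs VS GPU and PC" slide (34.5x for heterogeneous PEs, 3.07x for PC(4), 1.63x for the GPU) and that speedup per dollar is simply speedup divided by platform cost; the function names are hypothetical.

```c
#include <assert.h>

/* Speedup (vs. real time) obtained per dollar of hardware. */
double speedup_per_dollar(double speedup, double cost_usd)
{
    assert(cost_usd > 0.0);
    return speedup / cost_usd;
}

/* How many times better platform A is than platform B in speedup per dollar. */
double cost_effectiveness_ratio(double speedup_a, double cost_a,
                                double speedup_b, double cost_b)
{
    return speedup_per_dollar(speedup_a, cost_a) /
           speedup_per_dollar(speedup_b, cost_b);
}
```

With those inputs, cost_effectiveness_ratio(34.5, 1800.0, 3.07, 480.0) is about 3.0 and cost_effectiveness_ratio(34.5, 1800.0, 1.63, 380.0) is about 4.5, matching the slide even though the FPGA board is the most expensive platform.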
Current: Embedding-based placement of networks
Most physical models have a regular structure: meshes, trees, grids, etc. (examples shown: heart cells, lungs, neuron mesh).
We can apply theoretical graph embedding techniques to embed models into the FPGA with minimal network dilation.
Embedding-based placement of networks
Flow: physical model equations → map equations to virtual PEs → structured virtual PE graph → map virtual PEs to physical PEs via embedding → physical placement
Figure: placement of the equation pairs (EqP1, EqV1) through (EqP7, EqV7) under three strategies: no placement strategy, simulated annealing placement, embedding placement.
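Dilation, the quantity the embedding tries to minimize, is the largest distance in the host network between two nodes that are neighbours in the model graph. A minimal C sketch of that measurement follows. It assumes the physical PEs sit on a 2D grid and that distance is Manhattan distance; the slides do not specify the FPGA-side topology, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { int row, col; } grid_pos;    /* physical PE location (assumed 2D grid) */
typedef struct { size_t a, b; } graph_edge;   /* edge between virtual PEs a and b */

/* Dilation of a placement: the longest Manhattan distance between the
 * physical positions of any two virtual PEs joined by an edge. */
int embedding_dilation(const graph_edge *edges, size_t num_edges,
                       const grid_pos *placement, size_t num_nodes)
{
    int worst = 0;
    for (size_t e = 0; e < num_edges; e++) {
        assert(edges[e].a < num_nodes && edges[e].b < num_nodes);
        grid_pos p = placement[edges[e].a];
        grid_pos q = placement[edges[e].b];
        int d = abs(p.row - q.row) + abs(p.col - q.col);
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

As an illustration, a 7-node binary tree placed in index order along a single row has dilation 4, while placing the root in the middle of a 3x3 grid with its children beside it and the leaves in the corners gives dilation 1. Lower dilation means shorter wires between communicating PEs.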
Embedding-based placement of networks
Work submitted to FPGA'13 (Miller/Vahid/Givargis)
Figure annotation: Not routable
Other projects
Assistive monitoring
–..\Desktop\Fall montage.mp4
Web-based learning
–"Textbook is dead"
–pcpp.zyante.com (C++)
Embedded systems education
–New programming model, virtual lab
–Also riosscheduler.org
Drunk driving (DUI)
–..\Desktop\dui.MOV
–duicam.org
Contributors
–Chen Huang (UC Riverside, now Amazon)
–Bailey Miller (UC Riverside)
–Prof. Tony Givargis (UC Irvine)
–Ting-Shuo Chou (UC Irvine)
–Others...
Video: ..\Desktop\Meti ER 2.mov
Fastest cost-effective execution of physical models:
–Real-time (or faster) cyber-physical system testing
–Scientific research
–More apps
Key contributors
–Chen Huang (UC Riverside, now Amazon)
–Bailey Miller (UC Riverside)
–Prof. Tony Givargis (UC Irvine)
–Ting-Shuo Chou (UC Irvine)
–Others...