Methods for Evaluation of Embedded Systems Simon Künzli, Alex Maxiaguine Institute TIK, ETH Zurich.


1 Methods for Evaluation of Embedded Systems Simon Künzli, Alex Maxiaguine Institute TIK, ETH Zurich

2 System-Level Analysis. Applications (IP telephony, secure FTP, multimedia streaming, web browsing) mapped onto resources (RISC, DSP, LookUp, Cipher). Questions: How much memory? Which clock rate? What bus load? What packet delays? What resource utilization?

3 Problems for Performance Estimation. Example platform: RISC, DSP, SDRAM, arbiter. Distributed processing of applications on different resources; interaction of different applications on different resources; heterogeneity, HW-SW.

4 Complex run-time interdependencies (© Prof. Ernst, TU Braunschweig). Example platform: memories M1-M3, IP blocks IP1 and IP2, DSP, HW accelerator, CPU, sensor, connected by a communication network. Run-time dependencies of independent components via communication influence timing and power.

5 A "nice-to-have" performance model: measures what we want; high accuracy; high speed; full coverage; based on a unified formal specification model; composability & parameterization; reusable across different abstraction levels, or at least easy to refine.

6 Overview of Existing Approaches, positioned by speed vs. accuracy: Thiele, Ernst, Givargis, Lahiri, Benini, RTL, SPADE, Jerraya.

7 Discrete-event Simulation. System model: architecture and behavior; components/actors/processes; communication channels/signals. Event scheduler with an event queue (© The MathWorks) holding future events (e.g., signal changes) and the actions to be executed. Accuracy vs. speed: how many events are simulated?
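The scheduler-plus-event-queue structure described on this slide can be sketched in a few lines. This is a minimal illustration, not any particular simulator's API; the class and member names are made up for the example.

```cpp
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

// Minimal discrete-event scheduler: a time-ordered queue of pending
// events, each carrying an action to execute at its timestamp.
struct Event {
    double time;                   // simulated time of the event
    std::function<void()> action;  // e.g. a signal change to apply
};
struct Later {                     // order the queue by ascending time
    bool operator()(const Event& a, const Event& b) const {
        return a.time > b.time;
    }
};

class Scheduler {
    std::priority_queue<Event, std::vector<Event>, Later> queue_;
    double now_ = 0.0;
public:
    double now() const { return now_; }
    void schedule(double delay, std::function<void()> action) {
        queue_.push({now_ + delay, std::move(action)});
    }
    // Pop events in time order and execute their actions. Simulation
    // cost is driven by how many events are processed -- exactly the
    // accuracy-vs-speed trade-off the slide points out.
    int run() {
        int processed = 0;
        while (!queue_.empty()) {
            Event e = queue_.top();
            queue_.pop();
            now_ = e.time;
            e.action();
            ++processed;
        }
        return processed;
    }
};
```

An action may itself call `schedule()`, so processes can chain follow-up events, which is how signal changes propagate through a model.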

8 Discrete-event Simulation. "The design space": time resolution; modeling communication; modeling timing of data-dependent execution; …

9 Time Resolution. (Figure: a signal x(t) and its event times t1-t7, shown in continuous and in discrete time; accuracy increases toward continuous time.) Continuous time: e.g., gate-level simulation. Discrete time, or "cycle-accurate": e.g., Register Transfer Level (RTL) simulation, system-level performance analysis.

10 Modeling communication. Pin-level model: all signals (e.g., ready, d0-d2 between components C1 and C2) are modeled explicitly; often combined with RTL. Transaction-level model (TLM): protocol details are abstracted, e.g., burst-mode transfers; a transfer between C1 and C2 becomes a single transaction returning true/false. A TLM simulator of the AMBA bus is ~100x faster than a pin-level model (Caldari et al., Transaction-Level Models for AMBA Bus Architecture Using SystemC 2.0, DATE 2003).

11 Modeling timing of data-dependent execution. Problem: how to model the timing of data-dependent functionality inside a component? Possible solution: estimate delays and annotate them in the functional/behavioral model:

a = read(in);
if (a > b) { task1(); delay(d1); }
else       { task2(); delay(d2); }
write(out, c);

This approach works well for HW but may be too coarse for modeling SW.
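A runnable version of the slide's annotated fragment might look as follows. The delay values d1/d2 and the simulated-time counter are placeholders: in a real simulator, delay() would suspend the calling process (e.g., a SystemC wait), not just add to a counter.

```cpp
#include <cstdio>

// Simulated time, advanced by the annotated delays. A stand-in for a
// real simulation kernel's clock.
static double sim_time = 0.0;
static void delay(double d) { sim_time += d; }

// Data-dependent timing: which task runs -- and hence which annotated
// delay is charged -- depends on the value read at run time.
// task1()/task2() bodies are elided; only their timing matters here.
double run_component(int a, int b, double d1, double d2) {
    sim_time = 0.0;
    if (a > b) {
        /* task1(); */ delay(d1);   // estimated execution time of task1
    } else {
        /* task2(); */ delay(d2);   // estimated execution time of task2
    }
    /* write(out, c); */
    return sim_time;
}
```

The coarseness the slide warns about for SW is visible here: a single constant per branch cannot capture cache effects, preemption, or input-dependent loop counts.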

12 HW/SW Co-simulation Options. The application SW (1) is delay-annotated and natively executes on the workstation as part of the HW simulator; or (2) is compiled for the target processor, and its code is used as stimuli to a processor model that is part of the HW simulator; or (3) is not part of the HW simulator at all -- a complete separation of application and architecture models.

13 Processor Models: Simulation Environment. (Figure: the HW simulator models the rest of the system and contains a processor model -- RTL, microarchitecture simulator, or ISS -- behind a wrapper; the C/C++ application SW is compiled into program code executed by that model.)

14 Processor Models. RTL model: cycle-accurate or continuous time; all details are modeled (e.g., synthesizable). Microarchitecture simulator: cycle-accurate; models pipeline effects, etc.; can be generated automatically (e.g., Liberty, LISA, …). Instruction set simulator (ISS): provides instruction counts; functional models of instructions; e.g., SimpleScalar.

15 Multiprocessor System Simulator (© L. Benini, U Bologna): cycle-accurate ISSes embedded in a SystemC model via SystemC wrappers.

16 Comparison of HW/SW Co-simulation Techniques

simulator                               speed (instructions/sec)
continuous time (nanosecond-accurate)   1 - 100
cycle-accurate                          50 - 1,000
instruction level                       2,000 - 20,000

J. Rowson, Hardware/Software Co-Simulation, Proceedings of the 31st DAC, USA, 1994.

17 HW/SW Co-simulation Options. The application SW (1) is delay-annotated and natively executes on the workstation as part of the HW simulator; or (2) is compiled for the target processor, and its code is used as stimuli to a processor model that is part of the HW simulator; or (3) is not part of the HW simulator at all -- a complete separation of application and architecture models.

18 Independent Application and Architecture Models ("Separation of Concerns"). The application model defines the WORKLOAD, the architecture model (RISC, DSP, SRAM) provides the RESOURCES, and a mapping binds the two.

19 Co-simulation of Application and Architecture Models. Basic principle: the application (or functional) simulator drives the architecture (or hardware) simulator; the models interact via traces of actions; the traces are produced on-line or off-line. Advantages: system-level view; flexible choice of abstraction level; the models and the mapping can easily be altered.
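The "traces of actions" principle can be sketched concretely. The action alphabet (read/execute/write) loosely follows the SPADE style presented next, but the types, latency numbers, and mapping mechanism below are invented for illustration, not SPADE's actual interface.

```cpp
#include <vector>

// A trace entry is an abstract action emitted by the application
// simulator; the architecture simulator replays it with a cost that
// depends on the resource the emitting process is mapped to.
enum class Action { Read, Execute, Write };
struct TraceEntry {
    Action action;
    int units;   // amount of work, e.g. tokens read or cycles of computation
};

// Per-resource cost model (assumed numbers, per unit of work).
struct Resource {
    double read_latency, exec_latency, write_latency;
};

// Replay an (on-line or off-line) trace against one mapping choice and
// return the total simulated time on that resource.
double replay(const std::vector<TraceEntry>& trace, const Resource& r) {
    double t = 0.0;
    for (const auto& e : trace) {
        switch (e.action) {
            case Action::Read:    t += e.units * r.read_latency;  break;
            case Action::Execute: t += e.units * r.exec_latency;  break;
            case Action::Write:   t += e.units * r.write_latency; break;
        }
    }
    return t;
}
```

The advantage claimed on the slide is visible directly: exploring a different mapping means replaying the same trace against a different Resource, with the application model untouched.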

20 Trace-driven Simulation. SPADE: System-level Performance Analysis and Design space Exploration (© P. Lieverse et al., U Delft & Philips). An application model drives an architecture model.

21 Trace-driven Simulation (SPADE) (© Lieverse et al., U Delft & Philips).

22 Going away from discrete-event simulation… Analysis for Communication Systems, Lahiri et al., UC San Diego (© K. Lahiri, UCSD). A two-step approach: 1. simulation without communication (e.g., using an ISS); 2. analysis for different communication architectures.

23 Overview (© K. Lahiri, UCSD).

24 Analytical Methods for Power Estimation, Givargis et al., UC Riverside. Analytical models for the power consumption of caches and buses. A two-step approach for fast power evaluation: collect intermediate data using simulation; then use equations to rapidly predict power; coupled with a fast bus estimation approach.

25 Approach Overview (© Givargis, UC Riverside). Bus equation parameters: m items/second (denoting the traffic N on the bus), n bits/item, a k-bit-wide bus; bus-invert encoding; random-data assumption.
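The slide's closed-form bus equations are not reproduced in the transcript, but the quantity they estimate -- switching activity on a k-bit bus under bus-invert encoding and a random-data assumption -- can be obtained by directly simulating the coding rule (invert the word whenever more than half the lines would toggle, and signal this on an extra invert line). This sketch is a stand-in for, not a reproduction of, Givargis's equations.

```cpp
#include <cstdint>
#include <random>

// Count bit toggles over n random transfers on a k-bit bus, with and
// without bus-invert encoding. Switched capacitance -- and hence bus
// power -- is proportional to the toggle count.
static int popcount32(uint32_t x) {
    int c = 0;
    while (x) { x &= x - 1; ++c; }   // clear lowest set bit per iteration
    return c;
}

struct Toggles { long plain, coded; };

Toggles bus_toggles(int k, int n, uint32_t seed) {
    std::mt19937 rng(seed);          // deterministic for a fixed seed
    const uint32_t mask = (k == 32) ? 0xFFFFFFFFu : ((1u << k) - 1);
    uint32_t plain_bus = 0, coded_bus = 0;
    bool invert = false;             // state of the extra invert line
    Toggles t{0, 0};
    for (int i = 0; i < n; ++i) {
        uint32_t w = rng() & mask;                    // random data item
        t.plain += popcount32(w ^ plain_bus);         // unencoded toggles
        plain_bus = w;
        int h = popcount32(w ^ coded_bus);            // Hamming distance to bus
        bool inv = h > k / 2;                         // invert if > half would flip
        uint32_t out = inv ? (~w & mask) : w;
        t.coded += popcount32(out ^ coded_bus)        // data-line toggles
                 + (inv != invert);                   // + invert-line toggle
        coded_bus = out;
        invert = inv;
    }
    return t;
}
```

For random data on an 8-bit bus the unencoded average is k/2 = 4 toggles per transfer; bus-invert stays measurably below that even after paying for the extra invert line.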

26 Experiment Setup (© Givargis, UC Riverside). A C program feeds a trace generator; the trace drives a cache simulator (Dinero [Edler, Hill]) for I/D-cache power, a bus simulator for bus and memory power, and an ISS for performance and CPU power [Tiwari96].

27 Analytical Method. Event streams e1, e2 enter CPU 1 under scheduling discipline 1; streams e3, e4 enter CPU 2 under scheduling discipline 2. What is the workload, and what are the resulting output event streams?

28 Event Model Interface Classification (© Ernst, TU Braunschweig). Event models: periodic (period T); periodic with jitter (period T, jitter J); periodic with burst (period T, minimum inter-arrival distance t, burst length b); sporadic (minimum inter-arrival distance t). A periodic stream is the special case jitter J = 0 and burst length b = 1 of the richer models. Lossless EMIFs to a less expressive model: periodic → periodic with jitter (T = T, J = 0); periodic → periodic with burst (T = T, t = T, b = 1); periodic → sporadic (t = T); periodic with jitter → sporadic (t = T - J); periodic with burst → sporadic (t = t).

29 Example: EMIFs & EAFs. Event streams e1, e2 on CPU 1 (scheduling discipline 1); e3, e4 on CPU 2 (scheduling discipline 2). Where the event models between components do not match, an event model interface (EMIF) is needed; where no lossless interface exists, an event adaptation function (EAF) is needed. Use standard scheduling analysis for the single components.

30 Using EMIFs and EAFs (© Ernst, TU Braunschweig). Conversions among the sporadic, periodic-with-burst, periodic-with-jitter, and periodic models; an EAF requires a buffer and yields an upper bound only.

31 General Framework. A functional task model (tasks T1, T2, T3, functional units, event streams) is mapped onto an architecture model (ARM9, DSP, resource units) via mapping relations and load scenarios. These are abstracted into an abstract task model (abstract functional units, abstract event streams) and abstract components / run-time environment (abstract resource units, abstract load scenarios) on an abstract architecture.

32 Event & Resource Models. Use arrival curves α^u, α^l to capture event streams, and service curves to capture processing capacity. For a window of length Δ, α^u(Δ) and α^l(Δ) bound the number of packets that can arrive: in the example, Δ = 1: max 1 packet, min 0; Δ = 2: max 2 packets, min 0; Δ = 3: max 3 packets, min 1.

33 Analysis for a Single Component

34 Analysis – Bounds on Delay & Memory. Given arrival curves α^{u,l} and service curves β^{u,l}, the delay d is bounded by the maximum horizontal distance between the upper arrival curve α^u and the lower service curve β^l, and the backlog b (required memory) by their maximum vertical distance.
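These two bounds can be evaluated numerically for concrete curves. The example curves below are assumed for illustration (they are not from the talk): a periodic stream with period 2, so α^u(Δ) = ⌈Δ/2⌉, served by a rate-latency resource β^l(Δ) = max(0, Δ - 2); the suprema are approximated on a finite grid.

```cpp
#include <algorithm>
#include <cmath>

// Assumed example curves: periodic arrivals with period 2, and a
// rate-latency service curve with rate 1 and latency 2.
double alpha_u(double d) { return std::ceil(d / 2.0); }
double beta_l(double d)  { return std::max(0.0, d - 2.0); }

// Backlog bound: maximum vertical distance sup_D (alpha^u(D) - beta^l(D)),
// approximated over a finite horizon with a fixed grid step.
double backlog_bound(double horizon, double step) {
    double b = 0.0;
    for (double d = 0.0; d <= horizon; d += step)
        b = std::max(b, alpha_u(d) - beta_l(d));
    return b;
}

// Delay bound: maximum horizontal distance -- for each D, the smallest
// shift tau such that beta^l(D + tau) >= alpha^u(D).
double delay_bound(double horizon, double step) {
    double worst = 0.0;
    for (double d = 0.0; d <= horizon; d += step) {
        double tau = 0.0;
        while (beta_l(d + tau) < alpha_u(d)) tau += step;
        worst = std::max(worst, tau);
    }
    return worst;
}
```

For these curves the analytical suprema are a backlog of 2 packets and a delay of 3 time units (both approached as the window shrinks toward a burst instant), which the grid evaluation reproduces up to the step size.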

35 Comparison between Different Approaches. Simulation-based: can answer virtually any question about performance; can model arbitrarily complex systems; covers the average case (a single instance per run); time-consuming; accurate. Analytical methods: the questions they can answer are limited by the method; restricted by the underlying models; good coverage (worst case); fast; coarse.

36 Example: IBM Network Processor

37 Comparison RTC vs. Simulation

38 Experiment Results (© Givargis, UC Riverside). Diesel application's performance: the full-simulation result (blue) and the equation-based estimate (red) differ by 4%, with the equations 320x faster.

39 Concluding Remarks

40 Backup

41 Metropolis Framework (© Cadence Berkeley Lab & UC Berkeley).

