1
Methods for Evaluation of Embedded Systems
Simon Künzli, Alex Maxiaguine
Institute TIK, ETH Zurich
2
System-Level Analysis
(Figure: applications such as IP telephony, secure FTP, multimedia streaming and web browsing mapped onto an architecture with RISC, DSP, LookUp and Cipher blocks)
Memory? Clock Rate? Bus Load? Packet Delays? Resource Utilization?
3
Problems for Performance Estimation
(Figure: architecture with RISC, DSP, SDRAM and a bus arbiter)
- Distributed processing of applications on different resources
- Interaction of different applications on different resources
- Heterogeneity, HW-SW
4
Complex run-time interdependencies (Prof. Ernst, TU Braunschweig)
(Figure: system with CPU, DSP, communication network, hardware blocks IP1 and IP2, memories M1-M3 and a sensor)
- run-time dependencies of independent components via communication
- influence on timing and power
5
A “nice-to-have” performance model
- measuring what we want
- high accuracy
- high speed
- full coverage
- based on a unified formal specification model
- composability & parameterization
- reusable across different abstraction levels
- at least easy to refine
6
Overview of Existing Approaches
(Figure: existing approaches plotted against speed and accuracy: Thiele, Ernst, Givargis, Lahiri, Benini, RTL, SPADE, Jerraya)
7
Discrete-event Simulation
System model: architecture and behavior
- components/actors/processes
- communication channels/signals
Event scheduler with event queue (figure © The MathWorks):
- future events (e.g. signal changes)
- actions to be executed
Accuracy vs. speed: how many events are simulated?
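To make the event-scheduler idea concrete, here is a minimal discrete-event loop in plain C++ (a sketch written for this summary, not code from any of the cited tools; the Event/Later types and the event names are invented). Future events sit in a priority queue ordered by timestamp; the simulator repeatedly pops the earliest one, advances simulated time and executes its action. The fewer events a model generates, the faster but coarser the simulation.

    // Minimal discrete-event scheduler sketch: events are (time, action) pairs
    // kept in a priority queue ordered by simulated time.
    #include <cstdio>
    #include <functional>
    #include <queue>
    #include <vector>

    struct Event {
        double time;                   // simulated time at which the event fires
        std::function<void()> action;  // action to execute (e.g. a signal change)
    };

    struct Later {                     // orders the queue so the earliest event is on top
        bool operator()(const Event& a, const Event& b) const { return a.time > b.time; }
    };

    int main() {
        std::priority_queue<Event, std::vector<Event>, Later> queue;  // future events
        double now = 0.0;

        // Schedule a few events out of order; the queue delivers them in time order.
        queue.push({5.0, [] { std::puts("t=5: bus transfer completes"); }});
        queue.push({1.0, [] { std::puts("t=1: task released on DSP"); }});
        queue.push({3.0, [] { std::puts("t=3: interrupt raised"); }});

        while (!queue.empty()) {       // main simulation loop
            Event e = queue.top(); queue.pop();
            now = e.time;              // advance simulated time to the event
            e.action();                // execute the associated action
        }
        std::printf("simulation finished at t=%.1f\n", now);
        return 0;
    }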
8
Discrete-event Simulation
“The design space”:
- time resolution
- modeling communication
- modeling timing of data-dependent execution
- …
9
Time Resolution
(Figure: a signal x(t) and event times t1…t7 shown once in continuous time and once in discrete time; accuracy decreases as the time resolution gets coarser)
- Continuous time: e.g. gate-level simulation
- Discrete time or “cycle-accurate”: e.g. Register Transfer Level (RTL) simulation
- Even coarser resolution: system-level performance analysis
10
Modeling communication
Pin-level model:
- all signals are modeled explicitly
- often combined with RTL
Transaction-level model (TLM):
- protocol details are abstracted, e.g. burst-mode transfers
- a TLM simulator of the AMBA bus is about 100x faster than a pin-level model (Caldari et al., Transaction-Level Models for AMBA Bus Architecture Using SystemC 2.0, DATE 2003)
(Figure: components C1 and C2 connected pin by pin (ready, d0, d1, d2) vs. via a single transaction returning true/false)
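The gap between the two abstraction levels can be sketched in a few lines of plain C++ (illustrative only; this is not the SystemC/TLM-2.0 API, and PinLevelBus/TlmBus are invented names): at pin level every signal of the handshake is driven cycle by cycle, while at transaction level a single function call per transfer hides the protocol. The roughly 100x speedup reported by Caldari et al. comes from replacing many per-cycle signal updates with one such call per transfer.

    // Sketch contrasting pin-level and transaction-level communication models.
    #include <array>
    #include <cstdio>

    // Pin-level view: every signal of the handshake is driven explicitly, cycle by cycle.
    struct PinLevelBus {
        bool ready = false;
        std::array<bool, 8> data{};      // individual data wires
        void drive_cycle(bool rdy, std::array<bool, 8> d) { ready = rdy; data = d; }
    };

    // Transaction-level view: one call per transfer, protocol details abstracted away.
    struct TlmBus {
        bool write(unsigned addr, unsigned value) {   // returns success/failure
            std::printf("transaction: write 0x%x to 0x%x\n", value, addr);
            return true;
        }
    };

    int main() {
        PinLevelBus pins;   // many drive_cycle() calls are needed per transfer
        pins.drive_cycle(true, {true, false, true, false, false, false, false, false});

        TlmBus bus;         // one call models the whole transfer
        bool ok = bus.write(0x1000, 0xA5);
        std::printf("TLM write %s\n", ok ? "succeeded" : "failed");
        return 0;
    }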
11
Modeling timing of data-dependent execution
Problem: how to model the timing of data-dependent functionality inside a component?
Possible solution: estimate and annotate delays in the functional/behavioral model.
(Figure: control flow with a = read(in), a branch on a > b into task1 or task2 with annotated delays d1/d2, then write(out, c))

    a = read(in);
    if (a > b) { task1(); delay(d1); }
    else       { task2(); delay(d2); }
    write(out, c);

This approach works well for HW but may be too coarse for modeling SW.
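A self-contained version of the annotated fragment above might look as follows (plain C++ sketch; delay() simply advances a simulated clock, and the d1/d2 values are invented estimates). The functional behavior is unchanged; each branch just charges its estimated execution time to the clock, which is exactly why the approach gets coarse when software timing depends on much more than the taken branch.

    // Delay-annotated behavioral model: delay() advances a simulated clock.
    #include <cstdio>

    static double sim_time = 0.0;                   // simulated time in cycles
    static void delay(double d) { sim_time += d; }  // annotated delay

    static void task1() { /* behavior of the first branch */ }
    static void task2() { /* behavior of the second branch */ }

    // One activation of the component, with per-branch delay annotations.
    static void component(int a, int b) {
        const double d1 = 20.0, d2 = 35.0;          // estimated delays for each branch
        if (a > b) { task1(); delay(d1); }
        else       { task2(); delay(d2); }
    }

    int main() {
        component(7, 3);                            // takes the task1 branch
        component(1, 3);                            // takes the task2 branch
        std::printf("estimated execution time: %.0f cycles\n", sim_time);  // prints 55
        return 0;
    }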
12
HW/SW Co-simulation Options
Application SW...
- … is delay-annotated and natively executes on the workstation as part of the HW simulator
- … is compiled for the target processor and its code is used as stimuli to a processor model that is part of the HW simulator
- … is not part of the HW simulator -- a complete separation of Application and Architecture models
13
Processor Models: Simulation Environment
(Figure: application SW in C/C++ is compiled into program code that drives a processor model -- RTL, microarchitecture simulator, or ISS -- which is embedded via a wrapper into the HW simulator of the rest of the system)
14
Processor Models
RTL model:
- cycle-accurate or continuous time
- all the details are modeled (e.g. synthesizable)
Microarchitecture simulator:
- cycle-accurate model
- models pipeline effects, etc.
- can be generated automatically (e.g. Liberty, LISA, …)
Instruction Set Simulator (ISS):
- provides instruction count
- functional models of instructions
- e.g. SimpleScalar
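To illustrate the cheapest option in this list, an instruction set simulator boils down to a fetch-decode-execute loop over functional instruction models plus an instruction counter. The toy ISS below is a plain C++ sketch (not SimpleScalar; the three-instruction ISA is invented) and shows why such a model yields instruction counts but no pipeline or cycle-level timing.

    // Toy ISS: functional instruction models and an instruction count, nothing more.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    enum class Op { LOADI, ADD, HALT };
    struct Insn { Op op; int rd, rs1, rs2, imm; };

    int main() {
        std::vector<Insn> program = {        // tiny program: r2 = 5 + 7
            {Op::LOADI, 0, 0, 0, 5},         // r0 = 5
            {Op::LOADI, 1, 0, 0, 7},         // r1 = 7
            {Op::ADD,   2, 0, 1, 0},         // r2 = r0 + r1
            {Op::HALT,  0, 0, 0, 0},
        };
        std::int64_t reg[4] = {0};
        std::uint64_t pc = 0, icount = 0;

        for (bool running = true; running; ++icount) {   // fetch-decode-execute loop
            const Insn& i = program[pc++];
            switch (i.op) {
                case Op::LOADI: reg[i.rd] = i.imm;                   break;
                case Op::ADD:   reg[i.rd] = reg[i.rs1] + reg[i.rs2]; break;
                case Op::HALT:  running = false;                     break;
            }
        }
        std::printf("r2 = %lld, instructions executed = %llu\n",
                    (long long)reg[2], (unsigned long long)icount);
        return 0;
    }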
15
Multiprocessor System Simulator (L. Benini, U Bologna)
(Figure: SystemC model of the system in which each cycle-accurate ISS is integrated through a SystemC wrapper)
16
Comparison of HW/SW Co-simulation techniques

    simulator                               speed (instructions/sec)
    continuous time (nanosecond-accurate)   1 – 100
    cycle-accurate                          50 – 1,000
    instruction level                       2,000 – 20,000

J. Rowson, Hardware/Software Co-Simulation, Proceedings of the 31st DAC, USA, 1994
17
HW/SW Co-simulation Options
Application SW...
- … is delay-annotated and natively executes on the workstation as part of the HW simulator
- … is compiled for the target processor and its code is used as stimuli to a processor model that is part of the HW simulator
- … is not part of the HW simulator -- a complete separation of Application and Architecture models
18
Independent Application and Architecture Models (“Separation of Concerns”)
(Figure: an application model (WORKLOAD) is mapped onto an architecture model (RESOURCES) containing RISC, DSP and SRAM)
19
Co-simulation of Application and Architecture Models
Basic principle:
- the application (or functional) simulator drives the architecture (or hardware) simulator
- the models interact via traces of actions
- the traces are produced on-line or off-line
Advantages:
- system-level view
- flexible choice of abstraction level
- the models and the mapping can be easily altered
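A minimal sketch of this principle in plain C++ (illustrative only, not SPADE's actual interfaces; the Action and Architecture types are invented): the application model runs purely functionally and emits a trace of abstract actions, and the architecture model replays that trace, charging a latency per action according to the mapping. Changing the mapping or the architecture parameters only changes the replay step, not the functional run.

    // Trace-driven co-simulation sketch: the functional run produces a trace,
    // the architecture model replays it and accumulates latencies.
    #include <cstdio>
    #include <map>
    #include <vector>

    enum class Action { Read, Execute, Write };

    // Application (functional) model: produces a trace, knows nothing about timing.
    std::vector<Action> run_application() {
        return {Action::Read, Action::Execute, Action::Write,
                Action::Read, Action::Execute, Action::Write};
    }

    // Architecture model: maps each action to a latency on a given resource.
    struct Architecture {
        std::map<Action, double> latency;     // cycles per action on this resource
        double replay(const std::vector<Action>& trace) const {
            double t = 0.0;
            for (Action a : trace) t += latency.at(a);
            return t;
        }
    };

    int main() {
        std::vector<Action> trace = run_application();   // step 1: functional run
        Architecture risc{{{Action::Read, 10}, {Action::Execute, 50}, {Action::Write, 10}}};
        Architecture dsp {{{Action::Read, 10}, {Action::Execute, 20}, {Action::Write, 10}}};
        std::printf("same trace on RISC: %.0f cycles, on DSP: %.0f cycles\n",
                    risc.replay(trace), dsp.replay(trace));  // step 2: timed replay
        return 0;
    }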
20
Trace-driven Simulation
SPADE: System-level Performance Analysis and Design space Exploration (P. Lieverse et al., U Delft & Philips)
(Figure: application model driving the architecture model via traces)
21
Trace-driven Simulation (SPADE) Lieverse et al., U Delft & Philips
22
Going away from discrete-event simulation…
Analysis for Communication Systems (Lahiri et al., UC San Diego)
A two-step approach:
1. simulation without communication (e.g. using an ISS)
2. analysis for different communication architectures
23
Overview K. Lahiri, UCSD
24
Analytical Methods for Power Estimation (Givargis et al., UC Riverside)
Analytical models for power consumption of:
- caches
- buses
Two-step approach for fast power evaluation:
- collect intermediate data using simulation
- use equations to rapidly predict power
- couple with a fast bus estimation approach
25
Approach Overview (Givargis, UC Riverside)
Bus equation parameters:
- m items/second (denotes the traffic N on the bus)
- n bits/item
- k-bit-wide bus
- bus-invert encoding
- random data assumption
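The slide lists only the model parameters, not the equation itself. As a hedged reconstruction of the kind of closed-form such parameters feed (the exact formulation and coefficients in Givargis et al. may differ), the bus traffic and switching activity could be written as

    % sketch only, not the exact equations of Givargis et al.
    N = m \cdot \lceil n / k \rceil                       \quad \text{(bus transfers per second)}
    \mathbb{E}[\text{toggles per transfer}] \approx k/2   \quad \text{(random data, no encoding)}
    P_{\text{bus}} \propto N \cdot \mathbb{E}[\text{toggles per transfer}] \cdot C_{\text{wire}} \cdot V_{dd}^{2}

Bus-invert encoding caps the worst-case toggles at roughly half the bus width, at the cost of one extra invert line, which is what makes it attractive under the random-data assumption.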
26
Experiment Setup (Givargis, UC Riverside)
(Figure: a C program feeds a trace generator; the trace drives a cache simulator (Dinero [Edler, Hill]), a bus simulator and an ISS, yielding performance plus CPU power [Tiwari96], memory power, bus power and I/D-cache power)
27
Analytical Method
(Figure: event streams e1, e2 processed on CPU 1 under scheduling discipline 1 and e3, e4 on CPU 2 under scheduling discipline 2; the workload and the resulting behavior are marked with question marks)
28
Event Model Interface Classification (Ernst, TU Braunschweig)
(Figure: four event models -- periodic (period T), periodic with jitter (T, J), periodic with burst (T, t, b) and sporadic (minimum distance t) -- connected by lossless EMIFs and by EMIFs to less expressive models, with parameter mappings such as T=T, J=0 (periodic to periodic with jitter), T=T, t=T, b=1 (periodic to periodic with burst), jitter = 0, burst length (b) = 1, t = T - J, t = T and t = t)
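Two of the parameter mappings named on the slide (periodic to periodic with jitter via T=T, J=0, and periodic to periodic with burst via T=T, t=T, b=1) can be written down directly. The sketch below is plain C++ for illustration only, not the SymTA/S implementation, and covers just these two lossless conversions.

    // Two lossless event-model conversions (EMIFs) from the classification above.
    #include <cstdio>

    struct Periodic       { double T; };            // period
    struct PeriodicJitter { double T, J; };         // period, jitter
    struct PeriodicBurst  { double T, t; int b; };  // period, min. distance, burst length

    // A strictly periodic stream is a jittery stream with J = 0.
    PeriodicJitter to_jitter(const Periodic& p) { return {p.T, 0.0}; }

    // A strictly periodic stream is a bursty stream with t = T and b = 1.
    PeriodicBurst to_burst(const Periodic& p) { return {p.T, p.T, 1}; }

    int main() {
        Periodic p{10.0};
        PeriodicJitter pj = to_jitter(p);
        PeriodicBurst  pb = to_burst(p);
        std::printf("periodic T=%.1f -> jitter model (T=%.1f, J=%.1f)\n", p.T, pj.T, pj.J);
        std::printf("periodic T=%.1f -> burst model  (T=%.1f, t=%.1f, b=%d)\n", p.T, pb.T, pb.t, pb.b);
        return 0;
    }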
29
Example: EMIFs & EAFs
(Figure: the two-CPU example again, with EMIF and EAF blocks inserted between the components)
- Event model interface (EMIF) needed
- Event adaptation function (EAF) needed
- Use standard scheduling analysis for single components.
30
Using EMIFs and EAFs (Ernst, TU Braunschweig)
(Figure: conversions between the Sporadic, Periodic with Burst, Periodic with Jitter and Periodic event models; some require an EAF with a buffer, some provide an upper bound only)
31
General Framework
(Figure: a functional task model (tasks T1, T2, T3) with functional units, event streams and load scenarios is abstracted into an abstract task model with abstract functional units, abstract event streams and abstract load scenarios; an architecture model (e.g. ARM9, DSP) with resource units is abstracted into abstract components (run-time environment) forming the abstract architecture; mapping relations connect the two)
32
Event & Resource Models
- use arrival curves to capture event streams
- use service curves to capture processing capacity
(Figure: upper and lower arrival curves α^u, α^l over time intervals, e.g. at most 1 / at least 0 packets in any interval of length 1, at most 2 / at least 0 in length 2, at most 3 / at least 1 in length 3)
33
Analysis for a Single Component
34
Analysis – Bounds on Delay & Memory
(Figure: upper arrival curve α^u and lower service curve β^l; the maximum horizontal distance between them bounds the delay d, and the maximum vertical distance bounds the backlog b, i.e. the required memory)
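In real-time calculus these two distances are usually written as follows (the standard formulation, added here for completeness; the slide itself only shows the picture), with α^u the upper arrival curve and β^l the lower service curve:

    d \;\le\; \sup_{\Delta \ge 0} \; \inf\{\, \tau \ge 0 \;:\; \alpha^{u}(\Delta) \le \beta^{l}(\Delta + \tau) \,\}
    b \;\le\; \sup_{\Delta \ge 0} \, \{\, \alpha^{u}(\Delta) - \beta^{l}(\Delta) \,\}

The delay bound d is the maximum horizontal distance between the two curves, and the backlog bound b, i.e. the memory requirement, is the maximum vertical distance.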
35
Comparison between different Approaches
Simulation-based:
- can answer virtually any question about performance
- can model arbitrarily complex systems
- average case (single instance)
- time-consuming
- accurate
Analytical methods:
- possibilities to answer questions limited by the method
- restricted by the underlying models
- good coverage (worst case)
- fast
- coarse
36
Example: IBM Network Processor
37
Comparison RTC vs. Simulation
38
Experiment Results (Givargis, UC Riverside)
(Figure: performance of the Diesel application; the blue curve is obtained using full simulation, the red curve using our equations -- 4% error, 320x faster)
39
Concluding Remarks
40
Backup
41
Metropolis Framework (Cadence Berkeley Lab & UC Berkeley)