Fast Simulation Techniques for Design Space Exploration Daniel Knorreck, Ludovic Apvrille, Renaud Pacalet



slide 2 Outline
- DIPLODOCUS basics
- New simulation strategy
- Case study: MPEG decoder
- Conclusions and Future Work

slide 3 DIPLODOCUS basics

slide 4 DIPLODOCUS on one slide
- Platform for efficient Design Space Exploration of SoCs
- Clear separation between
  - Applications
  - Architecture
  - Mapping
- Data abstraction
- Control flow oriented
- Simulation and formal analysis on abstract models
Our environment is based on
- UML as modeling language
- LOTOS and UPPAAL for formal analysis
- SystemC/C++ for simulation

slide 5 Methodology
[Diagram: Application modeling, Architecture modeling, DSE mapping; Simulation and Static analysis appear at two points in the flow]

slide 6 Toolkit: TTool

slide 7 DIPLODOCUS: Task Diagram
- Declaration of a task
- Event. Used for inter-task signaling. The type may be an infinite FIFO or a finite FIFO. When a finite FIFO is full, the oldest event is erased. Events may carry values.
- Request. Used to spawn a task if an instance of that task is not currently executing.
- Channel. Channels do not convey values: they are meant to model a number of exchanged samples. cha1 = channel name. Three channel types:
  - BR-BW: Blocking Read - Blocking Write (= finite FIFO)
  - BR-NBW: Blocking Read - Non-Blocking Write (= infinite FIFO)
  - NBR-NBW: Non-Blocking Read - Non-Blocking Write (= shared memory)
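The finite-FIFO event semantics described on this slide can be illustrated with a minimal C++ sketch. The class name `FiniteEventFifo` and its methods are hypothetical and are not taken from TTool or the DIPLODOCUS simulator; events are assumed to carry a single integer value.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical illustration of a finite event FIFO: when the queue is full,
// the oldest pending event is erased to make room for the new one.
class FiniteEventFifo {
public:
    explicit FiniteEventFifo(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Non-blocking send. Returns true if an older event had to be erased.
    bool send(int value) {
        bool erased = false;
        if (events_.size() == capacity_) {
            events_.pop_front();  // drop the oldest event
            erased = true;
        }
        events_.push_back(value);
        return erased;
    }

    // Test whether an event may be received.
    bool canReceive() const { return !events_.empty(); }

    // Receive the oldest pending event; call canReceive() first.
    int receive() {
        assert(canReceive());
        int value = events_.front();
        events_.pop_front();
        return value;
    }

private:
    std::size_t capacity_;
    std::deque<int> events_;
};
```

With a capacity of 2, sending the values 1, 2, 3 leaves 2 and 3 in the queue: the third send erases the event carrying 1.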

slide 8 DIPLODOCUS: Application Modeling
- A behavior must be provided for each task
- UML activity diagram
- Usual control operators
  - Loops
  - Choices
- Channels
  - Write x samples on a channel
  - Read x samples from a channel
- Events
  - Send, receive an event
  - Test whether an event may be received
  - Select between events
- Requests
  - Send a request

slide 9 DIPLODOCUS: Task Behavior
[Annotated activity diagram. Annotations:]
- Sending of request req1 with "1" as natural parameter
- Loop
- Loop condition is true
- Loop condition is false
- Receiving of one data sample on channel cha1
- Sending of event done
- Models the execution of between 1 and 2 instructions on an integer unit. It has no meaning at the application modeling level.

slide 10 DIPLODOCUS: Mapping

slide 11 New Simulation Strategy

slide 12 Motivations for a new Simulator
Existing SystemC-based simulator:
- Relies on the freely available SystemC kernel
- One SystemC task per CPU, bus, and memory
- Simulation at cycle-accurate level
- And so... that simulator is quite slow
New simulator implemented in pure C++:
- No overhead due to the SystemC kernel
- Coarse-grained simulation strategy based on transactions comprising several cycles
- Simulation granularity is automatically adapted to the application

slide 13 Architecture of the Simulator
For the sake of comprehensibility, many sub-classes have been omitted and only inheritance relationships are shown.

slide 14 Transactions
- A transaction merges several clock cycles and contains penalties
- Important parameters:
  - virtualLength: number of virtual execution units
  - length: duration of the transaction in time units
  - runnableTime: time at which the transaction becomes runnable
  - startTime: time at which execution starts
  - penalties: task switching, branching, idle
- A transaction travels through the simulator: Command -> Channel -> CPU -> Bus -> Slave
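As a rough illustration, the transaction parameters listed on this slide could be grouped in a plain C++ struct. The field names follow the slide; the `SimTime` alias, the helpers `start` and `endTime`, and the assumption that `length` already includes the penalties are hypothetical and not confirmed by the slides.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using SimTime = std::uint64_t;  // hypothetical time type, in time units

struct Transaction {
    std::uint64_t virtualLength;  // number of virtual execution units
    SimTime length;               // duration in time units (assumed to include penalties)
    SimTime runnableTime;         // time at which the transaction becomes runnable
    SimTime startTime;            // time at which execution starts
    SimTime penalties;            // task switching, branching, idle (in time units)
};

// Assumption: a transaction starts as soon as it is runnable and the
// resource executing it (e.g. a CPU) is free.
inline void start(Transaction& t, SimTime resourceFreeAt) {
    assert(t.penalties <= t.length);
    t.startTime = std::max(t.runnableTime, resourceFreeAt);
}

// End time under the assumption that length covers the whole transaction.
inline SimTime endTime(const Transaction& t) {
    return t.startTime + t.length;
}
```

Because one transaction covers `length` time units in a single object, the simulator only has to handle it once, regardless of how many clock cycles it spans.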

slide 15 Basic Simulation Strategy in one slide
[Figure: schedule of transactions T11, T12, T21, T22 and T23 on CPU 1 and CPU 2; one step is labeled "activate"]

slide 16 Hierarchical scheduling process

slide 17 Simulation phases
Three phases are entered alternately:
- Preparation phase
  - Check whether the current command has been processed entirely; proceed to the next command if necessary
  - Create the next transaction; register the transaction at the channel if necessary
- Scheduling procedure
- Execution phase
  - Issue read/write operations on channels
  - Update the progress of the command
  - Add the transaction to the schedule
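The alternation of the three phases can be sketched as a small C++ loop. This is a simplified illustration under stated assumptions, not the simulator's actual code: there is a single hypothetical CPU, channels are left out, the scheduling procedure is plain round-robin, and `slice` (the maximum transaction length) as well as all type and function names are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Command { unsigned remaining; };  // virtual units still to execute

struct Task {
    std::vector<Command> commands;
    std::size_t current = 0;  // index of the current command
    bool finished() const { return current == commands.size(); }
};

struct Transaction { std::size_t task; unsigned length; unsigned startTime; };

// Runs all tasks on one CPU and returns the resulting schedule.
std::vector<Transaction> simulate(std::vector<Task> tasks, unsigned slice) {
    assert(slice > 0);
    std::vector<Transaction> schedule;
    unsigned now = 0;
    std::size_t next = 0;  // round-robin position
    for (;;) {
        // Preparation phase: move past commands that were processed entirely.
        for (Task& t : tasks)
            while (!t.finished() && t.commands[t.current].remaining == 0)
                ++t.current;

        // Scheduling procedure: pick the next unfinished task (round-robin).
        std::size_t chosen = tasks.size();
        for (std::size_t i = 0; i < tasks.size(); ++i) {
            std::size_t idx = (next + i) % tasks.size();
            if (!tasks[idx].finished()) { chosen = idx; break; }
        }
        if (chosen == tasks.size()) break;  // nothing left to run

        // Execution phase: update command progress, add transaction to schedule.
        Command& cmd = tasks[chosen].commands[tasks[chosen].current];
        unsigned len = std::min(cmd.remaining, slice);
        cmd.remaining -= len;
        schedule.push_back({chosen, len, now});
        now += len;
        next = chosen + 1;
    }
    return schedule;
}
```

For two tasks with one command each (5 and 3 virtual units) and `slice = 4`, the loop produces three transactions: task 0 for 4 units, task 1 for 3 units, then task 0 for the remaining unit.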

slide 18 Case study MPEG decoder

slide 19 DIPLODOCUS: System Level Design Space Exploration
[Figure: task diagram showing data dependencies and processing sequence]

slide 20 DIPLODOCUS: System Level Design Space Exploration
Task: Parser
[Figure labels: Sequence header; Picture, Slice, Macroblock header (to be refined); Launch processing; No. of coded blocks; Picture type decision; Picture format]

slide 21 Conclusions and Future Work

slide 22 Simulation Strategy: Summary
- Strength: simulation time increases with the number of transactions and NOT with the number of clock cycles
- Thus, in general, [...] and [...] take the same execution time.
- Scenario 1: Task 1 executes; after that, Task 2 executes a million times. Result: 1,000,001 transactions
- Scenario 2: same as before, but the tasks execute the read/write commands concurrently. Result: 2,000,000 transactions
  - Splitting the write transaction is necessary to leave the decision of which task executes next up to the CPU scheduler
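The two transaction counts can be reproduced with a short C++ sketch. It rests on one reading of the slide, which is an assumption: Task 1 issues a single large write, Task 2 issues one read per execution, and in the concurrent case the write is split once per read so that the CPU scheduler can decide after each piece. The function names are hypothetical.

```cpp
#include <cstdint>

// Scenario 1 (sequential): one write transaction, then one transaction per read.
constexpr std::uint64_t sequentialTransactions(std::uint64_t reads) {
    return 1 + reads;
}

// Scenario 2 (concurrent): the write is split into one piece per read,
// so each read is matched by one write transaction.
constexpr std::uint64_t concurrentTransactions(std::uint64_t reads) {
    return 2 * reads;
}

static_assert(sequentialTransactions(1000000) == 1000001, "scenario 1");
static_assert(concurrentTransactions(1000000) == 2000000, "scenario 2");
```

Since simulation time grows with the number of transactions, the concurrent scenario costs roughly twice as much simulation effort as the sequential one under this reading.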

slide 23 Conclusions
- Implementation of a simulation environment
  - Simulation granularity automatically adapts to the application model
  - Based on pure C++
  - Simulation speed-up of 6x to 30x or more, depending on the application granularity
- MPEG case study

slide 24 Future Work
- Extension of the simulation environment:
  - Refinement of bus and memory model
  - Refinement of the hardware accelerator component
  - MPEG case study, using meta-data to direct control flow
- Longer-term objectives:
  - Verification of functional requirements during simulation
  - Exploration of several branches of control flow, possibility to return to a previous system state
  - Technical improvements of the simulator

slide 25 Thank You! Questions?