Mixed-Critical Systems Design with Coarse-grained Multi-core Interference
Rany Kahil, Peter Poplavko, Saddek Bensalem, Marius Bozga

Motivation

Autonomous systems need to adapt to unpredictable situations:
- adaptive resource management => give resources to critical functions in an emergency
- mixed criticality: switch to emergency mode even in the middle of a schedule

Multi-core systems:
- we consider compute-intensive and hence multi-core applications
- there are no clear "winners" / standards for scheduling on multi-cores
- need for a high-level concurrency language for programming the middleware
- interference: parasitic influence between cores sharing resources
- task dependencies are harder to handle on multi-cores

=> Need for mixed-criticality-, interference- and task-dependency-aware resource management, expressed in a high-level concurrency language

Plan

Our approach to three dimensions of the problem:
- Resource management in a concurrency programming language – BIP
- Timing-critical system design with task dependencies – MoC
- Controlling multi-core interference at the task level – interference model

Design flow
Conclusions and Future Work


SW Model Compiled to a "Concurrency Language": The BIP Framework

Figure: the SW model combines automata and functional code. A Task Controller (T) automaton has ports SQR_Start, SQR_Finish, Arrive and Deadline, a state S1, a discrete transition with the timing action "reset x", and a continuous transition guarded by the timing condition "when [x = T]" (T is the period); multi-port connectors link these ports to the provided and required interfaces of the functional code.

Functional code fragment shown on the slide:
void SQR_Init() { index = 0; }
void SQR_Execute() { XIF_Read(&x, &x_valid); if (x_valid) { y = x * x; YIF_Write(&y); index = index + 1; } }

Legend: provided interface, required interfaces, period T, discrete transition, continuous transition, port, timing condition (when [x = T]), data condition ([valid]), timing action (reset x), data action (y := x*x), multi-port connector, state.
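For reference, here is a self-contained version of the functional code fragment above. The types of x and y and the exact signatures of the XIF_Read / YIF_Write interface functions are not given on the slide and are assumptions made for illustration.

    /* Functional code of the SQR process, cleaned up from the slide fragment.
       XIF_Read / YIF_Write stand for the required/provided data interfaces;
       their signatures and the use of double are assumptions. */
    extern void XIF_Read(double *x, int *x_valid);   /* required interface */
    extern void YIF_Write(const double *y);          /* provided interface */

    static int index;        /* number of samples processed so far */
    static double x, y;      /* current input sample and its square */
    static int x_valid;      /* data condition: a fresh sample is available */

    void SQR_Init(void)
    {
        index = 0;
    }

    void SQR_Execute(void)   /* invoked by the task controller every period T */
    {
        XIF_Read(&x, &x_valid);
        if (x_valid) {               /* [valid] data condition */
            y = x * x;               /* data action: y := x*x  */
            YIF_Write(&y);
            index = index + 1;
        }
    }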

BIP Representation of the System

Figure: a node of the distributed system contains tasks T1, T2, T3, each wrapped by a task controller (TC1, TC2, TC3) and coordinated by a Resource Manager and a Dependency Pattern (MoC) component; input and output queues connect the node over the network to a remote node running tasks T1 and T2.

Mixed-Criticality Resource Management

Figure: two utilization charts (processor cores and shared resources), one for Normal Mode and one for Emergency Mode; on a mode switch, utilization shifts from the LO tasks to the HI tasks.
HI – high-criticality tasks; LO – low-criticality tasks.
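A minimal sketch of the mode-switch behaviour shown in the figure, assuming a simple task table; the data layout and the function name switch_to_emergency are illustrative, not the actual BIP resource manager.

    /* Illustrative mixed-criticality mode switch: on an emergency, LO-critical
       tasks are suspended so that cores and shared resources are left to the
       HI-critical tasks. All names here are hypothetical. */
    enum criticality { LO, HI };
    enum mode { NORMAL, EMERGENCY };

    struct task { const char *name; enum criticality crit; int suspended; };

    static enum mode current_mode = NORMAL;

    void switch_to_emergency(struct task tasks[], int n)
    {
        current_mode = EMERGENCY;
        for (int i = 0; i < n; i++)
            if (tasks[i].crit == LO)
                tasks[i].suspended = 1;   /* give the freed utilization to HI tasks */
    }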

Resource Manager in BIP: Example (figure)

Plan

Our approach to three dimensions of the problem:
- Resource management in a concurrency programming language – BIP
- Timing-critical system design with task dependencies – MoC
- Controlling multi-core interference at the task level – interference model

Design flow
Conclusions and Future Work

Task Dependency Patterns (Models of Computation)

Common scheduling practice = Independent Task Model (ITM)
- however, the tasks are functionally dependent
- therefore the schedule needs manual adjustments (problematic)

The most expressive task model = Task Automata Model (TAM)
- can model virtually any dependencies
- problem: TAM schedulability analysis does not scale

Our approach: ITM < Dependency Pattern (MoC) < TAM
- we use a TAM language ("BIP") to represent dependency patterns
- this yields a scalable (though MoC-specific) schedulability analysis

Our Current MoC: FPPN

(Recap) Common scheduling practice = Independent Task Model (ITM): the tasks are functionally dependent, so the schedule needs manual adjustments. The most expressive task model = Task Automata Model (TAM): it can model virtually any dependencies, but TAM schedulability analysis (e.g. with UPPAAL) does not scale. Our approach: ITM < FPPN < TAM — we use the TAM language BIP and follow FPPN dependency patterns in it, which gives a scalable, though FPPN-specific, schedulability analysis.

Fixed Priority Process Network (FPPN):
- connected periodic / sporadic tasks
- access data samples in time-stamp- and precedence-consistent order
- figure: processes Source1, Stage1, Stage2, Sink1 connected in a pipeline, with periods of 100 ms and 200 ms

FPPN scheduling:
- static task graph
- precedence-aware utilization
- DAG scheduling heuristics
- end-to-end latency constraints
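As an illustration, the pipeline in the figure could be written down as the following process/channel tables; the concrete periods and priorities are assumptions, and this is not the actual FPPN or BIP syntax.

    /* Hypothetical tabular description of an FPPN: periodic processes with
       fixed priorities, connected by channels. Periods/priorities are assumed. */
    struct process { const char *name; int period_ms; int priority; };
    struct channel { const char *from, *to; };   /* data flows from -> to */

    static const struct process procs[] = {
        { "Source1", 100, 1 },   /* priorities fix the order of data accesses */
        { "Stage1",  100, 2 },
        { "Stage2",  100, 3 },
        { "Sink1",   200, 4 },
    };

    static const struct channel chans[] = {
        { "Source1", "Stage1" },
        { "Stage1",  "Stage2" },
        { "Stage2",  "Sink1"  },
    };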

Dependency Pattern Example

Two tasks sharing a buffer: A (T = 100, C = 60) and B (T = 200, C = 60).
Wanted schedule (access order to the shared buffer over the timeline 0–200): A1, B1, A2.
End-to-end A-to-B latency: Lmax = 120.
The dependency pattern defines the "language" of possible task sequences, e.g. (ABA)*.
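One way to see where Lmax = 120 comes from: assuming both tasks run on a single core and A is executed before B whenever both are ready (an assumption consistent with the wanted access order A1, B1, A2), the schedule and the resulting numbers are:

    \[ A_1: [0,60], \qquad B_1: [60,120], \qquad A_2: [120,180] \]
    \[ L_{\max} = \mathrm{finish}(B_1) - \mathrm{release}(A_1) = 120 - 0 = 120, \qquad
       U = \tfrac{60}{100} + \tfrac{60}{200} = 0.9 \]

All deadlines are met (A1 by 100, B1 and A2 by 200), and the latency of the data produced by A1 and consumed by B1 is exactly 120.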

Scheduling with Dependency Patterns (MoCs)

Figure: the task graph (e.g. tasks A [0,100] and B [0,200]) is fed to an offline scheduling tool, which extracts schedule constraints; at run time, the Resource Manager and the Dependency Pattern component coordinate the task controllers (TCtrl1–TCtrl4) of Task1–Task4.

Plan

Our approach to three dimensions of the problem:
- Resource management in a concurrency programming language – BIP
- Timing-critical system design with task dependencies – MoC
- Controlling multi-core interference at the task level – interference model

Design flow
Conclusions and Future Work

Interference

Interference occurs in multi-core systems:
- shared hardware resources: global bus, I/O, even the FPU, ...
- shared logical resources: critical regions in OS services

Fine-grain (sporadic) and coarse-grain (concentrated) interference:
- fine-grain: basic blocks
- coarse-grain: "superblocks"

Consequences:
- uncontrolled interference => tasks take more cycles than their WCET
- over-controlled interference => tasks run fully sequentially
- we choose something in the middle

Interference in Scheduling: the Coarse-Grained Interference Model

Coarse-grain interference: suppose a superblock is followed by an execution block; if two such jobs are scheduled at the same time, they interfere. Controlled interference: e.g. schedule J1 after J3 so that their resource-heavy phases do not overlap in time.

BIP representation: the logical resource is the "engine" (task controllers + RTE); a superblock is a BIP transition with an associated interference cost. Other coarse-grained blocks (e.g. heavy use of the L2 cache, FPU, etc.) get their own controller and are implemented as BIP transition(s). Fine-grained interference is not modeled; nevertheless, the time-triggered schedule improves predictability.

Figure: the Resource Manager, the Dependency Pattern and the task controllers (TCtrl1–TCtrl4) of Task1–Task4, mapped onto the processors and the engine.
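A rough sketch of the scheduling rule implied by this model: two jobs that both access the shared resource must not be scheduled to overlap on different cores (which is why J1 is moved after J3). The job representation and the overlap test below are assumptions for illustration only.

    /* Coarse-grained interference rule: jobs whose resource-heavy superblocks
       would run concurrently on different cores are considered to interfere,
       so the offline scheduler must separate them in time. */
    struct sched_job { int core; int start_ms, finish_ms; int uses_shared_res; };

    int interference_free(const struct sched_job *a, const struct sched_job *b)
    {
        if (a->core == b->core)
            return 1;                       /* same core: serialized anyway */
        if (!a->uses_shared_res || !b->uses_shared_res)
            return 1;                       /* at most one touches the resource */
        /* both use the shared resource on different cores: forbid time overlap */
        return (a->finish_ms <= b->start_ms) || (b->finish_ms <= a->start_ms);
    }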

Plan

Our approach to three dimensions of the problem:
- Resource management in a concurrency programming language – BIP
- Timing-critical system design with task dependencies – MoC
- Controlling multi-core interference at the task level – interference model

Design flow
Conclusions and Future Work

Design Flow

Figure: the SW Model (interface structure + functional C code) is the input; a task graph is extracted from it and given to the Offline Scheduler, which produces a static schedule and the resource-manager configuration; the Task Automata Generator (dependency patterns + tasks) and the BIP Compiler produce the task model + resource manager (BIP), which executes on the BIP RTE on the multi-core platform.

Novelties and limitations:
- task graph extraction
- interference model added to the task graph
- scheduling with the interference model (work in progress)
- task migration not yet supported in the BIP RTE (work in progress)

Application Specification in SW Model

Figure: an example application in which a "split" process feeds two downstream processes (A and B), all with a 25 ms period.

Task Graph Extraction

From the SW Model (interface structure + functional C code) a task graph is extracted:
- J1: A [1] (12 ms)
- J2: split [1] (1 ms)
- J3: B [1] (6 ms)
For every job Ji: arrival Ai = 0, deadline Di = 25 ms, interference cost = 1 ms.
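Written out as data, the extracted task graph of this example looks roughly as follows; the field names and struct layout are hypothetical, the numbers are the ones shown on the slide, and the precedence edges follow the split -> A, split -> B structure of the application.

    /* Extracted task graph of the example as a plain job table.
       Field names are hypothetical; values are taken from the slide. */
    struct graph_job {
        const char *name;
        int arrival_ms, deadline_ms, wcet_ms, interference_ms;
    };
    struct prec_edge { int from, to; };   /* indices into the job table */

    static const struct graph_job jobs[] = {
        /* name        A    D   C   interference cost */
        { "J1: A",     0,  25, 12,  1 },
        { "J2: split", 0,  25,  1,  1 },
        { "J3: B",     0,  25,  6,  1 },
    };

    static const struct prec_edge edges[] = {
        { 1, 0 },   /* split precedes A */
        { 1, 2 },   /* split precedes B */
    };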

Modeling Engine Interference in the Task Graph

Figure: the task-controller automaton (Start, Arrive with "reset x", Finish, Deadline "when [x = T]") is reflected in the task graph by inserting TC transition nodes arr_i (0), ddl_i (0) and fin_i (0) around each job node Ji (Ci); accesses to the BIP engine (or a shared resource) are modeled by inserting engine nodes eng_j with the interference cost next to the job nodes node_j (Cj) on core k of the timeline.

Static Schedule

Figure: the static schedule computed by the offline scheduler, with the interference model added.

Code Generation

Generated BIP components and their mapping to cores:
- core0: cPeriodicSourcesplit, cRelDlSinksplit, cPeriodicSourceprocessb, cRelDlSinkprocessb, cPeriodicSourceprocessa, cRelDlSinkprocessa, xmailbox_ins, xmailbox2_ins, Prec_split_processb, Prec_split_processa
- core1: split_ins, processa_ins
- core2: processb_ins

Flow: SW Model (interface structure + functional C code) -> Task Automata Generator (dependency patterns + tasks) -> BIP Compiler -> task model + resource manager (BIP) -> BIP RTE on the multi-core platform; the dependency pattern coordinates the task controllers (TCtrl1–TCtrl3) of Task1–Task3.

Simulation on Workstation

Flow: SW Model (interface structure + functional C code) -> Task Automata Generator (dependency patterns + tasks) -> BIP Compiler -> task model (BIP) -> BIP RTE on the multi-core platform.

Example run (console output):
assert@SW Modelvm:~/poplavko/MCS_TOOLS/SW Model2bip/examples/parallelproc/fppn$ ./fppn-dolc_SWModel.bip.x --realtime --limittime 25
Running "realtime" mode BIP multi-threaded engine.
Version: Engines: MixedCriticalSystems@6808
No. of atoms in system = 13
No. of threads in system = 3
Limit time = 25
====> LAUNCH: <====
scheduler [0] BIP_Top.cPeriodicSourceprocessb.Arrive|BIP_Top.cRelDlSinkprocessb.Arrive|BIP_Top.Prec_split_processb.ArriveB [0,0] eager
scheduler [0] BIP_Top.cPeriodicSourceprocessa.Arrive|BIP_Top.cRelDlSinkprocessa.Arrive|BIP_Top.Prec_split_processa.ArriveB [0,0] eager
scheduler [0] BIP_Top.cPeriodicSourcesplit.Arrive|BIP_Top.cRelDlSinksplit.Arrive|BIP_Top.Prec_split_processb.ArriveA|BIP_Top.Prec_split_processa.ArriveA [0,0] eager
scheduler [0] BIP_Top.cPeriodicSourcesplit.Start|BIP_Top.split_ins.Start|BIP_Top.Prec_split_processb.StartA|BIP_Top.Prec_split_processa.StartA [0,25ms] eager
scheduler [1ms211us] BIP_Top.split_ins.DOLC_write_p_PROC_XIF_Write_call|BIP_Top.xmailbox_ins.Write [_,_] eager
scheduler [1ms212us] BIP_Top.xmailbox_ins.WriteAck|BIP_Top.split_ins.DOLC_write_return [_,_] eager
USER_PROCESS> ---------------------------------------
USER_PROCESS> Split wrote 212.000000
scheduler [1ms261us] BIP_Top.split_ins.DOLC_write_p_PROC_XIF_Write2_call|BIP_Top.xmailbox2_ins.Write [_,_] eager
scheduler [1ms295us] BIP_Top.xmailbox2_ins.WriteAck|BIP_Top.split_ins.DOLC_write_return [_,_] eager
USER_PROCESS> Split wrote(2) 2120.000000

Execution on Target Platform

Same flow and component-to-core mapping as in the code-generation step:
- core0: cPeriodicSourcesplit, cRelDlSinksplit, cPeriodicSourceprocessb, cRelDlSinkprocessb, cPeriodicSourceprocessa, cRelDlSinkprocessa, xmailbox_ins, xmailbox2_ins, Prec_split_processb, Prec_split_processa
- core1: split_ins, processa_ins
- core2: processb_ins

Executing on LEON4.

Conclusions

- We have presented a work-in-progress design flow for mixed-critical multi-cores
- We have prototype tools and examples

Future work on multi-cores:
- finalize support of task migration
- finalize integration of the offline scheduler and the (online) resource manager
- integration with WCET measurement tools
- full support of bandwidth interference (beyond the coarse-grained interference model)
- integration with interference measurement tools
- handling cache interference