Efficient Execution of Mixed Application Workloads in a Hard Real-Time Multicore System
Marco Paolieri (BSC/UPC), RePP Workshop, October 15th

Presentation transcript:

1. Efficient Execution of Mixed Application Workloads in a Hard Real-Time Multicore System
Marco Paolieri (BSC/UPC), Eduardo Quiñones (BSC), Francisco J. Cazorla (BSC), Mateo Valero (BSC/UPC)
RePP Workshop, Grenoble, 15th October

2. Future Real-Time Systems
- Current real-time embedded systems require higher performance than current processors provide
  - The demand is driven by increasing safety, comfort, and the number and quality of services

3. Architecture
What about predictability?

4. Multicores in RTESs: disadvantages
- WCET analysis is harder for multicore processors than for single-core processors because of inter-thread interferences
- Inter-thread interferences when accessing shared resources make the execution time vary
- The execution time, and therefore the WCET, of a hard real-time task (HRT) depends on the mixed application workload it runs in
[Figure: timeline comparing WCET_a, ET_a,b and ET_a,c against the deadline d, with a deadline miss marked]
Where:
- WCET_a: WCET estimate of a without interferences
- d: deadline
- ET_a,b: execution time of a running with b
- ET_a,c: execution time of a running with c
Is it possible to use multicores to execute mixed application workloads?
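The point of slide 4 can be illustrated with a small Python sketch: a WCET estimate obtained without interferences may sit below the deadline while the actual execution time with a particular co-runner exceeds it. All cycle counts are hypothetical, and which co-runner causes the miss is an assumption made for illustration; `deadline_misses` is not a name from the presentation.

```python
def deadline_misses(execution_times: dict, deadline: int) -> list:
    """Return the workloads whose execution time exceeds the deadline."""
    return [name for name, et in execution_times.items() if et > deadline]


# Hypothetical values, in cycles (not taken from the slides).
wcet_a = 100    # WCET estimate of task a computed without interferences
deadline = 110  # deadline d of task a
execution_times = {
    "a with b": 105,  # ET_a,b: mild interference from co-runner b
    "a with c": 120,  # ET_a,c: heavy interference from co-runner c
}

# The isolation estimate looks safe, yet one workload misses the deadline.
print(wcet_a <= deadline, deadline_misses(execution_times, deadline))
```

The sketch only shows that an estimate ignoring interferences is not a safe bound once the task shares resources with other tasks.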

5. Our Goal
- Execute a mixed application workload efficiently by:
  - providing predictability to HRTs
  - maximizing the performance of non-hard real-time tasks (NHRTs) with the resources not used by HRTs

6. Regarding HRTs
- Our multicore architecture guarantees by design that the maximum delay a request to a shared resource may suffer due to inter-thread interferences is bounded by an Upper Bound Delay (UBD)
  - Inter-thread interference delay ≤ UBD
- Round-robin arbitration provides a UBD based on the number of requestors:
  UBD = (N_HRT - 1) × L_BUS
  where L_BUS is the latency of the bus and N_HRT is the total number of HRTs running at the same time
- The impact of inter-thread interferences on the WCET is up to 40%
[Paolieri et al., Hardware Support for WCET Analysis of Multicore Hard Real-Time Systems, ISCA'09]
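As a minimal sketch of the bound on slide 6, the Python below computes UBD = (N_HRT - 1) × L_BUS and cross-checks it against a toy round-robin arbiter in which every HRT issues a request in the same cycle. The function names, the toy arbiter, and the configuration (4 HRTs, a 2-cycle bus) are illustrative assumptions, not details from the presentation.

```python
def upper_bound_delay(n_hrt: int, bus_latency: int) -> int:
    """Return UBD = (N_HRT - 1) * L_BUS, in cycles."""
    if n_hrt < 1 or bus_latency < 0:
        raise ValueError("need n_hrt >= 1 and bus_latency >= 0")
    return (n_hrt - 1) * bus_latency


def round_robin_waits(n_hrt: int, bus_latency: int) -> list:
    """Toy arbiter: all HRTs request at cycle 0 and are granted in turn.

    Returns how many cycles each requestor waits before its grant.
    """
    waits = []
    cycle = 0
    for _ in range(n_hrt):
        waits.append(cycle)   # this requestor is granted the bus now
        cycle += bus_latency  # the bus stays busy for one transfer
    return waits


# Hypothetical configuration: 4 HRTs sharing a bus with a 2-cycle latency.
ubd = upper_bound_delay(4, 2)
waits = round_robin_waits(4, 2)
print(ubd, waits)
```

In this toy scenario the last requestor waits for the other N_HRT - 1 transfers, so its wait equals the UBD and no requestor waits longer.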

7. What about Non Real-Time Tasks?
- HRTs access shared resources as soon as they are available, which we call Average-Case Resource Management (AC-RM)
  - NHRTs may starve before they can access shared resources
- To execute a mixed application workload efficiently
  - the performance of NHRTs must be maximized

8. Our Proposal
- There is no advantage in HRTs finishing before their WCET
- We evaluate a resource management policy called Worst-Case Resource Management (WC-RM)
  - Every access to a shared resource from an HRT is stalled by UBD cycles
  - This forces HRTs to execute closer to their WCET
  - It provides performance to NHRTs while still guaranteeing that HRTs meet their deadlines
- By reducing the difference between the average-case execution time and the WCET, the performance of NHRTs can increase
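The effect of the two policies on a single HRT can be sketched in Python with a simplified additive timing model: execution time is compute cycles plus, for each shared-resource access, the bus latency plus a stall. Under AC-RM the stall is the interference actually suffered; under WC-RM it is always UBD cycles. The model, the function names, and all numbers are assumptions for illustration. The sketch does not model NHRT throughput or the hardware described in the presentation.

```python
def hrt_execution_time(compute_cycles, bus_latency, ubd, delays, policy):
    """Execution time of one HRT in a simplified additive timing model.

    delays: actual interference, in cycles, seen by each shared-resource access.
    policy: "AC-RM" uses the actual delay; "WC-RM" always stalls UBD cycles.
    """
    if any(d < 0 or d > ubd for d in delays):
        raise ValueError("each delay must lie in [0, UBD]")
    if policy == "AC-RM":
        stall = sum(delays)
    elif policy == "WC-RM":
        stall = ubd * len(delays)
    else:
        raise ValueError(f"unknown policy: {policy}")
    return compute_cycles + len(delays) * bus_latency + stall


def wcet_estimate(compute_cycles, bus_latency, ubd, n_accesses):
    """WCET estimate: every access is assumed to suffer the full UBD."""
    return compute_cycles + n_accesses * (bus_latency + ubd)


# Hypothetical task: 100 compute cycles, 4 accesses, L_BUS = 2, UBD = 6.
delays = [0, 2, 6, 4]
ac = hrt_execution_time(100, 2, 6, delays, "AC-RM")
wc = hrt_execution_time(100, 2, 6, delays, "WC-RM")
print(ac, wc, wcet_estimate(100, 2, 6, len(delays)))
```

In this idealized model the WC-RM execution time coincides with the WCET estimate whatever the co-runners do, while the AC-RM time varies with the observed delays but never exceeds the estimate.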

9. How It Works

10. Experimental Setup
- As HRT benchmarks we use:
  - A real HRT application provided by Honeywell
  - 3D collision avoidance algorithm
  - EEMBC Automotive
- As NHRT benchmarks we use:
  - MediaBench, MiBench, SPEC CPU 2006

11. Results
- Baseline: throughput of NHRTs using AC-RM

12. Effect on HRTs
- Baseline: performance of HRTs using AC-RM
- With WC-RM, the execution time of HRTs is within 1 to 2% of their WCET estimation

13. Conclusions
- Our architecture is designed from a WCET point of view
- It implements a hardware shared-resource management policy called Worst-Case Resource Management (WC-RM)
- It improves the performance of NHRTs when running in a mixed application workload

14. Thanks for the attention!