Institut für Datentechnik und Kommunikationsnetze: Analysis of Shared Coprocessor Accesses in MPSoCs, Overview. Bologna, 22.05.2006. Simon Schliecker, Matthias Ivers, Rolf Ernst.

Presentation transcript:

Analysis of Shared Coprocessor Accesses in MPSoCs: Overview
Institut für Datentechnik und Kommunikationsnetze
Bologna, 22.05.2006
Simon Schliecker, Matthias Ivers, Rolf Ernst

2 System Setup
[Block diagram: streaming processors and a control process connected to a shared memory / coprocessor.]

3 New Task Model
Classical task model: read all data when activated; output all data when finished.
"Communicating task" model: read some data when activated; access the shared resource multiple times during execution (synchronization still only at task activation); output some data when finished.
A communicating task is characterized by its core execution time and the set of incurred transactions; see the sketch below.
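To make the distinction concrete, a minimal C sketch, assuming hypothetical platform hooks (read_input, write_output, and a blocking coprocessor call cop_request) that are not part of the slides:

```c
#include <stdint.h>

/* Hypothetical platform hooks; names are placeholders, not from the slides. */
extern uint32_t read_input(void);              /* read activating data              */
extern void     write_output(uint32_t value);  /* emit result when finished         */
extern uint32_t cop_request(uint32_t operand); /* blocking shared-coprocessor call  */

/* Classical task: all communication happens at activation and completion. */
void classical_task(void)
{
    uint32_t x = read_input();   /* read all data when activated            */
    x = x * 3u + 1u;             /* core execution, no shared accesses      */
    write_output(x);             /* output all data when finished           */
}

/* Communicating task: several shared-resource transactions during execution. */
void communicating_task(void)
{
    uint32_t x = read_input();   /* read some data when activated           */
    x = cop_request(x);          /* transaction 1 to the shared coprocessor */
    x = x + 7u;                  /* local computation (core execution time) */
    x = cop_request(x);          /* transaction 2 to the shared coprocessor */
    write_output(x);             /* output some data when finished          */
}
```

The response time of communicating_task depends not only on its core execution time but also on the latency of each incurred transaction, which is what the following slides analyze.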

4 Effects on Coprocessor Accesses
[Timeline over the WCRT: executed instructions of task A on the CPU, interleaved with bus transfers (e.g. volume-dependent), coprocessor processing (e.g. load-dependent), load from other system parts, and rescheduling effects.]
Example instruction stream with interleaved coprocessor calls:
byte x = 10; long y = 10;
z = COP(y); x = x * z; y = y / z;
z = COP(x); x = x * z; y = y / z;
z = x + y;

5 Analysis Steps
[System diagram as on slide 2: streaming processors, shared memory / coprocessor, control process.]
Quantities exchanged between the analysis steps: output event models, request latencies, and local response times.

6 Analysis Distribution
Local level (per resource): task analyses, response time analysis, output traffic analysis.
System level: request latency analysis.
Highlighted: the modifications required for communicating tasks.

7 Correlated Event Total Busy Times
Not every request experiences the worst case in dynamic systems!
Possible solution: assume the "worst case for every request".
Problem: the exact request times cannot be predicted. Variable times of interference plus variable times of requests lead to highly variable and sensitive request latencies!
[Gantt-style sketch: interference stream, request stream, sender.]

8 Correlated Event Total Busy Times
Deriving individual bounds is difficult, and using individual bounds in the analysis is difficult, but ignoring it is easy!
Possible solution: assume the "worst case for every request".
Better solution: approximate the "maximum total busy time": bound the total interference and all requests within one total time window → "total busy time". No remaining overestimation given interference with large jitter! A sketch of the bound follows below.
[Gantt-style sketch: interference stream, request stream, sender.]
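A minimal sketch of the two bounds in standard busy-window notation; the symbols (l^max for a single request's worst-case latency, C_req for its processing time, eta_j^+ for the input event model of an interfering stream j, C_j for its per-event load) are assumptions, not the slide's own notation:

```latex
% Per-request bound: each of the n requests pays the worst-case latency
\sum_{k=1}^{n} l^{\max} \;=\; n \cdot l^{\max}

% Total-busy-time bound: all n requests and all interference share one window B(n),
% obtained as the smallest fixed point of
B(n) \;=\; n \cdot C_{\mathrm{req}} \;+\; \sum_{j \in \mathrm{interference}} \eta_j^{+}\bigl(B(n)\bigr)\cdot C_j

% With large jitter, an interferer can inject its burst only once into the shared
% window, so B(n) \le n \cdot l^{\max}, typically with a large margin.
```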

9 Extended Component Analysis
1. Specify the environmental model.
2. Distribute input event models.
3. Per component: 3a. derive possible transactions; 3b. calculate transaction latencies; 3c. derive task response times and output event models.
4. Compare the output event models to the input assumptions: if they differ (≠), iterate; if they match (=), stop.
Repeat until convergence or non-schedulability. A sketch of this loop is given below.
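A minimal C sketch of this fixed-point iteration, assuming hypothetical helper functions (specify_environment, distribute_event_models, analyze_component, event_models_equal, system_schedulable) and a simple period/jitter event-model representation, none of which are fixed by the slides:

```c
#include <stdbool.h>

#define MAX_COMPONENTS 8

/* Simple event-model representation (an assumption, not from the slides). */
typedef struct {
    double period;   /* activation period */
    double jitter;   /* activation jitter */
} event_model_t;

/* Hypothetical hooks standing in for the analyses named on the slide. */
extern void          specify_environment(event_model_t inputs[], int n);        /* step 1 */
extern void          distribute_event_models(event_model_t inputs[], int n);    /* step 2 */
extern event_model_t analyze_component(int comp, const event_model_t inputs[], int n);
                     /* steps 3a-3c: derive transactions, transaction latencies,
                        task response times, and the output event model         */
extern bool          event_models_equal(const event_model_t a[], const event_model_t b[], int n);
extern bool          system_schedulable(void);

bool extended_component_analysis(int n_components)
{
    event_model_t inputs[MAX_COMPONENTS];
    event_model_t outputs[MAX_COMPONENTS];

    specify_environment(inputs, n_components);                /* 1. environmental model     */

    for (;;) {
        distribute_event_models(inputs, n_components);        /* 2. distribute event models */

        for (int c = 0; c < n_components; ++c)                 /* 3a-3c per component        */
            outputs[c] = analyze_component(c, inputs, n_components);

        if (!system_schedulable())                             /* stop on non-schedulability */
            return false;

        if (event_models_equal(inputs, outputs, n_components))
            return true;                                       /* 4. outputs match inputs    */

        for (int c = 0; c < n_components; ++c)                 /* otherwise iterate          */
            inputs[c] = outputs[c];
    }
}
```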

10 Response Time (if the processor stalls)
Observation 1: Memory accesses delay the task response (can be handled by an increase of the CET).
Observation 2: Priority inversion is possible during a memory access, bounded by the maximum access time.
A sketch of both observations in equation form follows below.
[Two Gantt charts comparing tasks T1, T2, T3 with and without the stalling access.]
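As a sketch only; the symbols (N_i for the number of accesses of task i, t^max_acc for the maximum access time) are assumptions rather than the slide's notation:

```latex
% Observation 1: if the processor stalls, each of the N_i memory/coprocessor accesses
% of task i adds at most the maximum access time to the core execution time.
C_i' \;=\; C_i \;+\; N_i \cdot t^{\max}_{\mathrm{acc}}

% Observation 2: a higher-priority task arriving while a lower-priority task holds the
% processor during such an access is delayed by at most one further t^{\max}_{\mathrm{acc}}
% (priority inversion), which enters the analysis as a blocking term.
```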

11 Scheduling Analysis
If the processor stalls during memory requests:
– The processor is NOT released; this extends the CET.
– Higher-priority tasks can be blocked by the maximum memory access time.
– The buffer is always empty, because previous requests have finished.
[Annotated response-time equation with the terms: CET and CoP times of the task, activations of each higher-priority task with their CET and CoP times, and the total blocking time; see the sketch below.]
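A sketch of the response-time bound these annotations describe; the notation (C_i for the CET, S_i for the task's total CoP access times, eta_j^+ for the activations of a higher-priority task j, B_i for the total blocking time) is assumed, not taken from the slide:

```latex
% Smallest fixed point of:
R_i \;=\; C_i + S_i \;+\; B_i \;+\; \sum_{j \in hp(i)} \eta_j^{+}(R_i)\,\bigl(C_j + S_j\bigr)
% C_i + S_i: CET plus the task's own CoP access times
% B_i: total blocking time (bounded by the maximum memory access time)
% \eta_j^{+}(R_i): activations of higher-priority task j within R_i,
%                  each contributing its CET and CoP times C_j + S_j
```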

12 Scheduling Analysis (2)
The more requests are considered together, the smaller the overestimation!
– Collect all requests that can lead to a delay of task i and add their maximum total busy time.
– A perfect match for improved path latencies.
[Equation annotation: total CoP times that can lead to a delay of task i; see the sketch below.]
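A sketch of the improved bound under the same assumed notation as above, with N_i and N_j denoting the number of coprocessor requests per activation of task i and of a higher-priority task j, and B_CoP the total-busy-time function of slide 8:

```latex
% Instead of adding a per-activation CoP term, all requests that can delay task i
% within R_i are collected and bounded jointly by one total busy time:
R_i \;=\; C_i + B_i + \sum_{j \in hp(i)} \eta_j^{+}(R_i)\,C_j
        \;+\; B_{\mathrm{CoP}}\!\Bigl(N_i + \sum_{j \in hp(i)} \eta_j^{+}(R_i)\,N_j\Bigr)
% The argument of B_{CoP} is the total number of requests that can lead to a delay
% of task i within the window R_i.
```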

13 Multithreading from a Real-Time Perspective
Generally:
– stalling decreases, and
– processor utilization increases.
But: additional interference for requests.
– Interference from previous requests can completely compensate the gain of reduced stalling!
– A task can be fully delayed by higher-priority execution and requests.
– FCFS ordering along the request chain counters the priorities on the sending resource → no gain for the response time under the given task assumptions.
[Diagram: sending resource, FCFS request processing.]

14 Additional Critical Sections
How much blocking time has to be taken into account?
– Blocking memory accesses can be "nested" into critical sections (not the other way around).
– Assume a virtual semaphore "memory": all tasks require "memory" to be free to start executing; some tasks spend no time accessing "memory", but still must wait until it is free; other tasks access "memory" and may enter the critical section multiple times.
– Memory accesses are "automatically" protected with the highest priority!
– The problem maps to the "nested critical sections" problem (Sha, Rajkumar); the resulting blocking bound depends on the protocol used (PIP, PCP). A sketch of the standard bounds is given below.
[Gantt sketch: T1, T2, a task inside a critical section, a high-priority task blocked.]
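For reference, a sketch of the classical blocking-time bounds from the priority-inheritance literature; the notation (d_{k,s} for the duration of task k's critical section on semaphore s, lp(i) for the tasks with lower priority than i, ceil(s) for the priority ceiling of s) is assumed here:

```latex
% Priority Inheritance Protocol (PIP): task i can be blocked once per lower-priority
% task that uses a semaphore capable of blocking it.
B_i^{\mathrm{PIP}} \;\le\; \sum_{k \in lp(i)} \; \max_{s \,:\, \lceil s \rceil \ge prio(i)} d_{k,s}

% Priority Ceiling Protocol (PCP): task i is blocked at most once, by the single
% longest such critical section of any lower-priority task.
B_i^{\mathrm{PCP}} \;\le\; \max_{k \in lp(i)} \; \max_{s \,:\, \lceil s \rceil \ge prio(i)} d_{k,s}
```

With memory accesses treated as critical sections on the virtual semaphore "memory", these bounds determine how much additional blocking the scheduling analysis has to account for.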