CMT OS scheduling summary Yipkei Kwok 03/18/2008.


Homogeneous Scheduling
Where are we now?
– Fairness/performance isolation
    Cache-fair thread scheduling for multicore processors
    Performance of multithreaded chip multiprocessors and implications for operating system design
– Achievement
    24% reduction in performance variability

Homogeneous Scheduling
Where are we now?
– Performance optimization
    A Non-Work-Conserving Operating System Scheduler for SMT Processors
    Symbiotic Job Scheduling for a Simultaneous Multithreading Architecture
– Achievement
    Performance improvement of up to 45%

Homogeneous Scheduling
Uninvestigated problem
– Most techniques were developed for small-scale machines (e.g. 18 threads for SPEC2000 and 8 hardware contexts)
– Do they work well on large-scale CMT machines?
– Some techniques require virtually no overhead with hardware support
    Fine
– Some co-scheduling techniques examine nCr combinations (n = no. of workloads, r = degree of parallelism)
– New Moore's law (joke): the no. of cores doubles every 18 months
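To make the nCr concern concrete, here is a minimal sketch that counts how many co-schedule groups an exhaustive search would have to examine. The function name and the two larger machine sizes are assumptions for illustration; only the 18-thread, 8-context case comes from the slide.

```python
from math import comb


def coschedule_candidates(n_workloads: int, degree: int) -> int:
    """Count the distinct groups of `degree` workloads drawn from `n_workloads` (nCr)."""
    return comb(n_workloads, degree)


if __name__ == "__main__":
    # (18, 8) is the slide's small-scale example; the other sizes are hypothetical.
    for n, r in [(18, 8), (36, 16), (72, 32)]:
        print(f"n={n:3d} r={r:3d} -> {coschedule_candidates(n, r):,} candidate groups")
```

Even the small-scale case already yields tens of thousands of groups, so a technique that evaluates each one would likely need sampling or pruning on a larger CMT machine.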

Homogeneous Scheduling
Unsolved problem
– Desktop workload optimization
Why is it hard?
– Short running time
– Unpredictable behavior
– Limited level of parallelism
– Light
Desktop workloads are becoming heavier and longer-running
App-hinted OS scheduling
– Thread Scheduling for Multi-Core Platforms. Mohan Rajagopalan, Brian T. Lewis, Todd A. Anderson. HotOS '07.
App developers are forced to explore parallelism.

Heterogeneous Scheduling
Where are we now?
– A couple of papers
– Limited to cores with the same ISA but different speeds
– Can you find one in the market?

Heterogeneous Scheduling
Scheduling objectives (same ISA)
– Performance optimization
– Fair CPU sharing
– Balanced core assignment
Scheduling objectives (different ISA)
– Unidentified
– Balanced core assignment not applicable
– How do you avoid performance jitter?

Heterogeneous Scheduling
More problems
– Preferred core dilemma
– Follow or not follow the preference when the preferred core has a long queue?
    Follow -> long response time
    Not follow -> sub-optimal performance
– How does the decision affect the overall performance?
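The follow/not-follow trade-off can be illustrated with a toy cost model. This is only a sketch under stated assumptions: the `Core` fields, the speed and queue values, and the greedy "earliest estimated finish" rule are all hypothetical and are not taken from the slides or the cited papers.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    speed: float        # relative speed, 1.0 = baseline core (assumed unit)
    queued_work: float  # work already queued, in baseline-core seconds (assumed unit)


def estimated_finish(core: Core, work: float) -> float:
    """Queue wait plus the thread's own run time, both scaled by core speed."""
    return (core.queued_work + work) / core.speed


def choose_core(preferred: Core, alternatives: list[Core], work: float) -> Core:
    """Follow the preference unless an alternative is estimated to finish sooner."""
    best = preferred
    for core in alternatives:
        if estimated_finish(core, work) < estimated_finish(best, work):
            best = core
    return best


if __name__ == "__main__":
    slow = Core("slow", speed=1.0, queued_work=1.0)
    # Long queue on the preferred fast core: not following gives a shorter response time.
    busy_fast = Core("fast", speed=2.0, queued_work=10.0)
    print(choose_core(busy_fast, [slow], work=2.0).name)
    # Short queue on the preferred fast core: following the preference wins.
    idle_fast = Core("fast", speed=2.0, queued_work=2.0)
    print(choose_core(idle_fast, [slow], work=2.0).name)
```

This rule decides greedily for one thread at a time, so it does not answer the slide's last question: a choice that shortens one thread's response time may still lower overall throughput.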