
CMPT 886: Special Topics in Operating Systems and Computer Architecture
Dr. Alexandra Fedorova, School of Computing Science, SFU

Meet the Instructor
- Ph.D. in Computer Science from Harvard, 2006
- Dissertation on operating system design for multicore processors
- Concurrently with the Ph.D., an intern at Sun Labs (3 years)
- 9 US patent applications
- First semester at SFU: Spring 2007
- Industrial partnership with Sun Microsystems

Course Topic
- Multicore processors: a new type of computer architecture
- Dominates the new processor market: desktops, servers, mobile devices, etc.
- Almost all chips will be multicore soon
- Many research problems to solve:
  - How to design software for these chips?
  - How to design the chips themselves?
  - How to structure hardware/software interaction?

Today
- Introduction to multicore processors
- Examples of research problems
- Overview of the course

Conventional vs. Multicore
[diagram: a single-core processor with its own L1 and L2 caches, next to a dual-core processor with per-core L1 caches and a shared L2 cache]
- Conventional processor: single core, dedicated caches, one thread at a time
- Multicore processor: at least two cores, shared caches, many threads running simultaneously

The Multicore Revolution
- Most new processors are multicore (intel.com):
  - 2006: 75% of shipped desktop processors, 85% of server processors
  - 2007: 90% for desktop and mobile, 100% for servers
- Everyone's doing it:
  - Sun Microsystems: Rock, Niagara 1, Niagara 2
  - IBM: Power4, Power5, Power6, Cell
  - AMD: Quad Core (Barcelona)
  - Embedded: ARM

Why Multicore?
- Power consumption is a huge problem
- Multicore chips can potentially deliver much more computation per unit of power

Superior Performance/Watt
- Example: reduce CPU clock frequency by 20%; power consumption drops by about 50%
- Put two such 0.8x-frequency cores on the same chip
- Result: 1.6x the computation at the same power consumption
[diagram: Core 0 and Core 1, each at 0.8x frequency and 0.5x power, with per-core L1 caches and a shared L2 cache]
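Why a 20% frequency cut roughly halves power is worth spelling out. A back-of-the-envelope derivation, under the standard assumption (mine, not the slide's) that dynamic CMOS power scales as P ∝ C·V²·f and that supply voltage V can be scaled roughly in proportion to frequency f:

    P \propto C V^2 f, \qquad V \propto f \;\Rightarrow\; P \propto f^3

    \frac{P(0.8f)}{P(f)} = 0.8^3 \approx 0.51 \quad \text{(one core at 80\% frequency uses about half the power)}

    \text{Two such cores: } 2 \times 0.8 = 1.6\times \text{ throughput at } 2 \times 0.51 \approx 1.0\times \text{ power}

The same cube law gives 1.2³ ≈ 1.73, which is where the ≈75% power increase quoted two slides below comes from.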

Why Multicore?
- Increasing processor clock speed (GHz) is inefficient
- Increase clock speed by 20% and power increases by ≈75%
- How much does performance increase?

Multicore vs. Unicore
- Multicore: 1.6x throughput increase, no increase in power consumption
- Single-core: at most 1.2x throughput increase, 1.75x power increase

- Transistor density is still rising; clock speed isn't
- The extra transistors are used for parallelism: multicore processors
(Source: Sutter, "The Free Lunch Is Over")

Multicore Potential
- Multicores offer the potential to compute more efficiently
- Applications and systems are not yet ready to realize that potential
- What needs to be done?
  - A fundamental shift to parallel programming
  - New ways to manage resources in the operating system

What's Important to Remember?
[diagram: Core 0 and Core 1, each with an L1 cache, sharing an L2 cache]
- Massive parallelism. Good or bad?
  - Good: we can use the processor more efficiently
  - Bad: we don't know how to make the most of it
- Shared resources. Good or bad?
  - Execution: functional units, queues, register files
  - Memory: L1 cache, L2 cache, interconnects
  - Good: more efficient resource utilization (the reason for multicore)
  - Bad: contention for resources

Problems Addressed in Research
- How to manage resource allocation?
  - Operating system solutions
  - Architectural (hardware) solutions
- How to take advantage of parallelism?
  - Make concurrent programming easier (languages, performance tools, etc.)
  - Make concurrent programming automatic (automatic parallelization)

Managing Resource Allocation
- New OS structures
- Extensions to hardware architecture
- Analytical performance modeling
- New ways to write applications: can the application tell the OS how it uses resources?
- New algorithms (attention, theoreticians and AI researchers!)

Operating Systems for Multicore Processors
[diagram: threads A, B, and C scheduled across Core 0 and Core 1, each core with an L1 cache, sharing an L2 cache]
- A is a database application (needs lots of L1 cache)
- B is a web server (needs lots of L1 cache)
- C is a cryptographic thread (needs little L1 cache)
- Threads running concurrently compete for resources
- The degree of contention depends on what the threads are doing

Challenges
[diagram: threads A, B, C competing for Core 0 and Core 1, per-core L1 caches, shared L2 cache]
- How to find out threads' resource requirements?
- How to find out whether threads will compete?
- How to find out how much contention hurts performance?
- What is the best way to schedule threads? (A sketch of one possible heuristic follows below.)
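One common answer in the research literature is to classify threads by how cache-hungry they are and co-schedule complementary threads on cores that share a cache. The sketch below is a hypothetical illustration of that idea, not code from this course: the thread descriptor, the miss-rate field, and the threshold rule are all assumptions made up for this example.

    #include <stdio.h>

    /* Hypothetical thread descriptor: cache appetite summarized by a
     * measured L1 miss rate (misses per 1000 instructions).  In a real
     * OS this would come from hardware performance counters. */
    struct thread_info {
        const char *name;
        double l1_mpki;   /* L1 misses per 1000 instructions */
    };

    /* Toy threshold-based classifier, for illustration only. */
    static int cache_hungry(const struct thread_info *t)
    {
        return t->l1_mpki > 10.0;   /* assumed threshold */
    }

    /* Pair a cache-hungry thread with a cache-light one on cores that
     * share an L2, so they do not fight over the same cache space. */
    static void co_schedule(const struct thread_info *a,
                            const struct thread_info *b)
    {
        if (cache_hungry(a) && cache_hungry(b))
            printf("%s + %s: likely contention; keep apart\n",
                   a->name, b->name);
        else
            printf("%s + %s: complementary; share the L2\n",
                   a->name, b->name);
    }

    int main(void)
    {
        struct thread_info db  = { "database",  25.0 };  /* needs lots of L1 */
        struct thread_info web = { "webserver", 20.0 };  /* needs lots of L1 */
        struct thread_info cry = { "crypto",     1.5 };  /* needs little L1 */

        co_schedule(&db, &web);   /* bad pairing: both cache-hungry */
        co_schedule(&db, &cry);   /* good pairing: complementary */
        return 0;
    }

The hard part, of course, is the slide's first question: obtaining the miss rates and predicting contention from them, which is exactly what the research addresses.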

Problems Addressed in Research
- How to manage resource allocation?
  - Operating system solutions
  - Architectural (hardware) solutions
- How to take advantage of parallelism?
  - Make concurrent programming easier (languages, performance tools, etc.)
  - Make concurrent programming automatic (automatic parallelization)

Support for Concurrent Programming
- Writing parallel code is difficult:
  - Most people think serially
  - Deciding how to divide the work between threads is not always trivial
  - Parallel entities need to synchronize or communicate
- We need a new paradigm for synchronization

Synchronization Hurts Performance
[diagram: multiple threads contending for a lock on shared data]
- If the lock is not available, threads wait
- Execution becomes serialized

Coarse vs. Fine Synchronization

Coarse-grained version: one lock around the whole loop.

    int update_shared_counters(int *counters, int n_counters)
    {
        int i;
        coarse_lock_acquire(counters_lock);      /* one big lock */
        for (i = 0; i < n_counters; i++)
            counters[i]++;
        coarse_lock_release(counters_lock);
        return 0;
    }

Fine-grained version: one lock per counter.

    int update_shared_counters(int *counters, int n_counters)
    {
        int i;
        for (i = 0; i < n_counters; i++) {
            fine_lock_acquire(counter_locks[i]); /* per-counter lock */
            counters[i]++;
            fine_lock_release(counter_locks[i]);
        }
        return 0;
    }

- Coarse locks are easy to program, but perform poorly
- Fine locks perform well, but are difficult to program

Transactional Memory to the Rescue!
- Can we have the best of both worlds?
  - Good performance
  - Ease of programming
- The answer: Transactional Memory (TM)

Transactional Memory (TM)
- Programming model:
  - Extension to the language
  - Runtime and/or hardware support
- Lets you synchronize without locks:
  - Performance of fine-grained locks
  - Ease of programming of coarse-grained locks

Transactional Memory vs. Locks

The same counter update, with all the locks replaced by a transactional section:

    int update_shared_counters(int *counters, int n_counters)
    {
        int i;
        ATOMIC_BEGIN();                /* start transactional section */
        for (i = 0; i < n_counters; i++)
            counters[i]++;
        ATOMIC_END();                  /* commit, or abort and retry on conflict */
        return 0;
    }

- The transactional section looks like a coarse-grained lock
- But it acts like fine-grained locks: performance degrades only if there is a conflict

The Backend of TM
[diagram: two concurrent transactions with overlapping read/write sets (read A, write B, read B, write A, read C, write C, read D, write D, read E, write E); transactions whose sets conflict abort and restart]
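As a rough illustration of what the diagram conveys: a TM backend tracks each transaction's read and write sets and aborts one transaction when the sets conflict. The sketch below is a simplified, hypothetical model of that check; real TM systems track conflicts per word or cache line with versioning or hardware support, and this is not the design of Rock or any product.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Simplified transaction record: which addresses were read and
     * written during the transaction. */
    #define MAX_SET 16

    struct txn {
        const void *read_set[MAX_SET];   size_t n_reads;
        const void *write_set[MAX_SET];  size_t n_writes;
    };

    static bool in_set(const void *const set[], size_t n, const void *addr)
    {
        for (size_t i = 0; i < n; i++)
            if (set[i] == addr)
                return true;
        return false;
    }

    /* Two transactions conflict if one writes an address the other
     * reads or writes.  On conflict, one must abort and restart. */
    static bool conflict(const struct txn *a, const struct txn *b)
    {
        for (size_t i = 0; i < a->n_writes; i++) {
            const void *addr = a->write_set[i];
            if (in_set(b->read_set, b->n_reads, addr) ||
                in_set(b->write_set, b->n_writes, addr))
                return true;
        }
        for (size_t i = 0; i < b->n_writes; i++)
            if (in_set(a->read_set, a->n_reads, b->write_set[i]))
                return true;
        return false;
    }

    int main(void)
    {
        int A, B;
        /* Mirrors the diagram: txn 1 reads A and writes B,
         * txn 2 reads B and writes A -- a classic conflict. */
        struct txn t1 = { .read_set  = { &A }, .n_reads  = 1,
                          .write_set = { &B }, .n_writes = 1 };
        struct txn t2 = { .read_set  = { &B }, .n_reads  = 1,
                          .write_set = { &A }, .n_writes = 1 };

        puts(conflict(&t1, &t2) ? "Abort! Restart one transaction."
                                : "No conflict: both commit.");
        return 0;
    }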

State of TM
- Still evolving: more work is needed to make it usable and well-performing
- But it is very real:
  - Sun's new Rock processor has TM support
  - Intel is very active in this area

Summary
- Multicore systems are everywhere: servers, desktops, small devices
- We must understand them
- Plenty of research on multicore systems:
  - System software (OS, compilers, runtimes)
  - Architecture
  - Analytical modeling
  - Applications

Break for introductions…

Class Structure
- Learn about multicore research:
  - Read and critique papers
  - Paper summaries, presentations
- Learn how to do multicore research:
  - Discuss papers, think about new ideas
  - Analyze papers
  - Learn how to use research tools (2 homeworks)
- Do multicore research:
  - A research project

Research Project
- A unique experience: taking a project from start to finish
- Goal: generate a publication
  - Last year: two publications out of four projects
- Gives you confidence as a grad student
- Improves your resume
- Challenging! You will learn a lot!

Your Expectations
- Expect to work hard, but you'll be glad you did this later
- Papers will be difficult to read at first (3-5 hours per paper); it gets easier
- Reward: you will be comfortable leading your own research in this area

Final Project
- You can create your own topic, or choose from a list of existing topics
- Some projects are very well specified (like an undergraduate course project)
- Others are more open-ended (hint: an opportunity to be creative)
- We have the systems and tools you'll need for the project

Final Project (cont.)
- Submit a project proposal in early February
- Complete the project by early April
- You have only two months, so you will have to work hard
- Expect to dedicate ≈15-20 hrs/week

Will I Succeed in this Course?
- You have to work independently!
  - Take full responsibility for your project
  - I will help, but I cannot do it for you
  - I do not have all the answers
- You will succeed if you are prepared to work hard
  - What you can or cannot do now does not matter
  - The course is designed to train you

Course Web Site
- Syllabus
- Wiki
- Multicore portal
- Technical documentation