© 2004 Wayne Wolf, Overheads for Computers as Components 2e

Overview
- Why multiprocessors?
- The structure of multiprocessors.
- Elements of multiprocessors:
  - Processing elements.
  - Memory.
  - Interconnect.
Why multiprocessing?
- True parallelism:
  - Task level.
  - Data level.
- May be necessary to meet real-time requirements.
Multiprocessing and real time
- Faster-rate processes are isolated on their own processors.
  - Specialized memory system as well.
- Slower-rate processes share a processor (or processor pool).
[Figure: two CPU+memory nodes; one dedicated to the print engine, the other shared by file reading, rendering, etc.]
Heterogeneous multiprocessors
- Will often have a heterogeneous structure.
  - Different types of PEs.
  - Specialized memory structure.
  - Specialized interconnect.
Multiprocessor system-on-chip
- Multiple processors.
  - CPUs, DSPs, etc.
  - Hardwired blocks.
  - Mixed-signal.
- Custom memory system.
- Lots of software.
System-on-chip applications
- Sophisticated markets:
  - High volume.
  - Demanding performance and power requirements.
  - Strict price restrictions.
- Often standards-driven.
- Examples:
  - Communications.
  - Multimedia.
  - Networking.
Terminology
- PE: processing element.
- Interconnection network: may require more than one clock cycle to transfer data.
- Message: address + data packet.
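As a concrete illustration of the last item, a message can be modeled as a small C struct pairing an address with a payload. This layout is a hypothetical sketch; the field names and widths are assumptions, not a format from the text.

    #include <stdint.h>

    /* Hypothetical message layout: destination address plus data payload.
       Field widths are illustrative only. */
    struct message {
        uint32_t addr;     /* address in the recipient's space */
        uint32_t data[4];  /* data or parameters */
    };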
Generic multiprocessor
- Shared memory: PEs and memory blocks connected through an interconnect network.
- Message passing: PE/memory nodes exchange packets through an interconnect network.
[Figure: both organizations shown as PEs and memory blocks attached to an interconnect network.]
Shared memory vs. message passing
- Shared memory and message passing are functionally equivalent.
- Different programming models:
  - Shared memory is more like a uniprocessor.
  - Message passing is good for streaming.
- May have different implementation costs:
  - Interconnection network.
Shared memory implementation
- Memory blocks are in the address space.
- The memory interface sends messages through the network to the addressed memory block.
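A minimal sketch of what this looks like to software, assuming a shared block mapped at a hypothetical base address: an ordinary store suffices, and the hardware turns it into a network message.

    #include <stdint.h>

    /* Hypothetical base address of a shared memory block; in practice it
       comes from the platform's memory map. */
    #define SHARED_BASE 0x20000000u

    volatile uint32_t * const shared = (volatile uint32_t *)SHARED_BASE;

    void post_result(uint32_t value)
    {
        /* Plain store: the memory interface routes it through the network
           to the addressed memory block. */
        shared[0] = value;
    }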
Message passing implementation
- The program provides the processor address and data/parameters.
  - Usually through an API.
- The packet interface appears as an I/O device.
  - Packets are routed through the network to the interface.
- The recipient must decode the parameters to determine how to handle the message.
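A sketch of the API view, with hypothetical msg_send/msg_recv calls standing in for whatever primitives the platform actually provides over its packet interface:

    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical primitives; a real platform supplies equivalents over
       its packet interface (an I/O device). */
    int msg_send(int dest_pe, const void *buf, size_t len);
    int msg_recv(int *src_pe, void *buf, size_t maxlen);

    void request_render(void)
    {
        uint32_t params[2] = { 1 /* opcode */, 42 /* page */ };
        msg_send(3 /* destination PE */, params, sizeof(params));
    }

    void serve(void)
    {
        int src;
        uint32_t params[2];
        msg_recv(&src, params, sizeof(params));
        /* The recipient decodes params[0], params[1], ... to decide how
           to handle the message. */
    }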
Processing element selection
- What tasks run on what PEs?
  - Some tasks may be duplicated (e.g., HDTV motion estimation).
  - Some processors may run different tasks.
- How does the load change?
  - Static vs. dynamic task allocation.
Matching PEs to tasks
- Factors:
  - Word size.
  - Operand types.
  - Performance.
  - Energy/power consumption.
- Hardwired function units:
  - Performance.
  - Interface.
Task allocation
- Tasks may be created at:
  - Design time (video encoder).
  - Run time (user interface).
- Tasks may be assigned to processing elements at:
  - Design time (predictable load).
  - Run time (varying load).
Memory system design
- Uniform vs. heterogeneous memory system.
  - Power consumption.
  - Cost.
  - Programming difficulty.
- Caches:
  - Memory consistency.
Parallel memory systems
- True concurrency: several memory blocks can operate simultaneously.
[Figure: multiple PEs and memory blocks attached to an interconnect network.]
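One common way to exploit several memory blocks is to interleave addresses across banks so that consecutive accesses go to different blocks and can proceed in parallel. A minimal sketch, with an assumed bank count and names of my own choosing:

    #include <stdint.h>

    #define NBANKS 4u  /* assumed number of parallel memory banks */

    /* Word-interleaved mapping: consecutive words land in different banks,
       so sequential accesses can proceed concurrently. */
    static inline uint32_t bank_of(uint32_t word_addr)   { return word_addr % NBANKS; }
    static inline uint32_t offset_in(uint32_t word_addr) { return word_addr / NBANKS; }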
Cache consistency
- Problem: caches hide memory updates.
- Solution: have caches snoop changes.
[Figure: a PE's cache snooping traffic between another PE and memory on the network.]
Cache consistency and tasks
- Traditional scientific computing maps a single task onto multiple PEs.
- Embedded computing maps different tasks onto multiple PEs.
  - May be producer/consumer.
  - Not all of the memory may need to be consistent.
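To see why consistency matters for producer/consumer tasks, consider this sketch, assuming no snooping hardware; the variable names are illustrative, and the fix (snooping, or an explicit flush/invalidate of a non-coherent region) depends on the platform.

    #include <stdint.h>

    /* Shared between a producer PE and a consumer PE. */
    volatile uint32_t buffer;  /* data written by the producer */
    volatile int      ready;   /* set when buffer is valid */

    void producer(void)
    {
        buffer = 0xdeadbeefu;
        ready  = 1;   /* without snooping (or an explicit flush), the
                         consumer's cached copies may stay stale */
    }

    uint32_t consumer(void)
    {
        while (!ready)
            ;              /* spin on the flag */
        return buffer;     /* stale unless the caches are kept consistent */
    }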
Network topologies
- Major choices:
  - Bus.
  - Crossbar.
  - Buffered crossbar.
  - Mesh.
  - Application-specific.
Bus network
- Advantages:
  - Well-understood.
  - Easy to program.
  - Many standards.
- Disadvantages:
  - Contention.
  - Significant capacitive load.
Crossbar
- Advantages:
  - No contention.
  - Simple design.
- Disadvantages:
  - Not feasible for large numbers of ports.
Buffered crossbar
- Advantages:
  - Smaller than a crossbar.
  - Can achieve high utilization.
- Disadvantages:
  - Requires scheduling.
[Figure: crossbar (Xbar) with buffers on its inputs.]
Mesh
- Advantages:
  - Well-understood.
  - Regular architecture.
- Disadvantages:
  - Poor utilization.
Application-specific networks
- Advantages:
  - Higher utilization.
  - Lower power.
- Disadvantages:
  - Must be designed.
  - Must carefully allocate data.
TI OMAP
- Targets communications and multimedia.
- Multiprocessor with a DSP and a RISC CPU.
[Figure: OMAP 5910 block diagram: C55x DSP and ARM9 with MMU, memory controller, MPU interface, system DMA, control bridge, I/O.]
RTOS for multiprocessors
- Issues:
  - Multiprocessor communication primitives.
  - Scheduling policies.
- Task scheduling is considerably harder with true concurrency.
Distributed system performance
- Longest-path algorithms don’t work under preemption.
- Several algorithms unroll the schedule to the length of the least common multiple of the periods (the hyperperiod; see the sketch below):
  - produces a very long schedule;
  - doesn’t work for non-fixed periods.
- Schedules based on upper bounds may give inaccurate results.
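The hyperperiod itself is easy to compute; a minimal C sketch (the function names are mine) shows why the unrolled schedule can get very long. For example, periods of 70 and 110 time units give a hyperperiod of 770.

    #include <stdint.h>

    static uint64_t gcd(uint64_t a, uint64_t b)
    {
        while (b != 0) {
            uint64_t t = b;
            b = a % b;
            a = t;
        }
        return a;
    }

    /* Least common multiple of the task periods: the interval an unrolled
       schedule must cover. */
    uint64_t hyperperiod(const uint64_t *period, int n)
    {
        uint64_t l = 1;
        for (int i = 0; i < n; i++)
            l = (l / gcd(l, period[i])) * period[i];
        return l;
    }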
Data dependencies help
- P3 cannot preempt both P1 and P2.
- P1 cannot preempt P2.
[Figure: task graph with P1 feeding P2; P3 in a separate task.]
Preemptive execution hurts
- Worst combination of events for P5’s response time:
  - P2 has higher priority;
  - P2 is initiated before P4;
  - causes P5 to wait for P2 and P3.
- Independent tasks can interfere, so longest-path algorithms can’t be used.
[Figure: task graph with processes P1 through P5 and messages M1 through M3.]
Period shifting example
- P2 is delayed on CPU 1; the data dependency then delays P3; priority delays P4.
- Worst-case delay for τ3 is 80, not 50.

  task   period
  τ2     70

  process   CPU time
  P1        30
  P2        10
  P3        30
  P4        20

[Figure: schedule of P1, P2 on CPU 1 and P3, P4 on CPU 2 across iterations of τ1, τ2, τ3.]