Distributed Operating Systems CS551 Colorado State University at Lockheed-Martin Lecture 4 -- Spring 2001

CS551: Lecture 4 (21 February 2001)
- Topics
  - Memory Management
    - Simple
    - Shared
    - Distributed
    - Migration
  - Concurrency Control
    - Mutex and Critical Regions
    - Semaphores
    - Monitors

Centralized Memory Management
- Review
  - Memory: cache, RAM, auxiliary
  - Virtual Memory
  - Pages and Segments
    - Internal/External Fragmentation
  - Page Replacement Algorithms
    - Page Faults => Thrashing
    - FIFO, NRU, LRU
    - Second Chance; Lazy (Dirty Pages), sketched below
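A minimal sketch of the second-chance (clock) replacement policy named in the review above, assuming a simple frame table with simulated reference bits; the names and sizes are illustrative, not the course's reference implementation.

/* Second-chance (clock) victim selection: a frame whose reference bit is
 * set gets the bit cleared and is skipped once; the first frame found with
 * a clear bit is evicted. */
#include <stdbool.h>
#include <stddef.h>

#define NFRAMES 64

struct frame {
    int  page;          /* which virtual page occupies this frame (set by the
                           fault handler, not shown) */
    bool referenced;    /* simulated hardware reference bit */
};

static struct frame frames[NFRAMES];
static size_t hand;     /* clock hand: next frame to inspect */

/* Called on a page fault when no frame is free. */
static size_t choose_victim(void) {
    for (;;) {
        size_t i = hand;
        hand = (hand + 1) % NFRAMES;      /* advance the clock hand */
        if (!frames[i].referenced)
            return i;                     /* not recently used: evict it */
        frames[i].referenced = false;     /* clear bit: give a second chance */
    }
}

A dirty victim would additionally be written back before reuse (the "lazy" handling of dirty pages on the slide).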

Figure 4.1 Fragmentation in Page-Based Memory versus Segment-Based Memory. (Galli, p. 83)

Figure 4.2 Algorithms for Choosing Segment Location. (Galli, p. 84)

Simple Memory Model
- Used in parallel UMA systems
  - Access times are equal for all processors
  - Too many processors
    - => thrashing
    - => need for large amounts of memory
- High-performance parallel computers
  - May not use a cache, to avoid overhead
  - May not use virtual memory

Shared Memory Model
- Shared memory can be a means of interprocess communication
- Virtual memory with multiple physical memories, caches, and secondary storage
- Easy to partition data for parallel processing
- Easy migration for load balancing
- Example systems:
  - Amoeba: shared segments on the same system
  - Unix System V: sys/shm.h (see the sketch after this slide)
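The System V interface cited above (sys/shm.h) is real; the following is a minimal, lightly error-checked sketch of one way to use it, with a parent and child sharing a single integer.

/* System V shared memory: create a segment, attach it, share it across
 * fork(), then detach and remove it. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    /* Create a private segment large enough for one int. */
    int shmid = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (shmid == -1) { perror("shmget"); exit(1); }

    /* Attach it into this address space; the child inherits the mapping. */
    int *shared = shmat(shmid, NULL, 0);
    if (shared == (void *)-1) { perror("shmat"); exit(1); }
    *shared = 0;

    if (fork() == 0) {               /* child writes through the mapping */
        *shared = 42;
        shmdt(shared);
        _exit(0);
    }
    wait(NULL);                      /* parent then sees the child's write */
    printf("value written by child: %d\n", *shared);

    shmdt(shared);
    shmctl(shmid, IPC_RMID, NULL);   /* mark the segment for removal */
    return 0;
}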

Shared Memory via Bus [figure: processors P1 through P10 attached to a single bus with one shared memory]

Shared Memory Disadvantages
- All processors read/write a common memory
  - Requires concurrency control
- Processors may be linked by a bus
  - Too much memory activity may cause bus contention
  - The bus can become a bottleneck
- Each processor may have its own cache
  - => cache coherency (consistency) problems
  - A snoopy (snooping) cache is one solution

Bused Shared Memory with Caches [figure: processors P1 through P10, each with a local cache, attached to a bus with one shared memory]

Shared Memory Performance
- Try to overlap communication and computation
- Try to prefetch data from memory
- Try to migrate processes to processors that hold the needed data in local memory
  - Page scanner
- Bused shared memory does not scale well
  - More processors => bus contention
  - Faster processors => bus contention

Figure 4.3 Snoopy Cache. (Galli, p. 89)

Cache Coherency (Consistency)
- We want local caches to hold consistent data
  - If two processor caches contain the same data, the data should have the same value
  - If not, the caches are not coherent
- But what if one or both processors change the data value?
  - Mark the modified cache value as dirty
  - A snoopy cache picks up the new value as it is written to memory

Cache Consistency Protocols
- Write-through protocol
- Write-back protocol
- Write-once protocol
  - Cache blocks are invalid, dirty, or clean
  - Cache ownership
  - All caches snoop
  - The protocol is part of the MMU
  - Performs within a memory cycle

Write-Through Protocol
- Read miss
  - Fetch data from memory into the cache
- Read hit
  - Fetch data from the local cache
- Write miss
  - Update data in memory and store it in the cache
- Write hit
  - Update memory and the cache
  - Other processors invalidate their cache entries (a behavioral sketch follows this slide)
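A minimal sketch of the read/write cases listed above under write-through with invalidation, assuming a single memory line and a flat array of caches; it is illustrative only, not a model of any particular hardware or of Galli's figures.

/* Write-through with invalidation for one memory line: every write updates
 * memory immediately, and the snooping caches drop their copies. */
#include <stdbool.h>
#include <stdio.h>

#define NCACHES 4

static int  memory_line = 0;          /* backing memory */
static bool valid[NCACHES];           /* per-cache: copy present? */
static int  cached[NCACHES];          /* per-cache: copy value */

static int cache_read(int c) {
    if (!valid[c]) {                  /* read miss: fetch from memory */
        cached[c] = memory_line;
        valid[c]  = true;
    }
    return cached[c];                 /* read hit: use the local copy */
}

static void cache_write(int c, int value) {
    cached[c]   = value;              /* update the local cache */
    valid[c]    = true;
    memory_line = value;              /* write through to memory */
    for (int i = 0; i < NCACHES; i++) /* snooping caches invalidate */
        if (i != c) valid[i] = false;
}

int main(void) {
    cache_write(0, 7);                /* P0 writes 7 */
    printf("P1 reads %d (fetched from memory after invalidation)\n",
           cache_read(1));
    return 0;
}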

Distributed Shared Memory
- NUMA
  - Global address space
  - All memories together form one global memory
  - True multiprocessors
  - Maintains a directory service
- NORMA (no remote memory access)
  - Specialized message-passing network
  - Example: workstations on a LAN

Distributed Shared Memory [figure: processors P1 through P11 partitioned across nodes, each node holding its own local memory, connected to form one shared memory]

How to Distribute Shared Data?
- How many readers and writers are allowed for a given set of data?
- Two approaches
  - Replication
    - Data is copied to the different processors that need it
  - Migration
    - Data is moved to the different processors that need it

Single Reader / Single Writer
- No concurrent use of shared data
- Data use may be a bottleneck
- Static

Multiple Reader / Single Writer
- Readers may hold an invalid copy after the writer writes a new value
- The protocol must include an invalidation method
- Copy set: the list of processors that hold a copy of a memory location
- Implementation: centralized, distributed, or a combination

Centralized MR/SW
- One server (sketched after this slide)
  - Processes all requests
  - Maintains all data and data locations
- Increases traffic near the server
  - Potential bottleneck
- The server must perform more work than other nodes
  - Potential bottleneck
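A minimal sketch of the centralized server idea, assuming one value plus a copy set per data item and a printf standing in for an invalidation message; the helper names (serve_read, serve_write) are hypothetical, not Galli's.

/* Centralized multiple-reader/single-writer DSM server: reads join the
 * copy set, a write invalidates every other cached copy first. */
#include <stdbool.h>
#include <stdio.h>

#define NPROCS 8

struct item {
    int  value;
    bool copy_set[NPROCS];    /* which processors hold a read copy */
};

/* Read request: record the reader in the copy set and return the value. */
static int serve_read(struct item *it, int reader) {
    it->copy_set[reader] = true;
    return it->value;
}

/* Write request: invalidate cached copies, then install the new value. */
static void serve_write(struct item *it, int writer, int value) {
    for (int p = 0; p < NPROCS; p++) {
        if (it->copy_set[p] && p != writer) {
            printf("send INVALIDATE to P%d\n", p);  /* stand-in for a message */
            it->copy_set[p] = false;
        }
    }
    it->value = value;
    it->copy_set[writer] = true;
}

int main(void) {
    struct item x = { .value = 0 };
    serve_read(&x, 1);
    serve_read(&x, 2);
    serve_write(&x, 3, 99);           /* invalidates P1 and P2 */
    printf("P1 re-reads %d\n", serve_read(&x, 1));
    return 0;
}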

Figure 4.4 Centralized Server for Multiple Reader/Single Writer DSM. (Galli, p. 92)

Partially Distributed Centralization of MR/SW
- Distribution of data is static
- One server receives all requests
- Each request is forwarded to the processor holding the desired data, which
  - Handles the request
  - Notifies readers of invalid data

Figure 4.5 Partially Distributed Invalidation for Multiple Reader/Single Writer DSM. (Galli, p. 92) [Read X in the figure as C]

Dynamic Distributed MR/SW
- Data may move to a different processor
- A broadcast message is sent for every request in order to reach the current owner of the data
- Increases the number of messages in the system
  - More overhead
  - More work for the entire system

Figure 4.6 Dynamic Distributed Multiple Reader/Single Writer DSM. (Galli, p. 93)

A Static Distributed Method
- Data is distributed statically
- The data owner
  - Handles all requests for its data
  - Notifies readers when their copy of the data becomes invalid
- All processors know where all data is located, since it is statically placed

Figure 4.7 Dynamic Data Allocation for Multiple Reader/Single Writer DSM. (Galli, p. 96)

Multiple Readers / Multiple Writers
- Complex algorithms
- Use sequencers
  - Time read
  - Time written
- May be centralized or distributed

DSM Performance Issues
- Thrashing (in a DSM): "when multiple locations desire to modify a common data set" (Galli)
- False sharing: two or more processors fight to write to the same page, but not to the same data (demonstrated after this slide)
- One solution: temporarily freeze a page so one processor can get some work done on it
- Another: a proper block size (= page size?)
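A minimal sketch of false sharing at cache-line granularity, assuming pthreads; a page-based DSM suffers the same effect with pages in place of lines. The structure layout, counter names, and iteration count are illustrative.

/* Two threads update different counters that happen to share one cache
 * line (and certainly one DSM page), so every write invalidates the other
 * processor's copy even though no datum is actually shared. */
#include <pthread.h>
#include <stdio.h>

#define ITERS 10000000L

struct {
    volatile long a;     /* written only by thread A */
    volatile long b;     /* written only by thread B */
    /* Fix: insert e.g. "char pad[64];" between a and b so each counter
     * occupies its own block, removing the contention. */
} counters;

static void *bump_a(void *arg) {
    (void)arg;
    for (long i = 0; i < ITERS; i++) counters.a++;
    return NULL;
}

static void *bump_b(void *arg) {
    (void)arg;
    for (long i = 0; i < ITERS; i++) counters.b++;
    return NULL;
}

int main(void) {
    pthread_t ta, tb;
    pthread_create(&ta, NULL, bump_a, NULL);
    pthread_create(&tb, NULL, bump_b, NULL);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    printf("a=%ld b=%ld (correct, but slowed by false sharing)\n",
           counters.a, counters.b);
    return 0;
}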

More DSM Performance Issues
- Data location (compiler support?)
- Data access patterns
- Synchronization
- Real-time systems issue?
- Implementation:
  - Hardware?
  - Software?

Mach Operating System
- Uses virtual memory and distributed shared memory
- The Mach kernel supports memory objects
  - "a contiguous repository of data, indexed by byte, upon which various operations, such as read and write, can be performed. Memory objects act as a secondary storage .... Mach allows several primitives to map a virtual memory object into an address space of a task. ... In Mach, every task has a separate address space." (Singhal & Shivaratri, 1994)

Memory Migration
- Time-consuming
  - Moving virtual memory from one processor to another
- When?
- How much?

MM: Stop and Copy
- Least efficient method
- Simple
- Halt process execution (freeze time) while moving the entire process address space and data to the new location
- Unacceptable for real-time and interactive systems

Figure 4.8 Stop-and-Copy Memory Migration. (Galli, p. 99)

Concurrent Copy
- The process continues execution while it is being copied to the new location
- Some already-migrated pages may become dirty
  - Send over more recent versions of those pages
- At some point, stop execution and migrate the remaining data
  - Algorithms use a dirty-page ratio and/or a time criterion to decide when to stop (sketched after this slide)
- Wastes time and space
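A minimal sketch of this iterative copy loop with a dirty-ratio stopping rule. It is a self-contained simulation under assumptions of my own (send_page, freeze_process, resume_on_target are empty stand-ins, and collect_dirty fakes the dirty bits); it illustrates the idea above, not Galli's exact algorithm.

/* Concurrent-copy migration: keep re-sending pages dirtied during the
 * previous round until the dirty fraction or a round limit says to stop,
 * then freeze the process and copy what is left. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define NPAGES       256
#define MAX_ROUNDS   8
#define DIRTY_CUTOFF 0.05                 /* stop pre-copying below 5% dirty */

static bool dirty[NPAGES];

/* Stand-ins for the migration subsystem (purely illustrative). */
static void send_page(size_t p)    { (void)p; /* would ship page p */ }
static void freeze_process(void)   { puts("process frozen"); }
static void resume_on_target(void) { puts("process resumed on target"); }

/* Simulate the running process dirtying a shrinking set of pages each round. */
static size_t collect_dirty(int round) {
    size_t n = 0;
    for (size_t p = 0; p < NPAGES; p++) {
        dirty[p] = (rand() % (4 << round)) == 0;
        n += dirty[p];
    }
    return n;
}

int main(void) {
    for (size_t p = 0; p < NPAGES; p++)   /* round 0: copy everything */
        send_page(p);                     /* process keeps running meanwhile */

    size_t ndirty = collect_dirty(0);
    int round = 1;
    while (round < MAX_ROUNDS && (double)ndirty / NPAGES >= DIRTY_CUTOFF) {
        for (size_t p = 0; p < NPAGES; p++)
            if (dirty[p]) send_page(p);   /* re-send pages dirtied last round */
        ndirty = collect_dirty(round++);
    }

    freeze_process();                     /* short stop-and-copy tail */
    for (size_t p = 0; p < NPAGES; p++)
        if (dirty[p]) send_page(p);       /* copy the final dirty set */
    resume_on_target();
    printf("finished after %d pre-copy rounds, %zu pages in the final copy\n",
           round, ndirty);
    return 0;
}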

Figure 4.9 Concurrent-Copy Memory Migration. (Galli, p. 99)

Copy on Reference
- The process stops
- All process state information is moved
- The process resumes at the new location
- Other process pages are moved only when the process accesses them
- An alternative transfers the virtual memory pages to a file server first, then moves them as needed to the new process location

Figure 4.10 Copy-on-Reference Memory Migration. (Galli, p. 100)

Table 4.1 Memory Management Choices Available for Advanced Systems. (Galli, p. 101)

Table 4.2 Performance Choices for Memory Management. (Galli, p. 101)

Concurrency Control (Chapter 5)
- Topics
  - Mutual Exclusion and Critical Regions
  - Semaphores
  - Monitors
  - Locks
  - Software Lock Control
  - Token-Passing Mutual Exclusion
  - Deadlocks

Critical Region
- "the portion of code or program accessing a shared resource"
- Concurrent execution by more than one process at a time must be prevented
- Mutex: mutual exclusion (a mutex example follows this slide)
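A minimal sketch of protecting a critical region with a POSIX mutex; the shared counter, thread count, and loop bound are illustrative choices, not part of the slides.

/* The increment of the shared counter is the critical region; the mutex
 * guarantees only one thread executes it at a time. */
#include <pthread.h>
#include <stdio.h>

static long counter = 0;                          /* shared resource */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&lock);                /* enter critical region */
        counter++;
        pthread_mutex_unlock(&lock);              /* leave critical region */
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %ld (expected 2000000)\n", counter);
    return 0;
}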

Figure 5.1 Critical Regions Protecting a Shared Variable. (Galli, p. 106)

Mutual Exclusion
- Three-point test (Galli)
  - The solution must ensure that two processes do not enter their critical regions at the same time
  - The solution must prevent interference from processes not attempting to enter their critical regions
  - The solution must prevent starvation

Critical Section Solutions
- Recall Silberschatz & Galvin: a solution to the critical-section problem must show that
  - mutual exclusion is preserved
  - the progress requirement is satisfied
  - the bounded-waiting requirement is met
Peterson's two-process algorithm, sketched after this slide, is the classic software solution satisfying all three.
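A minimal sketch of Peterson's classic two-process algorithm (standard textbook material, not taken from these slides), using C11 sequentially consistent atomics in place of the textbook's idealized shared variables.

/* Peterson's algorithm for two threads: a thread announces interest, yields
 * the turn, and waits only while the other is interested and it is still
 * this thread's turn to wait. */
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

static atomic_bool flag[2];     /* flag[i]: thread i wants to enter */
static atomic_int  turn;        /* which thread must wait on a tie */
static long counter;            /* the shared resource */

static void enter_region(int self) {
    int other = 1 - self;
    atomic_store(&flag[self], true);
    atomic_store(&turn, self);                         /* yield priority */
    while (atomic_load(&flag[other]) && atomic_load(&turn) == self)
        ;                                              /* busy-wait */
}

static void leave_region(int self) {
    atomic_store(&flag[self], false);
}

static void *worker(void *arg) {
    int self = (int)(long)arg;
    for (int i = 0; i < 100000; i++) {
        enter_region(self);
        counter++;                                     /* critical region */
        leave_region(self);
    }
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    pthread_create(&t0, NULL, worker, (void *)0L);
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("counter = %ld (expected 200000)\n", counter);
    return 0;
}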

Figure 5.2 Example Utilizing Semaphores. (Galli, p. 109)
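The figure itself is not reproduced here. As a stand-in, a minimal sketch of counting semaphores via POSIX sem_t: a bounded buffer with one producer and one consumer, where the semaphores count free and filled slots (the slot count and loop bounds are arbitrary choices of mine).

/* sem_wait() is the classic P operation, sem_post() the V operation. */
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define SLOTS 4

static sem_t empty_slots;   /* counts free buffer slots   */
static sem_t full_slots;    /* counts filled buffer slots */
static int   buffer[SLOTS];
static int   in, out;

static void *producer(void *arg) {
    (void)arg;
    for (int i = 1; i <= 10; i++) {
        sem_wait(&empty_slots);          /* P(): block if the buffer is full */
        buffer[in] = i;
        in = (in + 1) % SLOTS;
        sem_post(&full_slots);           /* V(): signal an item is ready */
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < 10; i++) {
        sem_wait(&full_slots);           /* P(): block if the buffer is empty */
        int item = buffer[out];
        out = (out + 1) % SLOTS;
        sem_post(&empty_slots);
        printf("consumed %d\n", item);
    }
    return NULL;
}

int main(void) {
    sem_init(&empty_slots, 0, SLOTS);
    sem_init(&full_slots, 0, 0);
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}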

Figure 5.3 Atomic Swap. (Galli, p. 114)
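The atomic-swap figure is likewise not reproduced; the following is a minimal sketch of the usual way an atomic swap (exchange) builds a spinlock, written with C11 atomics. It illustrates the technique, not the figure's exact construction.

/* A thread repeatedly swaps 1 into the lock word and has acquired the lock
 * exactly when the old value it gets back is 0. */
#include <stdatomic.h>

typedef struct {
    atomic_int locked;          /* 0 = free, 1 = held */
} spinlock_t;

static void spin_lock(spinlock_t *l) {
    /* atomic_exchange swaps in 1 and returns the previous value. */
    while (atomic_exchange(&l->locked, 1) == 1)
        ;                       /* old value was 1: someone else holds it */
}

static void spin_unlock(spinlock_t *l) {
    atomic_store(&l->locked, 0);
}

/* Usage:
 *   static spinlock_t lock;    // zero-initialized => free
 *   spin_lock(&lock);
 *   ... critical region ...
 *   spin_unlock(&lock);
 */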

Figure 5.4 Centralized Lock Manager. (Galli, p. 116)

Figure 5.5 Resource Allocation Graph. (Galli, p. 120)

Table 5.1 Summary of Support for Concurrency by Hardware, System, and Languages. (Galli, p. 124)