COP 5611 Operating Systems Spring 2010

Presentation transcript:

COP 5611 Operating Systems Spring 2010 Dan C. Marinescu Office: HEC 439 B Office hours: M-Wd 2:00-3:00 PM

Lecture 8
Last time: Thread coordination and scheduling
Today: Multi-level memories; the I/O bottleneck
Next time: Chapter 8. Network as a System and as a System Component

Multi-level memories
In the following hierarchy the amount of storage and the access time increase together: CPU registers, L1 cache, L2 cache, main memory, magnetic disk, mass storage systems, remote storage.
Memory management schemes determine where the data is placed in this hierarchy:
  Manual: left to the user.
  Automatic: based on memory virtualization; more effective and easier to use.

Forms of memory virtualization
Memory-mapped files: mmap in UNIX.
Copy on write: when several threads use the same data, map the page holding the data and store the data only once in memory. This works as long as all the threads only READ the data. If one of the threads carries out a WRITE, the virtual memory handling generates an exception and the data pages are remapped so that each thread gets its own copy of the page.
On-demand zero-filled pages: instead of allocating zero-filled pages in RAM or on the disk, the VM manager maps these pages without READ or WRITE permissions. When a thread attempts to READ or WRITE such a page, an exception is generated and the VM manager allocates the page dynamically.
Virtual shared memory: several threads on multiple systems share the same address space. When a thread references a page that is not in its local memory, the local VM manager fetches the page over the network and the remote VM manager unmaps the page.
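As a small illustration of memory-mapped files, the sketch below uses Python's standard mmap module to modify a file through ordinary memory writes. The function name patch_through_mapping and the temporary file are assumptions made for the example; they are not part of the lecture.

```python
import mmap
import tempfile


def patch_through_mapping(payload: bytes, patch: bytes) -> bytes:
    """Overwrite the start of a file via a memory mapping, then read the file back."""
    with tempfile.TemporaryFile() as f:
        f.write(payload)
        f.flush()
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as mapping:
            mapping[0:len(patch)] = patch  # an ordinary memory WRITE
            mapping.flush()                # push the dirty pages back to the file
        f.seek(0)
        return f.read()                    # a regular file READ sees the change


if __name__ == "__main__":
    print(patch_through_mapping(b"hello, disk", b"HELLO"))
```

The patch must not be longer than the file, because a mapping cannot grow the file by assignment.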

Multi-level memory management and virtual memory
Two-level memory system: RAM + disk.
Each page of an address space has an image on the disk.
The RAM consists of blocks.
READ and WRITE from RAM: controlled by the VM manager.
GET and PUT from disk: controlled by a multi-level memory manager.
Old design philosophy: integrate the two to reduce the instruction count.
New approach: modular organization.
  Implement the VM manager (VMM) in hardware. It translates virtual addresses into physical addresses.
  Implement the multi-level memory manager (MLMM) in the kernel, in software. It transfers pages back and forth between RAM and the disk.

The modular design
The VM attempts to translate the virtual memory address to a physical memory address.
If the page is not in main memory, the VM generates a page-fault exception.
The exception handler uses a SEND to send the page number to an MLMM port.
The SEND invokes ADVANCE, which wakes up a thread of the MLMM.
The MLMM invokes AWAIT on behalf of the thread interrupted by the page fault.
The AWAIT releases the processor to the SCHEDULER thread.

Name resolution in multi-level memories
We consider pairs of layers: the upper level of the pair is the primary, the lower level of the pair is the secondary.
The top level is managed by the application, which generates LOAD and STORE instructions that move data between CPU registers and named memory locations.
The processor issues READs/WRITEs to named memory locations. The name goes to the primary memory device located on the same chip as the processor, which searches the name space of the on-chip cache (L1 cache); here the L1 cache is the primary device and the L2 cache is the secondary device.
If the name is not found in the L1 cache name space, the Multi-Level Memory Manager (MLMM) looks at the L2 cache (off-chip cache), which becomes the primary device with the main memory as secondary.
If the name is not found in the L2 cache name space, the MLMM looks at the main memory name space. Now the main memory is the primary device.
If the name is not found in the main memory name space, the Virtual Memory Manager is invoked.

The performance of a two-level memory
The latency: LP << LS.
LP: latency of the primary device, e.g., 10 nsec for RAM.
LS: latency of the secondary device, e.g., 10 msec for disk.
Hit ratio h: the probability that a reference will be satisfied by the primary device.
Average latency (AS): AS = h x LP + (1 - h) x LS.
Example: LP = 10 nsec (primary device is main memory), LS = 10 msec = 10,000,000 nsec (secondary device is the disk).
  h = 0.90:   AS = 0.90 x 10 + 0.10 x 10,000,000 = 1,000,009 nsec ~ 1,000 microseconds = 1 msec
  h = 0.99:   AS = 0.99 x 10 + 0.01 x 10,000,000 = 100,009.9 nsec ~ 100 microseconds = 0.1 msec
  h = 0.999:  AS = 0.999 x 10 + 0.001 x 10,000,000 = 10,009.99 nsec ~ 10 microseconds = 0.01 msec
  h = 0.9999: AS = 0.9999 x 10 + 0.0001 x 10,000,000 = 1,009.999 nsec ~ 1 microsecond
This considerable slowdown is due to the very large discrepancy (six orders of magnitude) between the latencies of the primary and the secondary device.
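The average-latency formula can be checked with a few lines of Python. This is a minimal sketch; the function name average_latency_ns and its default arguments (10 nsec and 10 msec, taken from the example) are choices made here for illustration.

```python
def average_latency_ns(hit_ratio, primary_ns=10.0, secondary_ns=10_000_000.0):
    """AS = h x LP + (1 - h) x LS, with all latencies in nanoseconds."""
    return hit_ratio * primary_ns + (1.0 - hit_ratio) * secondary_ns


if __name__ == "__main__":
    for h in (0.90, 0.99, 0.999, 0.9999):
        ns = average_latency_ns(h)
        print(f"h = {h}: {ns:,.3f} nsec = {ns / 1000:,.3f} microseconds")
```

Even at h = 0.9999 the average is about 100 times the latency of the primary device, because the miss term (1 - h) x LS dominates.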

The performance of a two-level memory (cont’d)
Statement: if each reference occurs with equal frequency to a cell in the primary and in the secondary device, then the combined memory will operate at the speed of the secondary device.
The size: SizeP << SizeS, with SizeS = K x SizeP, K large (1/K small).
  SizeP: number of cells of the primary device.
  SizeS: number of cells of the secondary device.
With references spread uniformly over the cells, the hit ratio is on the order of 1/K, so nearly every reference goes to the secondary device.

Locality of reference
Concentration of references:
  Spatial locality of reference.
  Temporal locality of reference.
Reasons for locality of reference:
  Programs consist of sets of sequential instructions interrupted by branches.
  Data structures group together related data elements.
Working set: the collection of references made by an application in a given time window. If the working set is larger than the number of cells of the primary device, performance degrades significantly.

Memory management elements at each level
The string of references directed at that level.
The capacity at that level.
The bring-in policies:
  On demand: bring the cell to the primary device from the secondary device when it is needed, e.g., demand paging.
  Anticipatory, e.g., pre-paging.
The replacement policies:
  FIFO: First In First Out.
  OPTIMAL: what a clairvoyant multi-level memory manager would do. Alternatively, construct the string of references and use it for a second execution of the program (with the same data as input).
  LRU: Least Recently Used; replace the page that has not been referenced for the longest time.
  MRU: Most Recently Used; replace the page that was referenced most recently.

Page replacement policies; Belady’s anomaly
In the following examples we use a given string of references to illustrate several page replacement policies.
We consider a primary device (main memory) with a capacity of three or four blocks and a secondary device (the disk) where a replica of all pages resides.
Once a block has the “dirty bit” on, it means that the page residing in that block was modified and must be written back to the secondary device before being replaced.
The capacity of the primary device is important. One expects that increasing the capacity, in our case the number of blocks in RAM, leads to a higher hit ratio. That is not always the case, as our examples will show. This is Belady’s anomaly.
Note: different results are obtained with a different string of references!!
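Belady's anomaly is easy to reproduce with a short FIFO simulation. The sketch below is an illustration, not the lecture's own code: the function name fifo_faults is made up, and the 12-reference string is an assumption (the classic textbook string, which agrees with the fault totals that appear in these slides).

```python
from collections import deque


def fifo_faults(references, capacity):
    """Count page faults for FIFO replacement with `capacity` blocks."""
    resident = set()   # pages currently in primary storage
    arrival = deque()  # same pages, oldest arrival on the left
    faults = 0
    for page in references:
        if page in resident:
            continue                      # hit: FIFO order is not updated
        faults += 1
        if len(resident) == capacity:
            resident.remove(arrival.popleft())  # evict the oldest arrival
        resident.add(page)
        arrival.append(page)
    return faults


# Assumed reference string (12 references to pages 0..4).
REFS = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]

if __name__ == "__main__":
    for blocks in (3, 4):
        print(blocks, "blocks:", fifo_faults(REFS, blocks), "page faults")
```

With this string the 4-block memory incurs more faults than the 3-block memory, which is exactly the anomaly.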

FIFO page replacement algorithm
PS: primary storage.
[Tables: contents of each PS block, page OUT and page IN at time intervals 1 to 12, for a 3-block and a 4-block primary storage. The reference string and most cell values did not survive extraction. Total number of page faults shown for the 4-block case: 10.]

OPTIMAL page replacement algorithm
[Tables: contents of each PS block, page OUT and page IN at time intervals 1 to 12, for a 3-block and a 4-block primary storage. The reference string and most cell values did not survive extraction. Total number of page faults shown for the 4-block case: 6.]
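The clairvoyant policy can be sketched by looking ahead in the reference string and evicting the page whose next use lies farthest in the future. The function name optimal_faults and the reference string are assumptions made for this example (the string is the classic 12-reference textbook string).

```python
def optimal_faults(references, capacity):
    """Count page faults for the OPTIMAL (clairvoyant) replacement policy."""
    resident = set()
    faults = 0
    for now, page in enumerate(references):
        if page in resident:
            continue
        faults += 1
        if len(resident) == capacity:

            def next_use(candidate):
                # Index of the next reference to `candidate`, or infinity if none.
                for later in range(now + 1, len(references)):
                    if references[later] == candidate:
                        return later
                return float("inf")

            # Evict the page that will not be needed for the longest time.
            resident.remove(max(resident, key=next_use))
        resident.add(page)
    return faults


# Assumed reference string (12 references to pages 0..4).
REFS = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]

if __name__ == "__main__":
    for blocks in (3, 4):
        print(blocks, "blocks:", optimal_faults(REFS, blocks), "page faults")
```

The look-ahead needs the whole reference string in advance, which is why OPTIMAL serves as a yardstick rather than a practical policy.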

LRU page replacement algorithm
[Tables: contents of each PS block, page OUT and page IN at time intervals 1 to 12, for a 3-block and a 4-block primary storage. The reference string and most cell values did not survive extraction. Last total number of page faults shown: 7.]

LRU, OPTIMAL, MRU
LRU looks only at history.
OPTIMAL “knows” not only the history but also the future.
In some particular cases the Most Recently Used (MRU) algorithm performs better than LRU.
Example: primary device with 4 cells (F = page fault, - = hit).

  Reference string  0 1 2 3 4 0 1 2 3 4 0 1 2 3 4
  LRU               F F F F F F F F F F F F F F F
  MRU               F F F F F - - - F - - - F - -
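The LRU and MRU rows of this example can be reproduced with one routine that keeps the resident pages in recency order. This is a sketch for illustration; the function name trace and its boolean parameter are invented here.

```python
def trace(references, capacity, evict_most_recent):
    """Return the F/- fault pattern for LRU (False) or MRU (True) replacement."""
    recency = []  # resident pages, least recently used first
    marks = []
    for page in references:
        if page in recency:
            recency.remove(page)  # hit: the page is re-appended as most recent below
            marks.append("-")
        else:
            marks.append("F")
            if len(recency) == capacity:
                # MRU evicts the last entry, LRU evicts the first.
                recency.pop(-1 if evict_most_recent else 0)
        recency.append(page)
    return " ".join(marks)


REFS = [0, 1, 2, 3, 4] * 3  # the reference string of the example

if __name__ == "__main__":
    print("LRU", trace(REFS, 4, evict_most_recent=False))
    print("MRU", trace(REFS, 4, evict_most_recent=True))
```

A cyclic scan over five pages with four cells is the worst case for LRU: the page it evicts is always the one needed next.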

The OPTIMAL replacement policy: every page kept in the 3-block primary memory is also kept in the 4-block primary memory.
[Table repeated from the OPTIMAL page replacement slide. Total number of page faults shown for the 4-block case: 6.]

The FIFO replacement policy: the pages kept in the 3-block primary memory are not always kept in the 4-block primary memory.
[Table repeated from the FIFO page replacement slide. Total number of page faults shown for the 4-block case: 10.]

The LRU replacement policy: every page kept in the 3-block primary memory is also kept in the 4-block primary memory.
[Table repeated from the LRU page replacement slide. Last total number of page faults shown: 7.]

How to avoid Belady’s anomaly
The OPTIMAL and the LRU algorithms have the subset property: a primary device with a smaller capacity holds a subset of the pages that a primary device with a larger capacity would hold.
The subset property creates a total ordering. If the primary system has one block and contains page A, then a system with two blocks adds page B, and a system with three blocks adds page C. Thus we have the total ordering (A, B, C).
Replacement algorithms that have the subset property are called “stack” algorithms.
With a stack replacement algorithm, a device with a larger capacity can never have more page faults than one with a smaller capacity:
  m: the pages held by a primary device with smaller capacity.
  n: the pages held by a primary device with larger capacity.
  m is a subset of n.

Simulation analysis of page replacement algorithms
Given a reference string, a single pass suffices to simulate all cases in which the capacity of the primary storage device varies from 1 to n.
At each new reference some page moves to the top of the ordering, and the pages that were above it either move down or stay in the same place, as dictated by the replacement policy.
We record whether this movement corresponds to paging out, that is, movement to the secondary storage.
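For LRU the single-pass simulation reduces to tracking how deep in the stack each referenced page is found: a memory of n blocks holds the top n pages, so a reference found at depth d is a fault for every capacity smaller than d. The sketch below assumes the classic 12-reference string, and the function name lru_faults_all_sizes is invented for the example.

```python
def lru_faults_all_sizes(references, max_size):
    """Return LRU page-fault counts for capacities 1..max_size in one pass."""
    stack = []                     # most recently used page at index 0
    faults = [0] * (max_size + 1)  # faults[n] for capacity n; index 0 unused
    for page in references:
        if page in stack:
            depth = stack.index(page) + 1  # 1-based position in the LRU ordering
            stack.remove(page)
        else:
            depth = max_size + 1           # first reference: a miss at every size
        # A memory of n blocks misses exactly when depth > n.
        for n in range(1, min(depth, max_size + 1)):
            faults[n] += 1
        stack.insert(0, page)              # the referenced page moves to the top
    return faults[1:]


# Assumed reference string (12 references to pages 0..4).
REFS = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]

if __name__ == "__main__":
    for size, count in enumerate(lru_faults_all_sizes(REFS, 5), start=1):
        print("size", size, "->", count, "page faults")
```

Because the counts come from one shared stack, they can never increase with capacity, which is the stack-algorithm guarantee in executable form.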

Simulation of LRU page replacement algorithm
[Table: reference string, stack contents after each reference, and pages in/out at times 1 to 12 for primary storage sizes 1 to 5. Most cell values did not survive extraction. Surviving totals:]

  Size of primary storage       1    2    3    4    5
  Total number of page faults   12   -    10   8    5

Simulation of OPTIMUM
[Table: reference string, stack contents after each reference, and victim pages at times 1 to 12 for primary storage sizes 1 to 5. Most cell values did not survive extraction. Surviving totals:]

  Size of primary storage       1    2    3    4    5
  Total number of page faults   11   10   7    6    5

Clock replacement algorithm
Approximates LRU with minimal additional hardware (one reference bit for each page) and minimal overhead.
The algorithm is activated when a new page must be brought in: move the pointer of a virtual clock in the clockwise direction.
  If the arm points to a block with reference bit TRUE: set it to FALSE and move to the next block.
  If the arm points to a block with reference bit FALSE: the page in that block can be removed (it has not been referenced for a while). Write it back to the secondary storage if the “dirty” bit is on (if the page has been modified).
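A minimal sketch of the clock policy follows. The class name Clock, its fields, and the reference string are assumptions made for the example; the dirty bit and the write-back step are left out to keep the sketch short.

```python
class Clock:
    """Clock (second-chance) page replacement with one reference bit per block."""

    def __init__(self, capacity):
        self.pages = [None] * capacity        # page held by each block
        self.referenced = [False] * capacity  # reference bit of each block
        self.hand = 0                         # position of the clock arm
        self.faults = 0

    def access(self, page):
        """Reference `page`; return True if the reference caused a page fault."""
        if page in self.pages:
            self.referenced[self.pages.index(page)] = True
            return False
        self.faults += 1
        # Sweep: blocks with the bit TRUE get a second chance and lose the bit.
        while self.referenced[self.hand]:
            self.referenced[self.hand] = False
            self.hand = (self.hand + 1) % len(self.pages)
        # The arm now points to a block with the bit FALSE: replace its page.
        self.pages[self.hand] = page
        self.referenced[self.hand] = True
        self.hand = (self.hand + 1) % len(self.pages)
        return True


# Assumed reference string (12 references to pages 0..4).
REFS = [0, 1, 2, 3, 0, 1, 4, 0, 1, 2, 3, 4]

if __name__ == "__main__":
    clock = Clock(3)
    pattern = ["F" if clock.access(p) else "-" for p in REFS]
    print(" ".join(pattern), "->", clock.faults, "page faults")
```

Empty blocks start with the bit FALSE, so they are filled first without any special case.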

The I/O bottleneck
An illustration of the principle of incommensurate scaling: CPU and memory speeds increase at a faster rate than those of mechanical I/O devices, which are limited by the laws of physics.
Example: hard drives.
Average seek time (AST): AST = 8 msec.
Average rotation latency (ARL): rotation speed 7,200 rotations/minute = 120 rotations/second (8.33 msec/rotation), so ARL = 4.17 msec.
A typical 400 Gbyte disk:
  16,383 cylinders, 24 Mbyte/cylinder.
  8 two-sided platters, 16 tracks/cylinder.
  24/16 Mbyte/track = 1.5 Mbyte/track.
The maximum transfer rate of the disk drive: 120 revolutions/sec x 1.5 Mbyte/track = 180 Mbyte/sec.
The bus transfer rates (BTR): ATA3 bus: 3 Gbytes/sec; IDE bus: 66 Mbyte/sec. This is the bottleneck!!
The average time to read a 4 Kbyte block: AST + ARL + 4/180 = 8 + 4.17 + 0.02 = 12.19 msec.
The throughput: 328 Kbytes/sec.
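The arithmetic of this example can be replayed in a few lines. The sketch uses the slide's parameters; the function name disk_figures and the dictionary keys are invented here, and 1 Mbyte is taken as 1,000 Kbyte, as the term 4/180 implies.

```python
def disk_figures(seek_ms=8.0, rpm=7200, track_mbyte=1.5, block_kbyte=4.0):
    """Recompute the disk numbers of the hard-drive example."""
    rotations_per_sec = rpm / 60                      # 120
    rotation_ms = 1000 / rotations_per_sec            # 8.33 msec per rotation
    avg_rotation_latency_ms = rotation_ms / 2         # 4.17 msec
    max_rate_mbyte_s = rotations_per_sec * track_mbyte  # 180 Mbyte/sec
    # Mbyte/sec equals Kbyte/msec when 1 Mbyte = 1,000 Kbyte.
    transfer_ms = block_kbyte / max_rate_mbyte_s      # about 0.02 msec
    block_read_ms = seek_ms + avg_rotation_latency_ms + transfer_ms
    throughput_kbyte_s = block_kbyte / (block_read_ms / 1000)
    return {
        "rotation_ms": rotation_ms,
        "avg_rotation_latency_ms": avg_rotation_latency_ms,
        "max_rate_mbyte_s": max_rate_mbyte_s,
        "block_read_ms": block_read_ms,
        "throughput_kbyte_s": throughput_kbyte_s,
    }


if __name__ == "__main__":
    for name, value in disk_figures().items():
        print(f"{name}: {value:.2f}")
```

Almost all of the block read time is seek and rotation; the transfer itself contributes about 0.02 msec.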

I/O bottleneck
If the application consists of a loop (read a block of data, compute for 1 msec, write back), and if
  the blocks are stored sequentially on the disk, so that we can read a full track at once (speculative execution of the I/O), and
  we have a write-through buffer, so that we can write a full track at once (batching),
then the execution time can be considerably reduced.
The time per iteration: read time + compute time + write time.
Initially: 12.19 + 1 + 12.19 = 25.38 msec.
With speculative reading of an entire track and overlap of reading and writing:
  Reading an entire track of 1.5 Mbyte provides the data for 384 iterations (1.5 Mbyte / 4 Kbyte = 384 blocks).
  The time for 384 iterations:
    Fixed delay: average seek time + 1 rotational delay: 8 + 8.33 msec = 16.33 msec.
    Variable delay: 384 x (compute time + data transfer time) = 384 x (1 + 12.19) = 5,065 msec.
    Total time: 16.33 + 5,065 = 5,081 msec, or about 13.2 msec per iteration instead of 25.38 msec.

Disk writing strategies
Keep in mind that buffering data before writing to the disk has implications: if the system fails, the buffered data is lost.
Strategies:
  Write-through: write to the disk before the write system call returns to the user application.
  User-controlled write-through, by means of a force call.
  At the time the file is closed.
  After a predefined number of write calls, or after a predefined time.
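The user-controlled force call corresponds, on POSIX-style systems, to fsync. The sketch below shows a buffered write next to a forced write using Python's standard os.fsync; the function names write_buffered and write_forced are invented for the example.

```python
import os
import tempfile


def write_buffered(path, data):
    """Leave the timing of the disk write to the library and the kernel."""
    with open(path, "wb") as f:
        f.write(data)  # may sit in buffers until the file is closed or later


def write_forced(path, data):
    """User-controlled write-through: force the data out before returning."""
    with open(path, "wb") as f:
        f.write(data)          # goes to the user-space buffer
        f.flush()              # hand the bytes to the kernel
        os.fsync(f.fileno())   # ask the kernel to write them to the device


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "log.bin")
        write_forced(target, b"committed record")
        with open(target, "rb") as f:
            print(f.read())
```

The forced variant is slower because every call waits for the device; that is the price of not losing the data on a crash.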

Communication among asynchronous sub-systems: polling versus interrupts
Polling: periodically checking the status of an I/O device.
Interrupts: the device delivers data or status information immediately, when it becomes available.
[Figure: Intel Pentium vector table]

Interrupts: used for I/O and for exceptions
CPU interrupt-request line: triggered by the I/O device.
Interrupt handler: receives interrupts.
Masking an interrupt: ignore or delay some interrupts.
Interrupt vector: dispatches the interrupt to the correct handler.
  Based on priority.
  Some interrupts are non-maskable.

Direct Memory Access (DMA)
DMA bypasses the CPU to transfer data directly between an I/O device and memory. It allows subsystems within the computer to access system memory for reading and/or writing independently of the CPU: disk controllers, graphics cards, network cards, sound cards, GPUs (graphics processors).
Also used for intra-chip data transfer in multi-core processors.
Avoids programmed I/O for large data movement.
Requires a DMA controller.

[Figure: DMA transfer]

Device drivers and I/O system calls
Multitude of I/O devices:
  Character-stream or block.
  Sequential or random-access.
  Sharable or dedicated.
  Speed of operation.
  Read-write, read only, or write only.
The device-driver layer hides differences among I/O controllers from the kernel.
I/O system calls encapsulate device behaviors in generic classes.

Block and Character Devices
Block devices (e.g., disk drives, tapes):
  Commands, e.g., read, write, seek.
  Raw I/O or file-system access.
  Memory-mapped file access possible.
Character devices (e.g., keyboards, mice, serial ports):
  Commands, e.g., get, put.
  Libraries allow line editing.

Network Devices and Timers
Network devices:
  Own interface, different from block or character devices.
  Unix and Windows NT/9x/2000 include the socket interface.
    Separates network protocol from network operation.
    Includes select functionality.
  Approaches vary widely (pipes, FIFOs, streams, queues, mailboxes).
Timers:
  Provide current time, elapsed time, timer.
  Programmable interval timer for timings, periodic interrupts.
  ioctl (on UNIX) covers odd aspects of I/O such as clocks and timers.

Blocking and non-blocking I/O
Blocking: the process is suspended until the I/O is completed.
  Easy to use and understand.
  Insufficient for some needs.
Non-blocking: the I/O call returns control to the process immediately.
  User interface, data copy (buffered I/O).
  Implemented via multi-threading.
  Returns quickly with the count of bytes read or written.
Asynchronous: the process runs while the I/O executes.
  The I/O subsystem signals the process when the I/O is completed.
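The difference between a blocking and a non-blocking call can be observed on a connected socket pair. The sketch below uses Python's standard socket and select modules; the function name nonblocking_read_demo and the one-second timeout are choices made for this example.

```python
import select
import socket


def nonblocking_read_demo():
    """Read from an empty non-blocking socket, then again once data is ready."""
    sender, receiver = socket.socketpair()
    try:
        receiver.setblocking(False)
        try:
            receiver.recv(16)            # a blocking socket would suspend here
            first = "data"
        except BlockingIOError:
            first = "would block"        # the call returned immediately instead
        sender.sendall(b"ping")
        # select reports when the receiver can be read without blocking.
        select.select([receiver], [], [], 1.0)
        second = receiver.recv(16)
        return first, second
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(nonblocking_read_demo())
```

The process stays free to do other work between the failed read and the moment select reports the socket as readable.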

[Figure: Synchronous/Asynchronous I/O]

Kernel I/O Subsystem
Scheduling:
  Some I/O request ordering using a per-device queue.
  Some OSs try fairness.
Buffering: store data in memory while transferring to the I/O device.
  To cope with device speed mismatch or transfer size mismatch.
  To maintain “copy semantics”.

[Figure: Sun Enterprise 6000 device-transfer rates]

Kernel I/O Subsystem and Error Handling
Caching: fast memory holding a copy of data.
  Always just a copy.
  Key to performance.
Spooling: holds output for a device that can serve only one request at a time (e.g., a printer).
Device reservation: provides exclusive access to a device.
  System calls for allocation and de-allocation.
  Possibility of deadlock.
Error handling:
  The OS can recover from disk read errors, device unavailable, transient write failures.
  When an I/O request fails, an error code is returned.
  System error logs hold problem reports.

I/O Protection
I/O instructions are privileged.
Users make system calls.

Kernel Data Structures for I/O handling
The kernel keeps state information for I/O components, including open file tables, network connections, and device control blocks.
Complex data structures track buffers, memory allocation, and “dirty” blocks.
Some systems use object-oriented methods and message passing to implement I/O.

[Figure: UNIX I/O kernel structure]

Hardware Operations
Operations for reading a file:
  Determine the device holding the file.
  Translate the name to the device representation.
  Physically read the data from the disk into a buffer.
  Make the data available to the process.
  Return control to the process.

STREAMS in Unix
STREAM: a full-duplex communication channel between a user-level process and a device, in Unix System V and beyond.
A STREAM consists of:
  The STREAM head, which interfaces with the user process.
  The driver end, which interfaces with the device.
  Zero or more STREAM modules between them.
Each module contains a read queue and a write queue.
Message passing is used to communicate between queues.

[Figure: STREAMS]

I/O major factor in system performance: Execute device driver, kernel I/O code Context switches Data copying Network traffic stressful 51

Improving Performance
Reduce the number of context switches.
Reduce data copying.
Reduce interrupts by using large transfers, smart controllers, polling.
Use DMA.
Balance CPU, memory, bus, and I/O performance for highest throughput.