
Chapter 11 I/O Management and Disk Scheduling


1 Chapter 11 I/O Management and Disk Scheduling
Operating Systems: Internals and Design Principles, Seventh Edition, by William Stallings. Chapter 11: I/O Management and Disk Scheduling.

2 Categories of I/O Devices
As was mentioned in Chapter 1, external devices that engage in I/O with computer systems can be roughly grouped into three categories:
Human readable: suitable for communicating with the computer user. Examples include printers and terminals, the latter consisting of a video display, keyboard, and perhaps other devices such as a mouse.
Machine readable: suitable for communicating with electronic equipment. Examples are disk drives, USB keys, sensors, controllers, and actuators.
Communication: suitable for communicating with remote devices. Examples are digital line drivers and modems.

3 Data Rates There may be differences of several orders of magnitude between the data transfer rates of these device classes. Figure 11.1 gives some examples.

4 Disk Scheduling
Processor and main memory speeds are increasing much faster than disk access times. Disks are currently several orders of magnitude slower than main memory, and this gap is expected to continue into the foreseeable future. Thus, the performance of the disk storage subsystem is of vital concern, and much research has gone into schemes for improving that performance.

5 Figure: Disk structures: (A) track, (B) geometrical sector, (C) track sector, (D) cluster

6 Disk Performance Parameters
The actual details of disk I/O operation depend on:
the computer system
the operating system
the nature of the I/O channel and disk controller hardware
A general timing diagram of disk I/O transfer is shown in the accompanying figure.

7 Positioning the Read/Write Heads
When the disk drive is operating, the disk is rotating at constant speed. To read or write, the head must be positioned at the desired track and at the beginning of the desired sector on that track. Track selection involves moving the head in a movable-head system or electronically selecting one head on a fixed-head system. On a movable-head system, the time it takes to position the head at the track is known as the seek time. In either case, once the track is selected, the disk controller waits until the appropriate sector rotates to line up with the head. The time it takes for the beginning of the sector to reach the head is known as the rotational delay, or rotational latency. The sum of the seek time, if any, and the rotational delay equals the access time, which is the time it takes to get into position to read or write. Once the head is in position, the read or write operation is performed as the sector moves under the head; this is the data transfer portion of the operation, and the time required for it is the transfer time.
In addition to the access time and transfer time, there are several queuing delays normally associated with a disk I/O operation. When a process issues an I/O request, it must first wait in a queue for the device to be available. At that time, the device is assigned to the process. If the device shares a single I/O channel or a set of I/O channels with other disk drives, then there may be an additional wait for the channel to be available. At that point, the seek is performed to begin disk access.
In some high-end systems for servers, a technique known as rotational positional sensing (RPS) is used. This works as follows: when the seek command has been issued, the channel is released to handle other I/O operations. When the seek is completed, the device determines when the data will rotate under the head. As that sector approaches the head, the device tries to reestablish the communication path back to the host. If either the control unit or the channel is busy with another I/O, the reconnection attempt fails and the device must rotate one whole revolution before it can attempt to reconnect; this is called an RPS miss. This is an extra delay element that must be added to the disk I/O timing diagram.
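To make these timing parameters concrete, the following is a minimal sketch (not from the slides; the function name and the example disk parameters are illustrative assumptions) that estimates the average time to service one request as seek time, plus an average rotational delay of half a revolution, 1/(2r), plus a transfer time of b/(rN) for b bytes on a track holding N bytes:

# Hypothetical sketch: estimating average disk access plus transfer time.
def average_access_time(seek_ms, rpm, bytes_to_transfer, bytes_per_track):
    """Return the estimated time (in ms) to position for and transfer one request."""
    r = rpm / 60.0                            # rotation speed in revolutions per second
    rotational_delay_ms = 1000.0 / (2.0 * r)  # on average, wait half a revolution
    transfer_ms = 1000.0 * bytes_to_transfer / (r * bytes_per_track)
    return seek_ms + rotational_delay_ms + transfer_ms

# Assumed example: a 7200-rpm disk, 4 ms average seek, one 512-byte sector on a
# track of 500 sectors. The result is roughly 8.2 ms, dominated by positioning
# (seek plus rotational delay) rather than by the transfer itself.
print(round(average_access_time(4.0, 7200, 512, 512 * 500), 3))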

8 Disk Schedule Policies Examples
The vertical axis corresponds to the tracks on the disk. The horizontal axis corresponds to time or, equivalently, the number of tracks traversed. For this figure, we assume that the disk head is initially located at track 100. In this example, we assume a disk with 200 tracks and that the disk request queue has random requests in it. The requested tracks, in the order received by the disk scheduler, are 55, 58, 39, 18, 90, 160, 150, 38, 184.

9 First-In, First-Out (FIFO)
Processes requests from the queue in sequential order
Fair to all processes
Approximates random scheduling in performance if there are many processes competing for the disk
Service order starting from track 100: 55, 58, 39, 18, 90, 160, 150, 38, 184
The simplest form of scheduling is first-in-first-out (FIFO) scheduling, which processes items from the queue in sequential order. This strategy has the advantage of being fair, because every request is honored and the requests are honored in the order received. Figure 11.7a illustrates the disk arm movement with FIFO. This graph is generated directly from the data in Table 11.2a. As can be seen, the disk accesses are in the same order as the requests were originally received. With FIFO, if there are only a few processes that require access and if many of the requests are to clustered file sectors, then we can hope for good performance. However, this technique will often approximate random scheduling in performance if there are many processes competing for the disk. Thus, it may be profitable to consider a more sophisticated scheduling policy. A number of these are listed in Table 11.3 and will now be considered.
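A minimal sketch of this policy on the example queue (illustrative only; the function name and the tally of tracks traversed are assumptions, not part of the slides):

# Hypothetical sketch: FIFO disk scheduling on the example request queue.
requests = [55, 58, 39, 18, 90, 160, 150, 38, 184]
start = 100

def fifo_order(start, queue):
    """FIFO simply services requests in arrival order."""
    return list(queue)

order = fifo_order(start, requests)
moved = sum(abs(b - a) for a, b in zip([start] + order, order))
print(order, "tracks traversed:", moved)   # 498 tracks for this queue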

10 Priority (PRI)
Control of the scheduling is outside the control of disk management software
Short batch jobs and interactive jobs are given higher priority
Provides good interactive response time
Longer jobs may have to wait an excessively long time
A poor policy for database systems
With a system based on priority (PRI), the control of the scheduling is outside the control of disk management software. Such an approach is not intended to optimize disk utilization but to meet other objectives within the operating system. Often short batch jobs and interactive jobs are given higher priority than longer jobs that require longer computation. This allows a lot of short jobs to be flushed through the system quickly and may provide good interactive response time. However, longer jobs may have to wait excessively long times. Furthermore, such a policy could lead to countermeasures on the part of users, who split their jobs into smaller pieces to beat the system. This type of policy tends to be poor for database systems.

11 Shortest Service Time First (SSTF)
Select the disk I/O request that requires the least movement of the disk arm from its current position
Always choose the minimum seek time
Service order starting from track 100: 90, 58, 55, 39, 38, 18, 150, 160, 184 (90 is the closest requested track to 100)
The shortest-service-time-first (SSTF) policy is to select the disk I/O request that requires the least movement of the disk arm from its current position. Thus, we always choose to incur the minimum seek time. Of course, always choosing the minimum seek time does not guarantee that the average seek time over a number of arm movements will be minimum. However, this should provide better performance than FIFO. Because the arm can move in two directions, a random tie-breaking algorithm may be used to resolve cases of equal distances. Figure 11.7b and Table 11.2b show the performance of SSTF on the same example as was used for FIFO. The first track accessed is 90, because this is the closest requested track to the starting position. The next track accessed is 58, because this is the closest of the remaining requested tracks to the current position of 90. Subsequent tracks are selected accordingly.
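A minimal sketch of the greedy selection (illustrative only; the function name is an assumption):

# Hypothetical sketch: SSTF picks, at each step, the pending request
# closest to the current head position.
requests = [55, 58, 39, 18, 90, 160, 150, 38, 184]
start = 100

def sstf_order(start, queue):
    pending, head, order = list(queue), start, []
    while pending:
        nearest = min(pending, key=lambda t: abs(t - head))  # least arm movement
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order

print(sstf_order(start, requests))  # [90, 58, 55, 39, 38, 18, 150, 160, 184]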

12 SCAN Also known as the elevator algorithm
Arm moves in one direction only, satisfying all outstanding requests until it reaches the last track in that direction; then the direction is reversed
Favors jobs whose requests are for tracks nearest to both the innermost and outermost tracks
Service order starting from track 100, initially moving toward increasing track numbers: 150, 160, 184, 90, 58, 55, 39, 38, 18
With the exception of FIFO, all of the policies described so far can leave some request unfulfilled until the entire queue is emptied. That is, there may always be new requests arriving that will be chosen before an existing request. A simple alternative that prevents this sort of starvation is the SCAN algorithm, also known as the elevator algorithm because it operates much the way an elevator does. With SCAN, the arm is required to move in one direction only, satisfying all outstanding requests en route, until it reaches the last track in that direction or until there are no more requests in that direction. This latter refinement is sometimes referred to as the LOOK policy. The service direction is then reversed and the scan proceeds in the opposite direction, again picking up all requests in order. Figure 11.7c and Table 11.2c illustrate the SCAN policy. Assuming that the initial direction is of increasing track number, the first track selected is 150, since this is the closest track to the starting track of 100 in the increasing direction. As can be seen, the SCAN policy behaves almost identically to the SSTF policy. Indeed, if we had assumed that the arm was moving in the direction of lower track numbers at the beginning of the example, then the scheduling pattern would have been identical for SSTF and SCAN. However, this is a static example in which no new items are added to the queue. Even when the queue is dynamically changing, SCAN will be similar to SSTF unless the request pattern is unusual. Note that the SCAN policy is biased against the area most recently traversed. Thus it does not exploit locality as well as SSTF does. It is not difficult to see that the SCAN policy favors jobs whose requests are for tracks nearest to both the innermost and outermost tracks, and that it favors the latest-arriving jobs. The first problem can be avoided via the C-SCAN policy, while the second problem is addressed by the N-step-SCAN policy.
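A minimal sketch of the sweep (illustrative only; the function name is an assumption, and, like the example in the slides, it reverses at the last pending request, i.e., LOOK-style behavior):

# Hypothetical sketch: SCAN (elevator) servicing, sweeping toward higher
# track numbers first and then reversing direction.
requests = [55, 58, 39, 18, 90, 160, 150, 38, 184]
start = 100

def scan_order(start, queue, upward=True):
    above = sorted(t for t in queue if t >= start)                 # up sweep
    below = sorted((t for t in queue if t < start), reverse=True)  # down sweep
    return above + below if upward else below + above

print(scan_order(start, requests))  # [150, 160, 184, 90, 58, 55, 39, 38, 18]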

13 C-SCAN (Circular SCAN)
Restricts scanning to one direction only
When the last track has been visited in one direction, the arm is returned to the opposite end of the disk and the scan begins again
Service order starting from track 100: 150, 160, 184, 18, 38, 39, 55, 58, 90
The C-SCAN (circular SCAN) policy restricts scanning to one direction only. Thus, when the last track has been visited in one direction, the arm is returned to the opposite end of the disk and the scan begins again. This reduces the maximum delay experienced by new requests. With SCAN, if the expected time for a scan from inner track to outer track is t, then the expected service interval for sectors at the periphery is 2t. With C-SCAN, the interval is on the order of t + s_max, where s_max is the maximum seek time. Figure 11.7d and Table 11.2d illustrate C-SCAN behavior. In this case the first three requested tracks encountered are 150, 160, and 184. Then the scan begins again starting at the lowest track number, and the next requested track encountered is 18.
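A minimal sketch (illustrative only; the function name is an assumption):

# Hypothetical sketch: C-SCAN sweeps in one direction only; after the highest
# pending track it returns to the other end of the disk and resumes the sweep.
requests = [55, 58, 39, 18, 90, 160, 150, 38, 184]
start = 100

def cscan_order(start, queue):
    above = sorted(t for t in queue if t >= start)  # finish the current sweep
    below = sorted(t for t in queue if t < start)   # wrap around and continue
    return above + below

print(cscan_order(start, requests))  # [150, 160, 184, 18, 38, 39, 55, 58, 90]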

14 Table 11.2 Comparison of Disk Scheduling Algorithms

15 Table 11.3 Disk Scheduling Algorithms

16 RAID
Redundant Array of Independent Disks
Consists of seven levels, zero through six
The design architectures share three characteristics:
RAID is a set of physical disk drives viewed by the operating system as a single logical drive
Data are distributed across the physical drives of an array in a scheme known as striping
Redundant disk capacity is used to store parity information, which guarantees data recoverability in case of a disk failure
With the use of multiple disks, there is a wide variety of ways in which the data can be organized and in which redundancy can be added to improve reliability. This could make it difficult to develop database schemes that are usable on a number of platforms and operating systems. Fortunately, industry has agreed on a standardized scheme for multiple-disk database design, known as RAID (redundant array of independent disks). The RAID scheme consists of seven levels, zero through six. These levels do not imply a hierarchical relationship but designate different design architectures that share three common characteristics: 1. RAID is a set of physical disk drives viewed by the operating system as a single logical drive. 2. Data are distributed across the physical drives of an array in a scheme known as striping, described subsequently. 3. Redundant disk capacity is used to store parity information, which guarantees data recoverability in case of a disk failure.

17 Table 11.4 RAID Levels We now examine each of the RAID levels. Table 11.4 provides a rough guide to the seven levels. In the table, I/O performance is shown both in terms of data transfer capacity, or ability to move data, and I/O request rate, or ability to satisfy I/O requests, since these RAID levels inherently perform differently relative to these two metrics. Each RAID level's strong point is highlighted in color. Of the seven RAID levels described, only four are commonly used: RAID levels 0, 1, 5, and 6.

18 RAID Level 0
Not a true RAID because it does not include redundancy to improve performance or provide data protection
User and system data are distributed across all of the disks in the array
Logical disk is divided into strips
RAID level 0 is not a true member of the RAID family, because it does not include redundancy. However, there are a few applications, such as some on supercomputers, in which performance and capacity are primary concerns and low cost is more important than improved reliability. For RAID 0, the user and system data are distributed across all of the disks in the array. This has a notable advantage over the use of a single large disk: if two different I/O requests are pending for two different blocks of data, then there is a good chance that the requested blocks are on different disks. Thus, the two requests can be issued in parallel, reducing the I/O queuing time. But RAID 0, as with all of the RAID levels, goes further than simply distributing the data across a disk array: the data are striped across the available disks. This is best understood by considering the accompanying figure. All user and system data are viewed as being stored on a logical disk. The logical disk is divided into strips; these strips may be physical blocks, sectors, or some other unit. The strips are mapped round robin to consecutive physical disks in the RAID array. A set of logically consecutive strips that maps exactly one strip to each array member is referred to as a stripe. In an n-disk array, the first n logical strips are physically stored as the first strip on each of the n disks, forming the first stripe; the second n strips are distributed as the second strips on each disk; and so on. The advantage of this layout is that if a single I/O request consists of multiple logically contiguous strips, then up to n strips for that request can be handled in parallel, greatly reducing the I/O transfer time.
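A minimal sketch of the round-robin strip mapping (illustrative only; the function name is an assumption):

# Hypothetical sketch: round-robin mapping of logical strips to physical disks.
# Logical strip i lands on disk (i mod n) within stripe (i // n).
def strip_location(logical_strip, n_disks):
    """Return (disk index, stripe index) for a logical strip number."""
    return logical_strip % n_disks, logical_strip // n_disks

# With 4 disks, logical strips 0..7 make up the first two stripes:
for i in range(8):
    disk, stripe = strip_location(i, 4)
    print("strip", i, "-> disk", disk, "stripe", stripe)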

19 RAID Level 1
Redundancy is achieved by the simple expedient of duplicating all the data
There is no “write penalty”
When a drive fails, the data may still be accessed from the second drive
Principal disadvantage is the cost
RAID 1 differs from RAID levels 2 through 6 in the way in which redundancy is achieved. In these other RAID schemes, some form of parity calculation is used to introduce redundancy, whereas in RAID 1, redundancy is achieved by the simple expedient of duplicating all the data. Figure 11.8b shows data striping being used, as in RAID 0. But in this case, each logical strip is mapped to two separate physical disks so that every disk in the array has a mirror disk that contains the same data. RAID 1 can also be implemented without data striping, though this is less common. There are a number of positive aspects to the RAID 1 organization: 1. A read request can be serviced by either of the two disks that contain the requested data, whichever one involves the minimum seek time plus rotational latency. 2. A write request requires that both corresponding strips be updated, but this can be done in parallel. Thus, the write performance is dictated by the slower of the two writes (i.e., the one that involves the larger seek time plus rotational latency). However, there is no “write penalty” with RAID 1. RAID levels 2 through 6 involve the use of parity bits; therefore, when a single strip is updated, the array management software must first compute and update the parity bits as well as updating the actual strip in question. 3. Recovery from a failure is simple. When a drive fails, the data may still be accessed from the second drive. The principal disadvantage of RAID 1 is the cost; it requires twice the disk space of the logical disk that it supports. Because of that, a RAID 1 configuration is likely to be limited to drives that store system software and data and other highly critical files. In these cases, RAID 1 provides real-time backup of all data, so that in the event of a disk failure, all of the critical data is still immediately available. In a transaction-oriented environment, RAID 1 can achieve high I/O request rates if the bulk of the requests are reads. In this situation, the performance of RAID 1 can approach double that of RAID 0. However, if a substantial fraction of the I/O requests are write requests, then there may be no significant performance gain over RAID 0. RAID 1 may also provide improved performance over RAID 0 for data-transfer-intensive applications with a high percentage of reads. Improvement occurs if the application can split each read request so that both disk members participate.

20 RAID Level 2 Makes use of a parallel access technique
Data striping is used
Typically a Hamming code is used
Effective choice in an environment in which many disk errors occur
RAID levels 2 and 3 make use of a parallel access technique. In a parallel access array, all member disks participate in the execution of every I/O request. Typically, the spindles of the individual drives are synchronized so that each disk head is in the same position on each disk at any given time. As in the other RAID schemes, data striping is used. In the case of RAID 2 and 3, the strips are very small, often as small as a single byte or word. With RAID 2, an error-correcting code is calculated across corresponding bits on each data disk, and the bits of the code are stored in the corresponding bit positions on multiple parity disks. Typically, a Hamming code is used, which is able to correct single-bit errors and detect double-bit errors. Although RAID 2 requires fewer disks than RAID 1, it is still rather costly. The number of redundant disks is proportional to the log of the number of data disks. On a single read, all disks are simultaneously accessed. The requested data and the associated error-correcting code are delivered to the array controller. If there is a single-bit error, the controller can recognize and correct the error instantly, so that the read access time is not slowed. On a single write, all data disks and parity disks must be accessed for the write operation. RAID 2 would only be an effective choice in an environment in which many disk errors occur. Given the high reliability of individual disks and disk drives, RAID 2 is overkill and is not implemented.

21 Figure: Hamming code example, showing the 4 data bits d1 to d4 and the 3 parity bits p1 to p3, which parity bits apply to which data bits, and the bit positions of the data and parity bits; the parity of each of the red, green, and blue circles is even.

22 RAID Level 3
Requires only a single redundant disk, no matter how large the disk array
Employs parallel access, with data distributed in small strips
Can achieve very high data transfer rates
RAID 3 is organized in a similar fashion to RAID 2. The difference is that RAID 3 requires only a single redundant disk, no matter how large the disk array. RAID 3 employs parallel access, with data distributed in small strips. Instead of an error-correcting code, a simple parity bit is computed for the set of individual bits in the same position on all of the data disks.
REDUNDANCY In the event of a drive failure, the parity drive is accessed and data are reconstructed from the remaining devices. Once the failed drive is replaced, the missing data can be restored on the new drive and operation resumed. Data reconstruction is simple. Consider an array of five drives in which X0 through X3 contain data and X4 is the parity disk. The parity for the ith bit is calculated as follows:
X4(i) = X3(i) ⊕ X2(i) ⊕ X1(i) ⊕ X0(i)
where ⊕ is the exclusive-OR function. Suppose that drive X1 has failed. If we add X4(i) ⊕ X1(i) to both sides of the preceding equation, we get
X1(i) = X4(i) ⊕ X3(i) ⊕ X2(i) ⊕ X0(i)
Thus, the contents of each strip of data on X1 can be regenerated from the contents of the corresponding strips on the remaining disks in the array. This principle is true for RAID levels 3 through 6. In the event of a disk failure, all of the data are still available in what is referred to as reduced mode. In this mode, for reads, the missing data are regenerated on the fly using the exclusive-OR calculation. When data are written to a reduced RAID 3 array, consistency of the parity must be maintained for later regeneration. Return to full operation requires that the failed disk be replaced and the entire contents of the failed disk be regenerated on the new disk.
PERFORMANCE Because data are striped in very small strips, RAID 3 can achieve very high data transfer rates. Any I/O request will involve the parallel transfer of data from all of the data disks. For large transfers, the performance improvement is especially noticeable. On the other hand, only one I/O request can be executed at a time. Thus, in a transaction-oriented environment, performance suffers.
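A minimal sketch of this reconstruction (illustrative only; the function name and the byte values are assumptions):

# Hypothetical sketch: regenerating a lost strip with XOR parity. The parity
# strip is the XOR of the data strips, so any single missing strip equals the
# XOR of the parity strip with all surviving data strips.
from functools import reduce

def xor_strips(strips):
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))

x0, x1, x2, x3 = b"\x10\x22", b"\x35\x07", b"\x0f\xf0", b"\xaa\x55"
x4 = xor_strips([x0, x1, x2, x3])            # parity strip on the parity disk
rebuilt_x1 = xor_strips([x4, x0, x2, x3])    # pretend drive X1 has failed
assert rebuilt_x1 == x1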

23 RAID Level 4 Makes use of an independent access technique
A bit-by-bit parity strip is calculated across corresponding strips on each data disk, and the parity bits are stored in the corresponding strip on the parity disk
Involves a write penalty when an I/O write request of small size is performed
RAID levels 4 through 6 make use of an independent access technique. In an independent access array, each member disk operates independently, so that separate I/O requests can be satisfied in parallel. Because of this, independent access arrays are more suitable for applications that require high I/O request rates and are relatively less suited for applications that require high data transfer rates. As in the other RAID schemes, data striping is used. In the case of RAID 4 through 6, the strips are relatively large. With RAID 4, a bit-by-bit parity strip is calculated across corresponding strips on each data disk, and the parity bits are stored in the corresponding strip on the parity disk. RAID 4 involves a write penalty when an I/O write request of small size is performed. Each time that a write occurs, the array management software must update not only the user data but also the corresponding parity bits. Consider an array of five drives in which X0 through X3 contain data and X4 is the parity disk. Suppose that a write is performed that only involves a strip on disk X1. Initially, for each bit i, we have the following relationship:
X4(i) = X3(i) ⊕ X2(i) ⊕ X1(i) ⊕ X0(i)    (11.1)
After the update, with potentially altered bits indicated by a prime symbol:
X4'(i) = X3(i) ⊕ X2(i) ⊕ X1'(i) ⊕ X0(i)
       = X3(i) ⊕ X2(i) ⊕ X1'(i) ⊕ X0(i) ⊕ X1(i) ⊕ X1(i)
       = X3(i) ⊕ X2(i) ⊕ X1(i) ⊕ X0(i) ⊕ X1(i) ⊕ X1'(i)
       = X4(i) ⊕ X1(i) ⊕ X1'(i)
The preceding set of equations is derived as follows. The first line shows that a change in X1 will also affect the parity disk X4. In the second line, we add the terms X1(i) ⊕ X1(i). Because the exclusive-OR of any quantity with itself is 0, this does not affect the equation. However, it is a convenience that is used to create the third line, by reordering. Finally, Equation (11.1) is used to replace the first four terms by X4(i). To calculate the new parity, the array management software must read the old user strip and the old parity strip. Then it can update these two strips with the new data and the newly calculated parity. Thus, each strip write involves two reads and two writes. In the case of a larger-size I/O write that involves strips on all disk drives, parity is easily computed by calculation using only the new data bits. Thus, the parity drive can be updated in parallel with the data drives and there are no extra reads or writes. In any case, every write operation must involve the parity disk, which therefore can become a bottleneck.
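A minimal sketch of the small-write update (illustrative only; the function name and byte values are assumptions):

# Hypothetical sketch: RAID 4 small-write parity update. The new parity is
# old parity XOR old data XOR new data, so each small strip write costs two
# reads (old data, old parity) and two writes (new data, new parity).
def updated_parity(old_parity, old_data, new_data):
    return bytes(p ^ od ^ nd for p, od, nd in zip(old_parity, old_data, new_data))

x0, x1, x2, x3 = b"\x10", b"\x35", b"\x0f", b"\xaa"
x4 = bytes(a ^ b ^ c ^ d for a, b, c, d in zip(x0, x1, x2, x3))  # old parity
new_x1 = b"\x5c"
new_x4 = updated_parity(x4, x1, new_x1)
# Recomputing the parity from scratch gives the same result:
assert new_x4 == bytes(a ^ b ^ c ^ d for a, b, c, d in zip(x0, new_x1, x2, x3))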

24 Figure: Diagram of a RAID 4 setup with a dedicated parity disk; each color represents the group of blocks covered by the respective parity block (a stripe).

25 RAID Level 5
Similar to RAID 4 but distributes the parity strips across all disks
Typical allocation is a round-robin scheme
Has the characteristic that the loss of any one disk does not result in data loss
RAID 5 is organized in a similar fashion to RAID 4. The difference is that RAID 5 distributes the parity strips across all disks. A typical allocation is a round-robin scheme, as illustrated in Figure 11.8f. For an n-disk array, the parity strip is on a different disk for the first n stripes, and the pattern then repeats. The distribution of parity strips across all drives avoids the potential I/O bottleneck of the single parity disk found in RAID 4. Further, RAID 5 has the characteristic that the loss of any one disk does not result in data loss.
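A minimal sketch of one such round-robin placement (illustrative only; real controllers differ in the exact layout, and the function name is an assumption):

# Hypothetical sketch: one possible round-robin placement of the parity strip.
# Here stripe k keeps its parity on disk (n - 1 - k) mod n, so the parity
# rotates across all n disks and the pattern repeats every n stripes.
def parity_disk(stripe, n_disks):
    return (n_disks - 1 - stripe) % n_disks

for stripe in range(6):        # a 4-disk array, first 6 stripes
    print("stripe", stripe, "-> parity on disk", parity_disk(stripe, 4))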

26 RAID Level 6
Two different parity calculations are carried out and stored in separate blocks on different disks
Provides extremely high data availability
Incurs a substantial write penalty because each write affects two parity blocks
RAID 6 was introduced in a subsequent paper by the Berkeley researchers [KATZ89]. In the RAID 6 scheme, two different parity calculations are carried out and stored in separate blocks on different disks. Thus, a RAID 6 array whose user data require N disks consists of N + 2 disks. Figure 11.8g illustrates the scheme. P and Q are two different data check algorithms. One of the two is the exclusive-OR calculation used in RAID 4 and 5. But the other is an independent data check algorithm. This makes it possible to regenerate data even if two disks containing user data fail. The advantage of RAID 6 is that it provides extremely high data availability. Three disks would have to fail within the MTTR (mean time to repair) interval to cause data to be lost. On the other hand, RAID 6 incurs a substantial write penalty, because each write affects two parity blocks. Performance benchmarks [EISC07] show a RAID 6 controller can suffer more than a 30% drop in overall write performance compared with a RAID 5 implementation. RAID 5 and RAID 6 read performance is comparable.

27 Disk Cache
The term cache memory usually refers to a memory that is smaller and faster than main memory and that is interposed between main memory and the processor
Reduces average memory access time by exploiting the principle of locality
A disk cache is a buffer in main memory for disk sectors
Contains a copy of some of the sectors on the disk
In Section 1.6 and Appendix 1A, we summarized the principles of cache memory. The term cache memory is usually used to apply to a memory that is smaller and faster than main memory and that is interposed between main memory and the processor. Such a cache memory reduces average memory access time by exploiting the principle of locality. The same principle can be applied to disk memory. Specifically, a disk cache is a buffer in main memory for disk sectors. The cache contains a copy of some of the sectors on the disk. Because of the phenomenon of locality of reference, when a block of data is fetched into the cache to satisfy a single I/O request, it is likely that there will be future references to that same block.
When an I/O request is made for a particular sector, a check is made to determine whether the sector is in the disk cache:
if yes, the request is satisfied via the cache
if no, the requested sector is read into the disk cache from the disk

28 Least Recently Used (LRU)
Most commonly used algorithm that deals with the design issue of replacement strategy
The block that has been in the cache the longest with no reference to it is replaced
A stack of pointers references the cache:
the most recently referenced block is on the top of the stack
when a block is referenced or brought into the cache, it is placed on the top of the stack
The most commonly used algorithm is least recently used (LRU): replace the block that has been in the cache longest with no reference to it. The cache consists of a stack of blocks, with the most recently referenced block on the top of the stack. When a block in the cache is referenced, it is moved from its existing position on the stack to the top of the stack. When a block is brought in from secondary memory, remove the block that is on the bottom of the stack and push the incoming block onto the top of the stack. It is not necessary actually to move these blocks around in main memory; a stack of pointers can be associated with the cache.
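A minimal sketch of an LRU-managed disk cache (illustrative only; the class and method names are assumptions, and an ordered dictionary stands in for the stack of pointers):

# Hypothetical sketch: LRU replacement for a disk cache. The ordered dict
# plays the role of the stack of pointers; the most recently used sector is
# kept at the end, and the least recently used sector is evicted when full.
from collections import OrderedDict

class LRUDiskCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()               # sector -> data, in LRU order

    def read(self, sector, read_from_disk):
        if sector in self.blocks:
            self.blocks.move_to_end(sector)       # hit: move to the "top"
            return self.blocks[sector]
        data = read_from_disk(sector)             # miss: fetch from disk
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)       # evict least recently used
        self.blocks[sector] = data
        return data

cache = LRUDiskCache(2)
fake_disk = lambda sector: b"data-%d" % sector
cache.read(7, fake_disk)
cache.read(8, fake_disk)
cache.read(7, fake_disk)                          # hit; sector 8 is now LRU
cache.read(9, fake_disk)                          # evicts sector 8
print(list(cache.blocks))                         # [7, 9]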

29 Least Frequently Used (LFU)
The block that has experienced the fewest references is replaced
A counter is associated with each block
The counter is incremented each time the block is accessed
When replacement is required, the block with the smallest count is selected
Another possibility is least frequently used (LFU): replace the block in the set that has experienced the fewest references. LFU could be implemented by associating a counter with each block. When a block is brought in, it is assigned a count of 1; with each reference to the block, its count is incremented by 1. When replacement is required, the block with the smallest count is selected. Intuitively, it might seem that LFU is more appropriate than LRU because LFU makes use of more pertinent information about each block in the selection process. A simple LFU algorithm has the following problem: it may be that certain blocks are referenced relatively infrequently overall, but when they are referenced, there are short intervals of repeated references due to locality, thus building up high reference counts. After such an interval is over, the reference count may be misleading and not reflect the probability that the block will soon be referenced again. Thus, the effect of locality may actually cause the LFU algorithm to make poor replacement choices.
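A minimal counter-based sketch (illustrative only; the class and method names are assumptions):

# Hypothetical sketch: LFU replacement keyed on a per-block reference counter;
# the block with the smallest count is evicted when a replacement is needed.
class LFUDiskCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}    # sector -> data
        self.counts = {}    # sector -> number of references

    def read(self, sector, read_from_disk):
        if sector in self.blocks:
            self.counts[sector] += 1              # hit: bump the counter
            return self.blocks[sector]
        if len(self.blocks) >= self.capacity:     # miss on a full cache
            victim = min(self.counts, key=self.counts.get)
            del self.blocks[victim], self.counts[victim]
        self.blocks[sector] = read_from_disk(sector)
        self.counts[sector] = 1                   # new blocks start at count 1
        return self.blocks[sector]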

30 Summary
The I/O architecture is the computer system’s interface to the outside world
I/O functions are generally broken up into a number of layers
A key aspect of I/O is the use of buffers that are controlled by I/O utilities rather than by application processes
Buffering smoothes out the differences between the internal speeds of the computer system and the speeds of I/O devices
The use of buffers also decouples the actual I/O transfer from the address space of the application process
Disk I/O has the greatest impact on overall system performance
Two of the most widely used approaches to improve disk I/O performance are disk scheduling and the disk cache
A disk cache is a buffer, usually kept in main memory, that functions as a cache of disk blocks between disk memory and the rest of main memory
The computer system’s interface to the outside world is its I/O architecture. This architecture is designed to provide a systematic means of controlling interaction with the outside world and to provide the operating system with the information it needs to manage I/O activity effectively. The I/O function is generally broken up into a number of layers, with lower layers dealing with details that are closer to the physical functions to be performed and higher layers dealing with I/O in a logical and generic fashion. The result is that changes in hardware parameters need not affect most of the I/O software. A key aspect of I/O is the use of buffers that are controlled by I/O utilities rather than by application processes. Buffering smoothes out the differences between the internal speeds of the computer system and the speeds of I/O devices. The use of buffers also decouples the actual I/O transfer from the address space of the application process. This allows the operating system more flexibility in performing its memory-management function. The aspect of I/O that has the greatest impact on overall system performance is disk I/O. Accordingly, there has been greater research and design effort in this area than in any other kind of I/O. Two of the most widely used approaches to improve disk I/O performance are disk scheduling and the disk cache. At any time, there may be a queue of requests for I/O on the same disk. It is the object of disk scheduling to satisfy these requests in a way that minimizes the mechanical seek time of the disk and hence improves performance. The physical layout of pending requests plus considerations of locality come into play. A disk cache is a buffer, usually kept in main memory, that functions as a cache of disk blocks between disk memory and the rest of main memory. Because of the principle of locality, the use of a disk cache should substantially reduce the number of block I/O transfers between main memory and disk.


