
1 The Buffer Cache

2 Memory Hierarchy

3 We now know that files are stored on the hard disk and that processes can access these files and create new files on the disk. When a process requests a file, the kernel brings it into main memory, where the user process can read, access, or change it. The kernel could read and write files directly between the disk and memory, but response time and throughput would be very low in that case because of the disk's slow data transfer speed.

4 To minimize the frequency of disk access, the kernel keeps a buffer pool that stores recently and frequently accessed data. This pool is called the buffer cache.

5 Buffer Cache
The buffer cache is not the same as the hardware cache memory. The buffer cache is a part of main memory that holds blocks of data from secondary storage. It is usually maintained in the system area of main memory: when main memory is partitioned, one or two partitions are allocated to the system area, so the buffer cache is managed by the operating system only.

6 Secondary Storage Device
Communication takes place first between the buffer cache and the secondary storage device. The buffer cache itself is an array, or pool, of buffers.

7 Access Mechanism: each buffer contains two areas, a header and a data area
Data area – contains the data of one disk block; with 50 buffers, for example, up to 50 disk blocks can be held in memory. Access mechanism: before going to secondary storage, the kernel first searches the buffer cache for the block of data; only when the block is not found there is it accessed from secondary storage.

8 Access mechanism: when a process wants to read a file, the kernel first tries to find the data in the buffer cache. If the data is found in the buffer cache, it is sent to the process. If it is not found, the block is read from the disk and then kept in the buffer cache so that it is available to the process (and to later requests). To minimize the disk access frequency further, the kernel may also implement pre-caching (read-ahead) and delayed-write functionality.

9 If the data is available in the buffer cache, there is no need to go to the secondary disk: the data is transferred from the buffer cache to the user area, which takes much less time. If the data is not available, the kernel goes to secondary storage and reads the particular block into one of the buffers, provided a buffer is free. If no buffer is free, one buffer has to be victimized: every buffer holds some block of secondary storage, so one of those blocks is pushed out of its buffer to make it free to hold the new block from secondary storage.

10 For this a replacement algorithm has to be used; the one used is least recently used (LRU)
The header area is used for buffer management, and the data area contains a block of data of size 256, 512, or 1024 bytes.

11 Buffer Headers: when the system initializes, the kernel allocates the space for the buffer cache. Each buffer has two regions: one for the data that will be read from the disk, and one for the buffer header. The data in the buffer cache corresponds to logical disk blocks of the file system; the buffer cache is the in-memory representation of the disk blocks.

12 There is never a case where the buffer cache has two entries for the same disk block, as this could lead to inconsistencies: there is one and only one copy of a disk block in the buffer cache. The buffer header contains metadata such as the device number and the block number for which the buffer holds data. The buffer header also contains a pointer to the data array of the buffer (i.e. a pointer to the data region).

13 The buffer header also contains the status of the buffer
The status of the buffer can indicate whether:
- the buffer is locked or unlocked (busy or free)
- the buffer contains valid data
- the kernel must write the contents to disk before reassigning the buffer (delayed write)
- the kernel is currently reading or writing the data
- any process is waiting for the buffer to become free
The condition in which the kernel must write the buffer contents to disk before reassigning the buffer is known as "delayed write".
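The fields and status flags just listed can be pictured as a C structure. The following is a minimal illustrative sketch, not the actual kernel definition; all names (struct buffer, the flag constants, the pointer fields) are assumptions made for this transcript.

enum buf_status {
    B_LOCKED  = 0x01,   /* buffer is busy (locked by a process)       */
    B_VALID   = 0x02,   /* data area holds valid data from the disk   */
    B_DWRITE  = 0x04,   /* delayed write: flush to disk before reuse  */
    B_READING = 0x08,   /* kernel is currently reading the block in   */
    B_WRITING = 0x10,   /* kernel is currently writing the block out  */
    B_WANTED  = 0x20    /* some process is waiting for this buffer    */
};

struct buffer {
    int            dev;        /* logical device number               */
    long           blkno;      /* disk block number                   */
    unsigned       status;     /* bitwise OR of enum buf_status flags */
    char          *data;       /* pointer to the data area            */
    struct buffer *hash_next;  /* next buffer on the same hash queue  */
    struct buffer *hash_prev;  /* previous buffer on the hash queue   */
    struct buffer *free_next;  /* next buffer on the free list        */
    struct buffer *free_prev;  /* previous buffer on the free list    */
};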

14 Buffer Header

15 The replacement algorithm used is least recently used (LRU)
This can be implemented with linked lists: since buffer headers contain a number of pointers, it is natural to maintain the buffers in linked-list form. In this case the list is doubly linked, because every buffer in a queue has two pointers, one pointing to the next buffer in the queue and one pointing to the previous buffer. Two consecutive elements are linked by previous and next pointers, the last node points to the first node through its next pointer, and the previous pointer of the head node points to the tail node.
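A minimal sketch of how such a circular doubly linked free list can be manipulated, assuming the illustrative struct buffer above and a dummy header node whose free_next and free_prev point to itself when the list is empty (all function names are assumptions):

#include <stddef.h>                           /* for NULL */

/* Unlink a buffer from wherever it sits on the free list. */
void freelist_remove(struct buffer *b)
{
    b->free_prev->free_next = b->free_next;
    b->free_next->free_prev = b->free_prev;
    b->free_next = b->free_prev = b;          /* detached: points to itself */
}

/* Return a buffer at the tail: it becomes the most recently used one. */
void freelist_insert_tail(struct buffer *head, struct buffer *b)
{
    b->free_prev = head->free_prev;
    b->free_next = head;
    head->free_prev->free_next = b;
    head->free_prev = b;
}

/* Victim selection: the least recently used buffer sits at the head. */
struct buffer *freelist_take_head(struct buffer *head)
{
    struct buffer *b = head->free_next;
    if (b == head)
        return NULL;                          /* only the header: list empty */
    freelist_remove(b);
    return b;
}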

16 How is a buffer chosen when its data block has to be overwritten by a new data block?

17 The header node points in the forward direction, and the buffers have forward and backward pointers; this forms a circular doubly linked list. (Figure: header node linked to a chain of buffers.)

18 Let us assume that all of these buffers are on the free list: the header node is the header of the free list, and all of the buffers in the doubly linked list are free, i.e. not used by any process. Let us assume the data blocks contained in these buffers are 2, 15, and 32.

19 (Figure: circular doubly linked free list, header node → 2 → 15 → 32.)

20 Now suppose a process requests block 10. If it does not exist in any of the buffers, block 10 has to be read in and placed in one of the buffers that are currently free (though not necessarily empty). In that case we take the buffer to be overwritten from the front of the free list: always take the buffer at the head of the list and overwrite it with the data from block 10.

21 (Figure: block 10 is to be brought in; free list: header → 2 → 15 → 32.)

22 (Figure: the buffer at the head of the free list, which held block 2, is overwritten with block 10.)

23 While the writing is in progress this buffer is locked; when the writing is over the buffer becomes free again. The buffer is now written with the data from block 10, i.e. it is the most recently used buffer.

24 (Figure: free list after the reassignment: header → 15 → 32 → 10.)

25 Whenever a buffer is needed for writing in a data block, it is always taken from the head of the free list. Whenever a buffer is returned to the free list, it is always attached at the tail. A buffer that has just been used and becomes free is the most recently used buffer, so most recently used buffers are always placed at the tail of the list.
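As a small usage illustration of the free-list helpers sketched earlier (the function example_use exists only for this transcript):

void example_use(struct buffer *free_head)
{
    struct buffer *victim = freelist_take_head(free_head);   /* from the head: LRU */
    if (victim == NULL)
        return;                                              /* free list is empty */
    /* ... overwrite the victim with the new block and use it ... */
    freelist_insert_tail(free_head, victim);                 /* to the tail: MRU   */
}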

26 (Figure: free list header → 15 → 32 → 10; the head end is least recently used, the tail end is most recently used.)

27 Now suppose some process requests data block number 12
First check the buffer cache; only if block 12 is not in the buffer cache is it read from the disk and put into one of the buffers. Worst case: if block 12 does not exist in the cache and there are 1000 buffers, all 1000 buffers have to be checked before we can declare that block 12 is not present. In other words, the search time to find out whether a particular block is present is quite high if a single list is maintained.

28 So instead of maintaining a single list, maintain a number of lists
So instead of maintaining a single list, a number of lists are maintained; these are called hash queues. Suppose there are 4 hash queues. A buffer containing a block number n is placed on the hash queue given by the mod operation n mod 4, where n is the block number contained in the buffer. In this way we decide which buffer will be put on which hash queue; for example, if n mod 4 is zero, the buffer is put on HQ0.
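A minimal sketch of the hash-queue mapping just described, reusing the illustrative struct buffer from earlier. For simplicity the queues are kept here as NULL-terminated lists headed by an array, whereas the slides describe them as circular doubly linked lists; NHASH, hash_queue, hash_index and hash_lookup are names assumed for this sketch.

#include <stddef.h>                       /* for NULL */

#define NHASH 4                           /* four hash queues, as in the slides */

struct buffer *hash_queue[NHASH];         /* heads of the hash queues */

int hash_index(long blkno)
{
    return (int)(blkno % NHASH);          /* e.g. block 6 goes to queue 6 mod 4 = 2 */
}

/* Look for a (device, block) pair on its hash queue; NULL on a miss. */
struct buffer *hash_lookup(int dev, long blkno)
{
    struct buffer *b;
    for (b = hash_queue[hash_index(blkno)]; b != NULL; b = b->hash_next)
        if (b->dev == dev && b->blkno == blkno)
            return b;
    return NULL;
}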

29 (Figure: hash queues HQ0, HQ1, HQ2, HQ3, selected by n mod 4.) Suppose n = 6; then 6 mod 4 = 2, so the buffer containing block 6 will be present on HQ2.

30 (Figure: hash queues selected by n mod 4: HQ0 → 28, 4, 64; HQ1 → 17, 5, 97; HQ2 → 98, 50, 10; HQ3 → 3, 35, 99.)

31 The free list is a subset of the nodes that are already present on the different hash queues.
For example:

32 (Figure: the same hash queues HQ0–HQ3 as above, with a free list header linking together some of the buffers.)

33 In the situation above, some of the buffers are free and appear on the free list, while the buffers that are not free exist only on their hash queue and do not appear on the free list.

34 Structure of the buffer pool
The kernel caches data in the buffer pool according to a least recently used policy. The kernel also maintains a free list of buffers, which is a circular doubly linked list. When the kernel wants to allocate a buffer it removes a node from the free list, usually from the beginning of the list, though it can take one from the middle of the list too. When the kernel frees a buffer it adds the freed node at the end of the free list.

35

36 When the kernel wants to access the disk, it searches the buffer pool for a particular device number-block number combination (which is maintained in the buffer header). The entire buffer pool is organized as queues hashed as a function of the device number-block number combination. The figure below shows the buffers on their hash queues.

37 The important thing to note here is that no two buffers in the buffer pool can contain the data of the same disk block, since that could lead to inconsistencies.

38 Example: suppose a process requests block number 9
Since 9 mod 4 = 1, a buffer containing block 9 can exist only on hash queue 1, so only HQ1 has to be searched. The kernel checks using the device number and block number; if no buffer on HQ1 contains block 9, then the block does not exist in the buffer cache. So the search time is reduced to a great extent when the buffers are distributed into a number of hash queues.
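In terms of the hash_lookup sketch above, this example amounts to a single walk over hash queue 1 (9 mod 4 = 1) rather than over every buffer in the pool; block9_in_cache and its dev parameter are assumptions for illustration.

int block9_in_cache(int dev)
{
    return hash_lookup(dev, 9) != NULL;   /* only HQ1 is searched */
}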

39 Different scenarios - Algorithm
When a process puts in a request for a particular block, say block 9, we first check whether it is present on its hash queue of buffers. There are two situations: the block may or may not be present in the buffer cache. If it is present in the buffer cache, there are again two situations: the buffer may be currently locked (i.e. it is being used by some other process), or the buffer is found in the cache and is free, in which case it is acquired immediately.

40 In the other case, the process requests the data, goes to hash queue 1, and finds that block 9 is not present on HQ1 (block 9 does not exist in the buffer cache). Then a node, or buffer, has to be taken from the free list and overwritten with the data of block 9. Various situations can arise. First, the free list may be empty, i.e. there is no buffer that is currently free; the process has to wait until some buffer becomes free to overwrite. Another situation is that the buffer obtained from the free list is marked as delayed write.

41 Scenarios of retrieval of buffer
High-level kernel algorithms in the file subsystem invoke the algorithms of the buffer pool to manage the buffer cache. The algorithms for reading and writing disk blocks use the algorithm getblk to allocate a buffer from the pool.

42

43 getblk (file_system_no, block_no)
{
    while (buffer not found)
    {
        if (block in hash queue)
        {
            if (buffer busy)                     /* scenario 5 */
            {
                sleep (event buffer becomes free);
                continue;                        /* back to while loop */
            }
            mark buffer busy;                    /* scenario 1 */
            remove buffer from free list;
            return buffer;
        }

44      else                                     /* block not on hash queue */
        {
            if (there is no buffer on free list) /* scenario 4 */
            {
                sleep (event any buffer becomes free);
                continue;                        /* back to while loop */
            }
            remove buffer from free list;
            if (buffer marked as delayed write)  /* scenario 3 */
            {
                asynchronous write buffer to disk;
                continue;                        /* back to while loop */
            }
            remove buffer from old hash queue;   /* scenario 2 */
            put buffer onto new hash queue;
            return buffer;
        }
    }
}

45 The five typical scenarios the kernel may follow in getblk to allocate a buffer for a disk block are:
1. The kernel finds the block on its hash queue, and its buffer is free.
2. The kernel cannot find the block on the hash queue, so it allocates a buffer from the free list.
3. The kernel cannot find the block on the hash queue and, in attempting to allocate a buffer from the free list (as in scenario 2), finds a buffer on the free list that has been marked "delayed write". The kernel must write the delayed write buffer to disk and allocate another buffer.
4. The kernel cannot find the block on the hash queue, and the free list of buffers is empty.
5. The kernel finds the block on the hash queue, but its buffer is currently busy.

46 Program for Retrieval of a Buffer using getblk()
Implementation concept: the buffer pool is managed according to LRU. The kernel maintains the free list of buffers as a doubly linked list, takes a buffer from the head of the free list, and, when returning a buffer, attaches it to the tail. A minimal sketch of such a retrieval routine follows.
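The deck's original program is not reproduced in this transcript. Below is a minimal, single-process sketch of a getblk-style retrieval routine built on the illustrative struct buffer, free-list helpers and hash_lookup shown earlier. hash_insert, hash_remove, the choice to return NULL (instead of sleeping) in scenarios 4 and 5, and handling the delayed write synchronously are simplifications assumed only for this sketch.

/* Hash-queue maintenance for the sketch (uses hash_queue[] and hash_index()). */
void hash_remove(struct buffer *b)
{
    if (b->hash_prev != NULL)
        b->hash_prev->hash_next = b->hash_next;
    else
        hash_queue[hash_index(b->blkno)] = b->hash_next;
    if (b->hash_next != NULL)
        b->hash_next->hash_prev = b->hash_prev;
    b->hash_next = b->hash_prev = NULL;
}

void hash_insert(struct buffer *b)
{
    int i = hash_index(b->blkno);
    b->hash_prev = NULL;
    b->hash_next = hash_queue[i];
    if (hash_queue[i] != NULL)
        hash_queue[i]->hash_prev = b;
    hash_queue[i] = b;
}

/* getblk-style retrieval: returns a locked buffer for (dev, blkno), or NULL
 * where the real kernel would sleep and retry (scenarios 4 and 5).  The real
 * algorithm loops; the loop disappears here because the delayed-write case is
 * handled synchronously for simplicity. */
struct buffer *getblk(struct buffer *free_head, int dev, long blkno)
{
    struct buffer *b = hash_lookup(dev, blkno);

    if (b != NULL) {
        if (b->status & B_LOCKED)            /* scenario 5: buffer busy         */
            return NULL;
        b->status |= B_LOCKED;               /* scenario 1: found and free      */
        freelist_remove(b);                  /* free buffers sit on the list    */
        return b;
    }

    b = freelist_take_head(free_head);       /* least recently used free buffer */
    if (b == NULL)
        return NULL;                         /* scenario 4: free list empty     */

    if (b->status & B_DWRITE) {
        /* scenario 3: the real kernel starts an asynchronous write and picks
         * another buffer; this sketch pretends the write completed            */
        b->status &= ~B_DWRITE;
    }

    hash_remove(b);                          /* scenario 2: reassign the buffer */
    b->dev = dev;
    b->blkno = blkno;
    b->status = B_LOCKED;                    /* contents are not yet valid      */
    hash_insert(b);
    return b;
}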

47 When the kernel accesses a disk block
The buffers are kept on separate queues (doubly linked circular lists), hashed as a function of the device and block number. Every disk block in the pool exists on one and only one hash queue, and only once on that queue. To locate a block, the kernel first determines its logical device number and block number.

48 (Figure 3.3: Buffers on the hash queues. Hash queue headers: blkno 0 mod 4 → 28, 4, 64; blkno 1 mod 4 → 17, 5, 97; blkno 2 mod 4 → 98, 50, 10; blkno 3 mod 4 → 3, 35, 99.)

49 The algorithms for reading and writing disk blocks use the algorithm getblk
The kernel finds the block on its hash queue, and either:
- the buffer is free, or
- the buffer is currently busy.
The kernel cannot find the block on the hash queue, and either:
- it allocates a buffer from the free list,
- in attempting to allocate a buffer from the free list, it finds a buffer on the free list that has been marked "delayed write", or
- the free list of buffers is empty.

50 Retrieval of a Buffer: 1st Scenario (a)
The kernel finds the block on the hash queue and its buffer is free. (Figure: searching the hash queues for block 4, which is also on the free list.)

51 Retrieval of a Buffer: 1st Scenario (b)
(Figure: block 4 is removed from the free list.)

52 Before continuing to the other scenarios, let us see what happens after the buffer is allocated. The kernel may read the data and manipulate or change it in the buffer. While doing so, the kernel marks the buffer busy so that no other process can access the block. When the kernel is done with the buffer, it releases it using the brelse algorithm.
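A hedged sketch of the corresponding brelse-style release, again in terms of the helpers above: the buffer keeps its (device, block) identity and stays on its hash queue, but is unlocked and returned to the tail of the free list because it is now the most recently used buffer.

void brelse(struct buffer *free_head, struct buffer *b)
{
    /* the real kernel would first wake up processes sleeping on this buffer
     * (B_WANTED) and on the event "any buffer becomes free" */
    b->status &= ~(B_LOCKED | B_WANTED);
    freelist_insert_tail(free_head, b);
}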

53 Retrieval of a Buffer: 2nd Scenario (a)
The kernel cannot find the block on the hash queue, so it allocates a buffer from the free list. (Figure: searching for block 18: not in the cache.)

54 Retrieval of a Buffer: 2nd Scenario (b)
(Figure: the first buffer on the free list is removed and assigned to block 18.)

55 Retrieval of a Buffer: 3rd Scenario (a)
The kernel cannot find the block on the hash queue, and the buffers at the front of the free list are marked for delayed write. (Figure: searching for block 18; delayed write blocks on the free list.)

56 Retrieval of a Buffer: 3rd Scenario (b)
(Figure 3.8(b): blocks 3 and 5 are written asynchronously, and buffer 4 is reassigned to block 18.)

57 Retrieval of a Buffer: 4th Scenario
The kernel cannot find the block on the hash queue, and the free list is empty. (Figure: searching for block 18 with an empty free list.)

58 Race for free buffer

59 Retrieval of a Buffer: 5th Scenario
The kernel finds the buffer on the hash queue, but it is currently busy. (Figure: searching for block 99; the buffer holding block 99 is busy.)

60 Race for a Locked buffer

61 Output:

62 Algorithms for Reading and Writing Disk Blocks

63
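The figure for this slide is not reproduced here; as a stand-in, the following is a minimal sketch of bread- and bwrite-style routines on top of the getblk and brelse sketches above. disk_read and disk_write are assumed, hypothetical helpers that transfer one block synchronously.

extern void disk_read(int dev, long blkno, char *data);
extern void disk_write(int dev, long blkno, const char *data);

/* Read a block through the cache: go to the disk only on a miss. */
struct buffer *bread(struct buffer *free_head, int dev, long blkno)
{
    struct buffer *b = getblk(free_head, dev, blkno);
    if (b == NULL)
        return NULL;                        /* a real kernel would sleep here */
    if (!(b->status & B_VALID)) {           /* cache miss: fill the data area */
        disk_read(dev, blkno, b->data);
        b->status |= B_VALID;
    }
    return b;                               /* locked buffer with valid data  */
}

/* Write a block: a delayed write only marks the buffer, so the data reaches
 * the disk later, at the latest before the buffer is reassigned. */
void bwrite(struct buffer *free_head, struct buffer *b, int delayed)
{
    if (delayed)
        b->status |= B_DWRITE;
    else
        disk_write(b->dev, b->blkno, b->data);
    b->status |= B_VALID;
    brelse(free_head, b);
}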

64 Advantages of the buffer cache
- Uniform disk access makes system design simpler, because the kernel does not need to know the reason for the I/O.
- By copying data from user buffers to system buffers (and vice versa), the kernel eliminates the need for special alignment of user buffers, making user programs simpler and more portable.
- Use of the buffer cache can reduce the amount of disk traffic, thereby increasing overall system throughput and decreasing response time.
- A single image of each disk block contained in the cache helps ensure file system integrity (prevents data corruption).

65 Disadvantages of the buffer cache
- Since the kernel does not immediately write data to disk for a delayed write, the system is vulnerable to crashes that leave the disk data in an incorrect state.
- Use of the buffer cache requires an extra data copy when reading from and writing to user processes, which can slow down performance when transmitting large amounts of data.

