Download presentation
1
Chapter 6 File Systems 6.1 Files 6.2 Directories
6.3 File system implementation 6.4 Example file systems
2
Files Requirements for long term information storage:
Must store large amounts of data Information stored must survive the termination of the process using it Multiple processes must be able to access the information concurrently Solution: Store information on disk or other external media in units called files. The file system is a part of operating system that manages files.
3
Files File naming - file name.file extention File structure
unstructured sequence of bytes. MS-DOS/UNIX record sequence CP/M B-tree.
4
Typical file extensions.
File Naming Typical file extensions.
5
File Structure Three kinds of files byte sequence record sequence tree
6
Files File types Regular files contain user information either ASCII or binary. Directories are system files for maintaining the structure of the file system. Character special files are used to model serial I/O devices such as terminals, printers, and networks. Block special files are used to model disks. A UNIX executable file stats with a magic number, identifying the file as an executable file.
7
File Types (a) An executable file (b) An archive
8
File Access Sequential access Random access
read all bytes/records from the beginning cannot jump around, could rewind or back up convenient when medium was magnetic tape Random access bytes/records read in any order essential for database systems Two methods are used for specifying where to start reading. read and then move file marker move file marker (seek), then read
9
Possible file attributes
Operating systems associate extra information with each file, called file attributes. Possible file attributes
10
File Operations Create Append Delete Seek Open Get attributes Close
Read Write Append Seek Get attributes Set Attributes Rename
11
An Example Program Using File System Calls
12
An Example Program Using File System Calls
13
Memory-Mapped Files (Ex11.c)
To facilitate access to files, systems provides system calls to map files into the address space of a running process and remove (unmap) the files from the address space. File services as a backing store for the process and when the process finishes all mapped, modified pages are written back to their files. Advantage: eliminate need for I/O. Disadvantage: difficult to know size of output file. In case of all zeroes,  10 0's ?? or 100 0's ?? Mapped file modified by one process is read differently by another. Two processes need to see consistent views of the file. file may be too large to fit.
14
Memory-Mapped Files (a) Segmented process before mapping files into its address space (b) Process after mapping existing file abc into one segment creating new segment for xyz
15
Directories File systems have directories or folders to keep track of files. A single-level directory has one directory (root) containing all the files. A two-level directory has a root directory and user directories. A hierarchical directory has a root directory and arbitrary number of subdirectories. Two different methods are used to specify file names in a directory tree: Absolute path name consists of the path from the root directory to the file. Relative path name consists of the path from the current directory (working directory).
16
Directories - A single level directory system
contains 4 files owned by 3 different people, A, B, and C
17
Two-level Directory Systems
Letters indicate owners of the directories and files
18
Hierarchical Directory Systems
A hierarchical directory system
19
Directories The path name would be written:
Winodws \usr\ast\mailbox UNIX /usr/ast/mailbox MULTICS >usr>ast>mailbox Dot and dot dot are two special entries in the file system. Dot (.) refers to the current directory. Dot dot (..) refers to its parent.
20
Path Names A UNIX directory tree
21
Directory Operations Readdir Create Rename Delete Link Opendir Unlink
Closedir
22
File System Implementation
File system layout: MRB (Master Boot Record) is used to boot the computer. The partition table gives the starting and ending addresses of each partition. Partitions: The first block, boot block, of the active partition is read in by the MRB program when the system is booted. The superblock contains all the key parameters about the file system. Free blocks information i-nodes tells all about the file. Root directory Directories and files
23
File System Implementation
A possible file system layout
24
File System Implementation
Implementing file storage is keeping track of which disk blocks go with which files. Contiguous Allocation - store each file as contiguous block of data. Advantage: Simple to implement Read performance is excellent Disadvantage: Disk fragmentation The maximum file size must be known when file is created. Example: CD-ROMs, DVDs, and write-once optical media Linked List Allocation - keep linked list of disk blocks random access slow amount of data in a block not a power of 2
25
Implementing Files (a) Contiguous allocation of disk space for 7 files
(b) State of the disk after files D and E have been removed
26
Storing a file as a linked list of disk blocks
Implementing Files Storing a file as a linked list of disk blocks
27
File System Implementation
Linked List Allocation using an index - take table pointer word from each block and put them in an index table, FAT (File Allocation Table) in memory. Disadvantage - entire table must be in memory all the time I-node (index-node) lists the attributes and disk addresses of the file's blocks.
28
Linked list allocation using a file allocation table in RAM
Implementing Files Linked list allocation using a file allocation table in RAM
29
Implementing Files An example i-node
30
Implementation directories
When a file is opened, the file system uses the path name to locate the directory entry. The directory provides information needed to find the disk blocks. disk address of the entire file (contiguous blocks) the number of first block (linked list) the number of i-node (i-node) Where to store attributes? In directory or i-node?
31
Implementing Directories
(a) A simple directory – MS-DOS/Windows fixed size entries disk addresses and attributes in directory entry (b) Directory in which each entry just refers to an i-node - UNIX
32
Implementation directories
Handling long file names in a directory: Fixed-length names (Waste space) In-line (When a file is removed, a variable-sized gap is introduced.) Heap (The heap management needs extra effort.) How to search files in each directory? Linearly Hash table Cache the results of searches
33
Implementing Directories
Two ways of handling long file names in directory (a) In-line (b) In a heap
34
Shared files A shared file is used to allow a file to appear in several directories. The connection between a directory and the shared file is called a link. The file system is a Directed Acyclic Graph (DAG). Problem: If directories contain disk address, a copy of the disk address will have to be made in directory B. What if A or B append the file, the new blocks will only appear in one directory.
35
Shared files Solution:
Do not list disk block addresses in directories but in a little data structure. (use i-nodes)  (hard link) Create a new file of type link which contains the path name of the file to which it is linked  symbolic linking
36
File system containing a shared file
Shared Files File system containing a shared file
37
Shared files ln file1 file2
Problem of hard link - when should i-node be removed? Suppose A: rm file2  could set count = 1 and leave i-node intact. when count = 0, delete file and i-node. Problem above does not occur in symbolic link because only the owner directory has a pointer to i-node. The problem is extra overhead in the traversing path. Other problem is having multiple copies of a file may set copied when dumping an files onto a disk (tar). do not descend path involving symbolic links.
38
Shared Files (a) Situation prior to linking
(b) After the link is created (c)After the original owner removes the file
39
Disk space management Strategies for storing an n byte file:
Allocate n consecutive bytes of disk space - segment Allocate a number [n/k] blocks of size k bytes each - paging Problem – if the file grows it will have to be moved on the disk, it is an expensive operation and causes external fragmentation. Solution – All file systems chop files up into fixed-size blocks that need not to be adjacent.
40
Block size When block size increase, disk space utilization decrease (space efficiency decrease and internal fragmentation). When block size decrease, data transfer rate decrease (time efficiency decrease) usual size k = 512bytes, 1k (UNIX), or 2k
41
Disk Space Management Block size Dark line (left hand scale) gives data rate of a disk Dotted line (right hand scale) gives disk space efficiency All files 2KB
42
Block size Example: disk with 131072 bytes per track.
rotation time = 8.33 msec average seek time = 10 msec.  time to read a block of k bytes = /2 + (k/131072) * 8.33 = k/ * 8.33 If k = 1 KB = 1024 bytes = / * 8.33 = = msec Disk space efficiency = % of block used by data. Observation: Assume that all files are 1 kbytes, on average 1/2 of last block is empty.
43
Keeping Track OF Free Blocks
Use linked list of disk blocks: each block holds as many free disk block numbers as will fit. With 1 KB block and 32-bit disk block number  1024 * 8/32 = 256 disk block numbers  255 free blocks (and) 1 next block pointer. Use bit-map: A disk with (n) blocks requires a bit map with (n) bits Free blocks are represented by 1's Allocated blocks represented by 0's 16GB disk has KB and requires 224 bits  2048 blocks Using a linked list = 224/255 = blocks. However, these blocks can be freed up as the disk is filled up. Bit map generally better if it can be kept completely in memory.
44
Disk Space Management (a) Storing the free list on a linked list
(b) A bit map
45
Disk Space Management (a) Almost-full block of pointers to free disk blocks in RAM - three blocks of pointers on disk (b) Result of freeing a 3-block file (c) Alternative strategy (split the full block of pointers) for handling 3 free blocks - shaded entries are pointers to free disk blocks
46
Quotas for keeping track of each user’s disk use
Disk Space Management Quotas for keeping track of each user’s disk use
47
File System Reliability
The loss of a file system can be catastrophic. Methods to safeguard a file system: Bad Block Management Backups Hardware solution - dedicate a sector to a "bad block list“ when disk controller is initiated, the bad block list is read and a spare block is picked to replace each bad block. The mapping is recorded in the bad block list. Software solution - user or file system carefully construct a file containing all the bad blocks
48
File System Reliability
Backups are made to handle: recover from disaster or stupidity. Considerations of backups Entire or part of the file system Incremental dumps: dump only files that have changed Compression Backup an active file system Security
49
File System Reliability
Two strategies can be used for dumping a disk to tape: Physical dump: starts at block 0 to the last one. Advantages: simple and fast Disadvantages: backup everything Logical dump: starts at one or more specified directories and recursively dumps all files and directories found that have changed since some given base date.
50
File System Reliability
File that has not changed A file system to be dumped squares are directories, circles are files shaded items, modified since last dump each directory & file labeled by i-node number
51
File System Reliability
Bit maps used by the logical dumping algorithm (After 4 phases, the dump is complete.) All modified files and all directories are marked. Unmark any directories that have no modified files. Dump all the directories marked for dumping. Dump the marked files.
52
File System Consistency
A utility program, called a file system checker (fsck in UNIX or scandisk in Windows), can be used to test the consistency of a file system. Two types of consistency checks can be made: (a) blocks (b) files (directory)
53
File System Consistency
Block consistency: Build two tables with a counter per block, initially set to 0 The counters in the first table keep track of number of times each block is present in a file. The counters of the second table record the number of times in free list, Then, the program reads all the i-nodes and uses the i-nodes to build a list of all blocks used in the files (incrementing file counter as each block is read). Check free list or bit map to find all blocks not in use (increment free list counter for each block in free list).
54
File System Reliability
File system states (a) consistent (b) missing block – add it to the free list (c) duplicate block in free list – rebuild the free list (d) duplicate data block – copy the block to a free block
55
File System Consistency
For checking directories – keep a list of counters per file starting at the root directory, recursively inspect each directory. For each file, increment the counter for that file’s (i-node) usage count . Compare computed value with link count stored in each i-node. i-node link count > computed value = number of directory entries Even if all files are removed, the i-node link count > 0. So the i-node will not be removed. Solution : set i-node link count = value computed i-node link count < computed value The i-node may be freed even when there is another directory points to it directory will be pointing to unused i-node solution : set i-node link count = computed value Protecting the user - rm * .o
56
File System Performance
A block cache or buffer cache is a collection of blocks that logically belong on the disk, but are kept in memory to improve performance. All of the previous paging replacement algorithms can be used to determine which block should be written when a new block is needed and the cache is full. Modified LRU Scheme: The block is not likely to be needed again soon? No => go to front The block is essential for the file system to be consistency e.g. i- nodes, etc ? Yes => write immediately
57
File System Performance
Periodically, all data block should be written out (e.g. write works all day). UNIX - system call sync forces modified blocks out to the disk immediately. Hard-disk oriented. e.g. update runs in background during sync every 30 seconds MS-DOS - write-through cache => all modified blocks are written immediately. Floppy disk oriented. e.g. write a 1K block one character at a time UNIX collect then together MS-DOS 1 at a time
58
File System Performance
The block cache data structures
59
File System Performance
Reading a block needs one access for the i-node and one for the block. Save i-node access time. (a) I-nodes placed at the start of the disk (b) Disk divided into cylinder groups each with its own blocks and i-nodes
60
Log-Structured File Systems
With CPUs faster, memory larger disk caches can also be larger increasing number of read requests can come from cache thus, most disk accesses will be writes LFS Strategy structures entire disk as a log have all writes initially buffered in memory periodically write these at the end of the disk log when file opened, locate i-node by the i-node map, then find blocks
61
Example File Systems CD-ROM File Systems
The ISO 9660 directory entry
62
The CP/M File System Memory layout of CP/M
63
The CP/M directory entry format
The CP/M File System The CP/M directory entry format
64
The MS-DOS directory entry
The MS-DOS File System The MS-DOS directory entry
65
The MS-DOS File System Maximum partition for different block sizes
The empty boxes represent forbidden combinations
66
The Windows 98 File System
Bytes The extended MOS-DOS directory entry used in Windows 98
67
The Windows 98 File System
Bytes Checksum An entry for (part of) a long file name in Windows 98
68
The Windows 98 File System
An example of how a long name is stored in Windows 98
69
A UNIX V7 directory entry
The UNIX V7 File System A UNIX V7 directory entry
70
The UNIX V7 File System A UNIX i-node
71
The steps in looking up /usr/ast/mbox
The UNIX V7 File System The steps in looking up /usr/ast/mbox
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.