1. Shared Files
- Sharing files among team members: a shared file appears simultaneously in different directories
- Sharing a file by a link turns the file system into a DAG
- Convenient, e.g., for building a shortcut to a specific file
(Figure: File 1 appears in Dir A, Dir B, and Dir C via a link)
2. How to Share a File?
- If the directory entry holds the addresses of the file's disk blocks, what about newly appended blocks? The other sharers' copies of the addresses go stale.
- Store the disk-block addresses separately: the UNIX i-node approach; the directory entries in the sharing directories all point to the same i-node
- Symbolic linking: create a link file that contains only the path name of the shared file
(Figures: directory entry containing disk addresses; a shared i-node; a link file holding the path ../Dir C/File1)
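The two sharing mechanisms can be contrasted with a toy model (all names here are illustrative, not a real file system API): directories map names to i-node numbers, a hard link is a second directory entry for the same i-node, and a symbolic link is a separate i-node whose data is a path name.

```python
# Toy i-node table: i-node number -> attributes (sketch, hypothetical layout).
inodes = {1: {"type": "file", "data": b"hello", "count": 1}}
dir_b = {"File1": 1}                 # Dir B's entry for File 1

# Hard link (i-node approach): Dir C gets its own entry for the same i-node.
dir_c = {"File1": 1}
inodes[1]["count"] += 1              # the i-node is now referenced twice

# Symbolic link: a new i-node whose data is just a path name.
inodes[2] = {"type": "symlink", "data": b"../Dir C/File1", "count": 1}
dir_c["Link1"] = 2
```

Note the asymmetry: the hard link shares the i-node itself, while the symbolic link must be resolved by parsing the stored path on every access.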
3. Problems of Shared Files
- Ownership problem (i-node approach): the i-node records owner = C, count = 2; after C removes its directory entry, the count drops to 1, but B cannot remove File 1, since B is not the owner!
- Symbolic links do not have this problem, but they cost:
  - Overhead of parsing the path name
  - An extra i-node for the link file
(Figure: i-node with owner = C; count goes from 2 to 1 when Dir C's entry is removed, while Dir B's entry remains)
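The link-count bookkeeping behind this can be sketched as follows (a minimal model with hypothetical names, not a real system call): removing a directory entry decrements the i-node's count, and the i-node is freed only when the count reaches zero.

```python
def unlink(directory, name, inodes):
    """Remove one directory entry; free the i-node only at count 0 (sketch)."""
    ino = directory.pop(name)
    inodes[ino]["count"] -= 1
    if inodes[ino]["count"] == 0:
        del inodes[ino]

inodes = {7: {"owner": "C", "count": 2}}
dir_b, dir_c = {"File1": 7}, {"File1": 7}

unlink(dir_c, "File1", inodes)   # C removes its entry...
# ...but the i-node survives with count 1, still owned by C and
# still referenced by Dir B -- the ownership problem on the slide.
```

Only when the last entry (here B's) is unlinked does the count hit zero and the i-node get freed.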
4. Block Size
- Larger block size → more internal fragmentation
  - If files average 1 KB and 32-KB blocks are used, ~97% of the space is wasted!
- Smaller block size → slow reading: more time spent on seek and rotational delay
- Finding an optimal size is a trade-off
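The 97% figure follows directly from the arithmetic: a 1-KB file still occupies a whole 32-KB block, so 31/32 of the allocated space is lost. A small helper makes the computation explicit:

```python
def wasted_fraction(file_size, block_size):
    """Fraction of allocated space lost to internal fragmentation
    when a file of file_size bytes occupies whole blocks."""
    blocks = -(-file_size // block_size)        # ceiling division
    allocated = blocks * block_size
    return (allocated - file_size) / allocated

# 1-KB files in 32-KB blocks: 31/32 = 0.96875, i.e. ~97% wasted.
print(round(wasted_fraction(1024, 32 * 1024), 3))   # 0.969
```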
5. Keeping Track of Free Sectors
- Free list: a linked list of disk sectors, each sector holding as many free-sector numbers as will fit
- Store the free list in the free sectors themselves
(Figure: free list holding sector numbers 14, 15, …, 17, …, 47; a 1-KB sector can hold 255 32-bit disk sector numbers plus a pointer to the next sector of the list)
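The packing can be sketched as chunking the free-sector numbers into fixed-size groups, one group per sector of the on-disk list (the capacity here is tiny so the chain is visible; a real 1-KB sector holds hundreds of 32-bit numbers):

```python
PER_SECTOR = 3   # entries per list sector; deliberately tiny for illustration

def build_free_list(free_sector_numbers):
    """Split the free-sector numbers into fixed-size chunks, one chunk
    per sector of the on-disk free list (sketch)."""
    nums = list(free_sector_numbers)
    return [nums[i:i + PER_SECTOR] for i in range(0, len(nums), PER_SECTOR)]

free_list = build_free_list([14, 15, 17, 21, 29, 30, 47])
# three list sectors: [14, 15, 17], [21, 29, 30], [47]
```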
6. Bitmap for Free Space
- Each block is represented by one bit in the bitmap: 0 = available, 1 = used (or vice versa)
- Requires less space than the free list as long as the disk has a substantial amount of free space
- Stored in a fixed place on disk
(Figure: bitmap with one bit per disk block)
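A minimal allocator over such a bitmap might look like this (a sketch: one byte per block for readability, where a real file system packs 8 blocks per byte):

```python
class Bitmap:
    """Free-space bitmap sketch: entry i is 1 when block i is in use."""
    def __init__(self, nblocks):
        self.bits = bytearray(nblocks)   # all zero: every block free

    def alloc(self):
        """Find the first free block, mark it used, return its number."""
        for i, bit in enumerate(self.bits):
            if bit == 0:
                self.bits[i] = 1
                return i
        raise OSError("disk full")

    def free(self, i):
        self.bits[i] = 0                 # block i is available again

bm = Bitmap(8)
a, b = bm.alloc(), bm.alloc()   # blocks 0 and 1
bm.free(a)
c = bm.alloc()                  # block 0 is reused
```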
7. File System Reliability
- Replacing a computer is acceptable; restoring all the information in a file system is difficult, time consuming, or impossible
- Reliability issues for file systems:
  - Backup/restore
  - File system consistency
8. Backups
- Why? Recover from disaster (e.g., fire) and from stupidity (e.g., "oops, I deleted the file by mistake")
- Considerations about backups:
  - Back up only specific directories/files: no temp files, no system binaries
  - Incremental dumps: back up only the changes
  - Compress data before backing up, but beware: a single bad spot can make the entire compressed file/tape unreadable
9. Physical Dump
- Writes all the disk blocks onto the tape in order
- Simple, reliable, fast
- But:
  - No incremental dumps
  - Cannot skip selected directories
  - Cannot avoid free blocks or bad blocks
  - Cannot restore individual files
- Rarely used in practice
10. Logical Dump
- Dumps only the selected files/directories
- Full dump vs. incremental dump
- If a file is dumped, all the directories on the path to that file must also be dumped; otherwise, where would the files go during restore? Before file /d1/d2/f is restored, directories /d1 and /d1/d2 must be restored
- What about the free block list and links?
11. Restore
- Why dump directories first? Directories are restored before files, giving a skeleton of the file system
- Restore algorithm:
  1. Create an empty file system
  2. Restore the most recent full dump: directories first, then files
  3. Apply the first incremental dump, then the next, and so on
  4. Reconstruct the free block list
  5. Make sure linked files are restored only once
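The directories-first ordering can be expressed as a sort over the dumped paths (a sketch with illustrative paths from the previous slide; a real dump tape stores i-node numbers, not path strings): directories come before files, shallower paths before deeper ones, so every parent exists before its children.

```python
# Entries recovered from a (hypothetical) dump, in arbitrary tape order.
dumped = ["/d1/d2/f", "/d1", "/d1/d2"]
dirs = {"/d1", "/d1/d2"}             # which entries are directories

# Sort key: directories first (False < True), then by path depth.
restore_order = sorted(dumped, key=lambda p: (p not in dirs, p.count("/")))
# -> ["/d1", "/d1/d2", "/d1/d2/f"]: /d1 and /d1/d2 exist before f is placed
```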
12. File System Consistency
- Block access pattern: read → modify → write back
- What about a crash before all the modified blocks have been written out?
- Especially critical for i-node blocks, directory blocks, and blocks containing the free list
- Check file system consistency: blocks and files
13. Checking Block Consistency
- Two arrays of counters, one counter per block:
  - Used-array: how many times each block appears in files
  - Free-array: how many times each block appears in the free list
- Read all the i-nodes and update the used-array; examine the free list and update the free-array
14. Classifying Blocks
- Normal block: a 1 in either the used-array or the free-array (used by exactly one file, or unused)
- Missing block: a 0 in both arrays → add it to the free list
- Duplicate in free list: count > 1 in the free-array → rebuild the free list
- Duplicate in files: count > 1 in the used-array → make copies and insert one into each file
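The check on the two slides above can be sketched in a few lines (a toy model: `files` maps file names to block lists, inputs are illustrative; a real checker would walk the i-nodes on disk):

```python
from collections import Counter

def check_blocks(files, free_list, nblocks):
    """Classify every block using the used/free counters (sketch)."""
    used = Counter(b for blocks in files.values() for b in blocks)
    free = Counter(free_list)
    report = {}
    for b in range(nblocks):
        u, f = used[b], free[b]
        if u + f == 0:
            report[b] = "missing: add to free list"
        elif f > 1:
            report[b] = "duplicate in free list: rebuild free list"
        elif u > 1:
            report[b] = "duplicate in files: make copies"
        else:
            report[b] = "ok"
    return report

# Block 2 appears in two files, block 3 twice in the free list,
# block 4 nowhere at all.
r = check_blocks({"f1": [0, 2], "f2": [2]}, [1, 3, 3], 5)
```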
15. Checking Files
- Give each file a counter, initially zero
- For each directory that contains the file, increase its counter by 1; only hard links are counted
- For each file, compare the counter with the link count stored in its i-node:
  - Stored count = real count: consistent
  - Stored count > real count: the i-node is never freed even after all entries are removed; fix by lowering the stored count
  - Stored count < real count: serious — the i-node may be freed while directories still point to it; fix by raising the stored count
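Reusing the toy directory/i-node model from earlier slides, the file check might look like this (a sketch; `inodes` maps i-node number to its stored link count, and all data is illustrative):

```python
from collections import Counter

def check_link_counts(directories, inodes):
    """Compare each i-node's stored link count against the number of
    hard links actually present in the directories (sketch)."""
    real = Counter(ino for d in directories for ino in d.values())
    issues = {}
    for ino, stored in inodes.items():
        if stored > real[ino]:
            issues[ino] = "stored count too high: lower it"
        elif stored < real[ino]:
            issues[ino] = "stored count too low: raise it (serious)"
    return issues

# i-node 7 is linked twice and says 2 (ok); i-node 8 is linked once
# but claims 3 links (it would never be freed).
dirs = [{"File1": 7, "f": 8}, {"File1": 7}]
issues = check_link_counts(dirs, {7: 2, 8: 3})
```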
16. File System Performance
- How to reduce the time of accessing files?
  - Caching
  - Block read ahead
  - Reducing disk arm motion
17. Caching
- Reserve a set of blocks in main memory as a cache for disk sectors
- How the cache works: blocks are looked up via a hash table for fast access
- Cache maintenance is like page replacement: FIFO, LRU, etc.
(Figure: hash table over a block list ordered from front = LRU to rear = MRU)
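The hash-table-plus-LRU-order combination on the slide maps naturally onto an ordered dictionary (a sketch with hypothetical names; `read_from_disk` stands in for the real block I/O):

```python
from collections import OrderedDict

class BlockCache:
    """LRU block cache sketch: hash lookup in O(1) plus recency order,
    front = least recently used, rear = most recently used."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()               # block number -> data

    def get(self, block_no, read_from_disk):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)     # hit: now MRU (rear)
        else:
            if len(self.blocks) >= self.capacity:
                self.blocks.popitem(last=False)   # miss: evict LRU (front)
            self.blocks[block_no] = read_from_disk(block_no)
        return self.blocks[block_no]
```

For example, with capacity 2, accessing blocks 1, 2, 1, 3 reads only 1, 2, 3 from disk and evicts block 2, the least recently used.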
18. Write Important Blocks Back First
- Write critical blocks back to disk immediately after they are updated: this greatly reduces the probability of inconsistency
- Write-through cache: modified blocks are written back immediately
- Don't keep data blocks in memory for too long: force synchronization periodically (e.g., every 30 seconds)
19. Block Read Ahead
- If a file is read sequentially, fetch block k+1 while block k is in use by a process
- If a file is accessed randomly, read ahead wastes bandwidth
- Detect the access pattern of each open file and switch read ahead on or off according to the current pattern
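One simple per-open-file detector (a sketch; real systems track more state) considers the pattern sequential only when each access hits the block right after the previous one, and prefetches only in that case:

```python
class ReadAheadFile:
    """Per-open-file sketch: prefetch only while access looks sequential."""
    def __init__(self):
        self.last_block = None
        self.sequential = True          # assume sequential until proven otherwise

    def access(self, block_no):
        """Record an access; return the block to prefetch, or None."""
        if self.last_block is not None:
            self.sequential = (block_no == self.last_block + 1)
        self.last_block = block_no
        return block_no + 1 if self.sequential else None
```

Reading blocks 0, 1 prefetches 1, 2; a jump to block 7 suppresses the prefetch; resuming at block 8 turns it back on.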
20. Reducing Disk Arm Motion
- Put blocks that are likely to be accessed in sequence close to each other: de-fragmentation
- Allocate i-nodes properly, since they are heavily accessed:
  - Near the start of the disk: not good, because the average distance between an i-node and its data blocks is large
  - In the middle of the disk: reduces the average seek
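A back-of-envelope computation shows why the middle placement wins (a sketch assuming data blocks spread uniformly over cylinders 0..N-1 and seek cost proportional to cylinder distance):

```python
N = 1000   # cylinders, data blocks assumed uniformly distributed

# Average arm travel between the i-node area and a random data block.
start = sum(abs(c - 0) for c in range(N)) / N        # i-nodes at cylinder 0
middle = sum(abs(c - N // 2) for c in range(N)) / N  # i-nodes mid-disk

# start is about N/2, middle about N/4: putting i-nodes in the middle
# roughly halves the average seek distance.
```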
21. Summary
- Files: naming, attributes, operations
- Directories: hierarchical system, path names
- Implementation:
  - For files: contiguous, linked list, i-node
  - For directories: attributes and file names
- Free space management: linked list and bitmap
- Reliability: backup and consistency checking