Presentation is loading. Please wait.

Presentation is loading. Please wait.

File System Implementations

Similar presentations


Presentation on theme: "File System Implementations"— Presentation transcript:

1 File System Implementations
CS-4513 Distributed Computing Systems (Slides include materials from Operating System Concepts, 7th ed., by Silbershatz, Galvin, & Gagne, Distributed Systems: Principles & Paradigms, 2nd ed. By Tanenbaum and Van Steen, and Modern Operating Systems, 2nd ed., by Tanenbaum) Assumptions: Graduate level Operating Systems Making Choices about operation systems Why a micro-century? …just about enough time for one concept CS-4513 D-term 2008 File System Implementations

2 Review – the File abstraction
A (potentially) large amount of information or data that lives a (potentially) very long time Often much larger than the memory of the computer Often much longer than any computation Sometimes longer than life of machine itself (Usually) organized as a linear array of bytes or blocks Internal structure is imposed by application (Occasionally) blocks may be variable length (Often) requiring concurrent access by multiple processes Even by processes on different machines! CS-4513 D-term 2008 File System Implementations

3 File System Implementations
File Systems and Disks User view File is a named, persistent collection of data OS & file system view File is collection of disk blocks — i.e., a container File System maps file names and offsets to disk blocks CS-4513 D-term 2008 File System Implementations

4 This topic – Implementation of Files
Create file abstraction using physical disk devices and disk blocks Efficient in time, space, use of disk resources Fast enough for application requirements Must be scalable to a wide variety of file sizes Many small files (< 1 page) Huge files (100’s of gigabytes, terabytes, spanning disks) Everything in between CS-4513 D-term 2008 File System Implementations

5 File System Implementations
Overview File Allocation methods Sequential Linked and FAT Indexed Free blocks & bad blocks Scalability Mounting & Virtual File Systems Directory implementation notes CS-4513 D-term 2008 File System Implementations

6 File System Implementations
Definition — Volume The fundamental unit of a file system Physical volume may be a physical disk storage device physical partition of a single disk (aka minidisk) Logical Volume is A physical volume A combination of other volumes Usually similar in size and characteristics Volume is also used (loosely) to denote the part of the file system implemented on a physical or logical volume CS-4513 D-term 2008 File System Implementations

7 File Allocation Schemes
Contiguous Blocks of file stored in consecutive disk sectors Directory points to first entry Linked Blocks of file scattered across disk, as linked list Indexed Separate index block contains pointers to file blocks Directory points to index block CS-4513 D-term 2008 File System Implementations

8 File Allocation Schemes (continued)
The allocation scheme is an attribute of a file system, not of individual files within a system. All files within a file system follow same allocation model CS-4513 D-term 2008 File System Implementations

9 File Allocation Schemes
Contiguous Blocks of file stored in consecutive disk sectors Directory points to first entry Linked Blocks of file scattered across disk, as linked list Indexed Separate index block contains pointers to file blocks Directory points to index block CS-4513 D-term 2008 File System Implementations

10 Contiguous Allocation
Ideal for large, static files Databases, fixed system structures, OS code Multi-media video and audio CD-ROM, DVD Simple address calculation Directory entry points to first sector File block i  disk sector address Fast multi-block reads and writes Minimize seeks between blocks CS-4513 D-term 2008 File System Implementations

11 Contiguously Allocated Files
CS-4513 D-term 2008 File System Implementations

12 Block-to-sector Calculation
To find disk sector containing block i of file f Starting_block(f) + i Starting block of each file is named in Directory, or File metadata CS-4513 D-term 2008 File System Implementations

13 File Creation (Contiguous File System)
Search for an empty sequence of blocks First-fit Best-fit Prone to fragmentation when … Files come and go Files change size Similar to physical memory allocation in base-limit type of virtual memory CS-4513 D-term 2008 File System Implementations

14 Contiguous Allocation – Extents
Extent: a contiguously allocated subset of a file Directory entry points to (For file with one extent) the extent itself (For file with multiple extents) pointer to an extent block describing multiple extents Advantages Speed, ease of address calculation of contiguous file Avoids (some of) the fragmentation issues Can be adapted to support files across multiple disks CS-4513 D-term 2008 File System Implementations

15 Contiguous Allocation – Extents
Disadvantages Too many extents  degenerates to indexed allocation As in Unix-like systems, but not so well Popular in 1960s & 70s OS/360, other systems for commercial data processing Currently used for large files in NTFS Rarely mentioned in textbooks Silbershatz, § & CS-4513 D-term 2008 File System Implementations

16 Digression: Bad Block Management
Bad blocks on disks are inevitable Part of manufacturing process (less than 1%) Most are detected during formatting Occasionally, blocks become bad during operation Manufacturers typically add extra tracks to disks Physical capacity = (1 + x) * rated_capacity Who handles bad blocks? Disk controller: Bad block list maintained internally Automatically substitutes good blocks Formatter: Re-organize track to avoid bad blocks OS: Bad block list maintained by OS, bad blocks never used CS-4513 D-term 2008 File System Implementations

17 Bad Block Management in Contiguous Allocation File Systems
Bad blocks must be concealed Foul up the block-to-sector calculation Methods Look-aside list of bad sectors Check each sector request against hash table If present, substitute a replacement sector behind the scenes Spare sectors in each track, remapped by formatting Handling Disk controller, invisible to OS Lower levels of OS; concealed from higher layers of file system and from application CS-4513 D-term 2008 File System Implementations

18 File System Implementations
Questions? CS-4513 D-term 2008 File System Implementations

19 File Allocation Schemes
Contiguous Blocks of file stored in consecutive disk sectors Directory points to first entry Linked Blocks of file scattered across disk, as linked list Indexed Separate index block contains pointers to file blocks Directory points to index block CS-4513 D-term 2008 File System Implementations

20 File System Implementations
Linked Allocation Blocks scattered across disk Each block contains pointer to next block Directory points to first and last blocks Sector header: Pointer to next block ID and block number of file 10 16 01 25 CS-4513 D-term 2008 File System Implementations

21 Linked Allocation (Note)
This is Silbershatz figure 11.5 Links in the book are incorrect 10 16 01 25 CS-4513 D-term 2008 File System Implementations

22 File System Implementations
Linked Allocation Advantages No space fragmentation! Easy to create, extend files Ideal for lots of small files Disadvantages Lots of disk arm movement Space taken up by links Sequential access only! Random access simulated by caching links Used in Xerox Alto file system CS-4513 D-term 2008 File System Implementations

23 Bad Block Management – Linked File Systems
In OS:– format all sectors of disk Don’t reserve any spare sectors Allocate bad blocks to a hidden file for the purpose If a block becomes bad, “append” to the hidden file Advantages Very simple No look-aside or sector remapping needed Totally transparent without any hidden mechanism CS-4513 D-term 2008 File System Implementations

24 Linked File System – Limitation
To access the ith block, it is necessary to physically read the first (i1) blocks from disk! Serious problem for large files! CS-4513 D-term 2008 File System Implementations

25 Solution – File Allocation Table (FAT)
Instead of link on each block, put all links in one table — the FAT Each entry corresponds to physical block in disk ith entry contains link for block i Each entry points to next block (or EOF) Directory points to first & last blocks of file CS-4513 D-term 2008 File System Implementations

26 File System Implementations
FAT File Systems Advantages Advantages of Linked File System FAT can be cached in memory Searchable at CPU speeds, pseudo-random access Disadvantages Limited size, not suitable for very large disks FAT cache describes entire disk, not just open files! Not fast enough for large databases Used in MS-DOS, early Windows systems Also USB Flash drives, floppy disks, etc. CS-4513 D-term 2008 File System Implementations

27 Bad Block Management – FAT File Systems
Same as Linked File Systems I.e., format all sectors of disk Don’t reserve any spare sectors Allocate bad blocks to a hidden file for the purpose If a block becomes bad, append to the hidden file Same advantages and disadvantages CS-4513 D-term 2008 File System Implementations

28 File System Implementations
Disk Defragmentation Re-organize blocks in disk so that file is (mostly) contiguous Link or FAT organization preserved Purpose: To reduce disk arm movement during sequential accesses To permit contiguous reads or writes in one disk operation Does not change the linked structure of the file system! CS-4513 D-term 2008 File System Implementations

29 Exam Question (last spring)
You have a humongous database stored in a file on a 4 GB flash drive with a FAT file system. What must the file system do to locate block n of the database? Assume that database has not been defragmented, so that its blocks are likely to be scattered randomly across the flash drive. Given that the file system has found the location of block n, what must it do to find the location of block n+1? block n-1? CS-4513 D-term 2008 File System Implementations

30 Linked and FAT File Systems
Questions? Linked and FAT File Systems CS-4513 D-term 2008 File System Implementations

31 File Allocation Schemes
Contiguous Blocks of file stored in consecutive disk sectors Directory points to first entry Linked Blocks of file scattered across disk, as linked list Indexed Separate index block contains pointers to file blocks Directory points to index block CS-4513 D-term 2008 File System Implementations

32 File System Implementations
Problems and Issues Contiguous file systems not suitable for files that come and go rapidly & frequently Linked file systems not suitable for very large disks or a lot of random access Can we do better? CS-4513 D-term 2008 File System Implementations

33 File System Implementations
Indexed Allocation i-node: Contains file metadata Lists sector address of each block of file Advantages True random access Only i-nodes of open files need to be cached Supports small and large files CS-4513 D-term 2008 File System Implementations

34 File System Implementations
Unix/Linux i-nodes Direct blocks: Pointers to first n sectors Single indirect table: Extra block containing pointers to blocks n+1 .. n+m Double indirect table: Extra block containing single indirect blocks CS-4513 D-term 2008 File System Implementations

35 File System Implementations
Indexed Allocation Access to every block of file is via i-node Bad block management Similar to Linked/FAT systems Disadvantage Not as fast as contiguous allocation for large databases Requires reference to i-node for every access vs. Simple calculation of block to sector address CS-4513 D-term 2008 File System Implementations

36 Indexed Allocation (continued)
Widely used in Unix, Linux, Windows NTFS Robust Has withstood the test of time Many variations CS-4513 D-term 2008 File System Implementations

37 File System Implementations
Questions? CS-4513 D-term 2008 File System Implementations

38 Free Block Management in File Systems
Bitmap Very compact on disk Expensive to search Supports contiguous allocation Free list Linked list of free blocks Each block contains pointer to next free block Only head of list needs to be cached in memory Very fast to search and allocate Contiguous allocation very difficult CS-4513 D-term 2008 File System Implementations

39 Free Block Management Bit Vector
1 2 n-1 0  block[i] free 1  block[i] occupied bit[i] =  Free block number calculation (number of bits per word) * (number of 0-value words) + offset of first 1 bit CS-4513 D-term 2008 File System Implementations

40 Free Block Management Bit Vector (continued)
Bit map Must be kept both in memory and on disk Copy in memory and disk may differ Cannot allow for block[i] to have a situation where bit[i] = 1 in memory and bit[i] = 0 on disk CS-4513 D-term 2008 File System Implementations

41 Free Block Management Bit Vector (continued)
Solution: Set bit[i] = 1 in disk Allocate block[i] Set bit[i] = 1 in memory Similarly for set of contiguous blocks Potential for lost blocks in event of crash! Discussion:– How do we solve this problem? CS-4513 D-term 2008 File System Implementations

42 Free Block Management Linked List
Linked list of free blocks Not necessarily in order! Cache first few free blocks in memory Head of list must be stored both On disk In memory Each block must be written to disk when freed Potential for losing blocks? CS-4513 D-term 2008 File System Implementations

43 Free Block Management – Linked List (continued)
Can also be implemented in FAT Can also be implemented in Indexed File system CS-4513 D-term 2008 File System Implementations

44 File System Implementations
Reading Assignment Silbershatz, Chapter 11 Ignore §11.8 – 11.10 or Tanenbaum MOS (2nd ed.), Chapter 6 §6.3, §6.4 CS-4513 D-term 2008 File System Implementations

45 Scalability of File Systems
Question: How large can a file be? Answer: limited by Number of bits in length field in metadata Size & number of block entries in FAT or i-node Question: How large can file system be? CS-4513 D-term 2008 File System Implementations

46 File System Implementations
MS-DOS & Windows FAT-12 (primarily on floppy disks): byte blocks Only 4086 blocks usable! FAT-16 (early hard drives): 64 K blocks; block sizes up to 32 K bytes 2 GBytes max per partition, 4 partitions per disk FAT-32 (Windows 95) 228 blocks; up to 2 TBytes per disk Max size FAT requires 232 bytes in RAM! CS-4513 D-term 2008 File System Implementations

47 MS-DOS File System (continued)
Maximum partition for different block sizes The empty boxes represent forbidden combinations CS-4513 D-term 2008 File System Implementations

48 File System Implementations
Classical Unix Maximum number of i-nodes = 64K! How many files in a modern PC? I-node structure allows very large files, but … Limited by size of internal fields Limited by # of entries in triple-indirect block CS-4513 D-term 2008 File System Implementations

49 Modern Operating Systems
Need much larger, more flexible file systems Many terabytes per system Multi-terabyte files Suitable for both large and small Cache only open files in RAM CS-4513 D-term 2008 File System Implementations

50 Examples of Modern File Systems
Windows NTFS Silbershatz §22.5 Tanenbaum §11.7 Linux ext2fs Silbershatz §21.7.2 Other file systems … Consult your favorite Linux system documentation CS-4513 D-term 2008 File System Implementations

51 File System Implementations
New Topic CS-4513 D-term 2008 File System Implementations

52 mount –t type device pathname mount –t iso9660 /dev/cdrom /myCD
Mounting mount –t type device pathname Attach device (which contains a file system of type type) to the directory at pathname File system implementation for type gets loaded and connected to the device Anything previously below pathname becomes hidden until the device is un-mounted again The root of the file system on device is now accessed as pathname E.g., mount –t iso9660 /dev/cdrom /myCD CS-4513 D-term 2008 File System Implementations

53 File System Implementations
Mounting (continued) OS automatically mounts devices in mount table at initialization time /etc/fstab in Linux Users or applications may mount devices at run time, explicitly or implicitly — e.g., Insert a floppy disk Plug in a USB flash drive Type may be implicit in device Windows equivalent Map drive CS-4513 D-term 2008 File System Implementations

54 File System Implementations
Virtual File Systems Virtual File Systems (VFS) provide object-oriented way of implementing file systems. VFS allows same system call interface to be used for different types of file systems. The API is to the VFS interface, rather than any specific type of file system. CS-4513 D-term 2008 File System Implementations

55 Schematic View of Virtual File System
CS-4513 D-term 2008 File System Implementations

56 Virtual File System (continued)
Mounting: formal mechanism for attaching a file system to the Virtual File interface CS-4513 D-term 2008 File System Implementations

57 Linux Virtual File System (VFS)
A generic file system interface provided by the kernel Common object framework superblock: a specific, mounted file system i-node object: a specific file in storage d-entry object: a directory entry file object: an open file associated with a process CS-4513 D-term 2008 File System Implementations

58 Linux Virtual File System (continued)
VFS operations super_operations: read_inode, sync_fs, etc. inode_operations: create, link, etc. d_entry_operations: d_compare, d_delete, etc. file_operations: read, write, seek, etc. CS-4513 D-term 2008 File System Implementations

59 Linux Virtual File System (continued)
Individual file system implementations conform to this architecture. May be linked to kernel or loaded as modules Linux kernel 2.6 supports over 50 file systems in official version E.g., minix, ext, ext2, ext3, iso9660, msdos, nfs, smb, … CS-4513 D-term 2008 File System Implementations

60 File System Implementations
Reading references Silbershatz, §11.2.3 Also:– Robert Love, Linux Kernel Development, 2nd edition, Chapter 12 CS-4513 D-term 2008 File System Implementations

61 File System Implementations
Questions? CS-4513 D-term 2008 File System Implementations

62 Implementation of Directories
A list of [name, information] pairs Must be scalable from very few entries to very many Name: User-friendly, variable length Any language Fast access by name Information: File metadata (itself) Pointer to file metadata block (or i-node) on disk Pointer to first & last blocks of file Pointer to extent block(s) CS-4513 D-term 2008 File System Implementations

63 File System Implementations
Very Simple Directory name1 attributes name2 attributes name3 attributes name4 attributes … … Short, fixed length names Attribute & disk addresses contained in directory MS-DOS, etc. CS-4513 D-term 2008 File System Implementations

64 Data structures containing attributes
Simple Directory i-node name1 i-node name2 name3 i-node name4 i-node Data structures containing attributes Short, fixed length names Attributes in separate blocks (e.g., i-nodes) Attribute pointers are disk addresses (or i-node numbers) Older Unix versions, MS-DOS, etc. CS-4513 D-term 2008 File System Implementations

65 More Interesting Directory
Variable length file names Stored in heap at end Modern Unix, Windows Linear or logarithmic search for name Compaction needed after Deletion, Rename attributes … … name1 longer_na me3 very_long_n ame4 name2 … CS-4513 D-term 2008 File System Implementations

66 Very Large Directories
Hash-table implementation Each hash chain like a small directory with variable-length names Must be sorted for listing CS-4513 D-term 2008 File System Implementations

67 File System Implementations
Questions? CS-4513 D-term 2008 File System Implementations


Download ppt "File System Implementations"

Similar presentations


Ads by Google