File Systems CSE 2431: Introduction to Operating Systems Reading: Chap. 11, §§12.1–12.4, §18.7, [OSC]
Contents Files Directories File Operations File System Disk Layout File Allocation
Why Files?
Physical reality: block oriented; physical sector numbers; no protection among users of the system; data might be corrupted if the machine crashes.
File system model: byte oriented; named files; users protected from each other; robust to machine failures.
File System Requirements Users must be able to: Create, modify, and delete files at will. Read, write, and modify file contents with minimal fuss about blocking, buffering, etc. Share each other's files with proper authorization Transfer information between files. Refer to files by symbolic names. Retrieve backup copies of files lost through accident or malicious destruction. See a logical view of their files without concern for how they are stored.
File Types ASCII – plain text (also Unicode/UTF-8) A Unix executable file Header: magic number, sizes, entry point, flags Text (code) Data Relocation bits Symbol table Devices Everything else in the system
So What Makes File Systems Hard? Files grow and shrink in pieces Little a priori knowledge 6 orders of magnitude in file sizes Overcoming disk performance behavior Desire for efficiency Coping with failure
File System Components
Disk management: arrange collections of disk blocks into files.
Naming: the user gives a file name, not a track or sector number, to locate data.
Security: keep information secure.
Reliability/durability: when the system crashes, whatever is in memory is lost; file contents must be durable.
[Figure: layers from top to bottom: User, File naming, File access, Disk management, Disk drivers]
Directories in Unix Stored like regular files Logic Separates file from location in tree Files can appear in multiple places
Directory Contents Each entry is for one file: File name (symbolic name) File type indicates format of a file Location device and location Size Protection Creation, access, and modification date Owner identification
Directory Operations Maps symbolic names into logical file names Search Create file List directory Backup, archival, file migration
Single Level Directory
Problems With Single Level Directory Name clashes when More than one user Large file systems Moving files from one system to another
Two-Level Directory (1)
Two-Level Directory (2) Introduced to remove naming problems between users First level contains list of user directories Second level contains user files System files kept in separate directory or level 1 Sharing accomplished by naming other users’ files
Tree-Structured Directories (1)
Tree-Structured Directories (2) Arbitrary depth of directories Leaf nodes are files Interior nodes are directories Path name lists nodes to traverse for finding file Use absolute paths from root Use relative paths from current working directory pointer
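The traversal described above can be sketched as a walk over a small in-memory tree. This is only an illustration: the tree `ROOT`, its file names, and the function `resolve` are hypothetical, directories are modeled as dicts and files as strings, and ".." is handled purely lexically (which is only safe here because the toy tree has no links).

```python
# Hypothetical directory tree: directories are dicts, files are strings.
ROOT = {
    "home": {
        "alice": {"notes.txt": "alice's notes"},
        "bob": {"todo.txt": "bob's list"},
    },
    "etc": {"passwd": "user database"},
}

def resolve(path, cwd="/"):
    """Return the node named by `path`, walking one component at a time."""
    # An absolute path starts at the root; a relative one starts at cwd.
    full = path if path.startswith("/") else cwd.rstrip("/") + "/" + path
    components = []  # names on the way from the root to the target
    for part in full.split("/"):
        if part in ("", "."):
            continue
        if part == "..":
            if components:
                components.pop()   # go up one level; the root is its own parent
        else:
            components.append(part)
    node = ROOT
    for name in components:
        # Only interior nodes (directories) can be traversed.
        if not isinstance(node, dict) or name not in node:
            raise FileNotFoundError(path)
        node = node[name]
    return node
```

For example, `resolve("/home/alice/notes.txt")` and `resolve("notes.txt", cwd="/home/alice")` name the same leaf, once by absolute path and once relative to the working directory.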
Acyclic Graph Structured Directories (1)
Acyclic Graph Structured Directories (2) Acyclic graphs allow sharing: two users can name the same file. Implemented by hard links, which use the logical name of the file (file system and file), or by symbolic links, which map a pathname into a new pathname. Duplicate paths complicate backup copies. Hard links need reference counts.
Symbolic Links Symbolic links are different from regular links (often called hard links). Created with ln -s. A symbolic link can be thought of as a directory entry that points to the name of another file. It does not change the link count of the file. When the original is deleted, the symbolic link remains but no longer resolves. They exist because: hard links don't work across file systems; hard links only work for regular files, not directories.
[Figure: hard link vs. symbolic link; labels: dirent, symlink, contents of file]
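The difference between the two kinds of link can be observed directly. The sketch below assumes a POSIX system where the process is allowed to create hard and symbolic links in a temporary directory; the function name `link_demo` and the file names are made up for the example.

```python
import os
import tempfile

def link_demo():
    """Create a file, a hard link, and a symbolic link; report what happens."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")
        with open(original, "w") as f:
            f.write("hello")

        os.link(original, hard)      # second directory entry for the same inode
        os.symlink(original, soft)   # separate file whose content is a pathname

        results = {
            "same_inode": os.stat(original).st_ino == os.stat(hard).st_ino,
            "link_count": os.stat(original).st_nlink,  # the symlink is not counted
            "symlink_target": os.readlink(soft),
        }

        os.remove(original)          # drops one name; the inode survives via `hard`
        with open(hard) as f:
            results["hard_still_reads"] = f.read()
        results["symlink_entry_remains"] = os.path.islink(soft)
        results["symlink_resolves"] = os.path.exists(soft)  # follows the link
        return results
```

After the original name is removed, the hard link still reaches the data, while the symbolic link is still present as a directory entry but dangles.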
General Graph Structured Directories (1)
General Graph Structured Directories (2) Cycles More flexible More costly Need garbage collection (circular structures) Must prevent infinite searches
Path Names
Relevant Definitions File descriptor (fd): Integer used to represent a file – easier than using names Metadata: Data about data - bookkeeping data used to eventually access the “real” data Open file table: System-wide list of descriptors in use
Types of Metadata Inode: index node, or a specific set of information kept about each file Two forms – on disk and in memory Directory: names and location information for files and subdirectories Note: stored in files in Unix Superblock: contains information to describe the file system, disk layout Information about free blocks/inodes on disk
Contents of an Inode Disk inode: 128 bytes on classic Unix. File type, size, blocks on disk. Owner, group, permissions (r/w/x). Reference count. Times: last access, last modification, last inode change (ctime; classic Unix does not record a creation time). Inode generation number. Padding and other fields.
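Most of these fields are visible from user space through the stat interface. The following is a minimal sketch using Python's `os.stat`; the function name `inode_summary` and the dictionary keys are chosen for this example only, and the block list and generation number are not exposed this way.

```python
import os
import stat

def inode_summary(path):
    """Return a few of the per-file fields the kernel reports from the inode."""
    st = os.stat(path)
    return {
        "inode_number": st.st_ino,
        "is_regular_file": stat.S_ISREG(st.st_mode),
        "permissions": stat.filemode(st.st_mode),  # type letter plus rwx bits
        "link_count": st.st_nlink,                 # the reference count
        "owner_uid": st.st_uid,
        "group_gid": st.st_gid,
        "size_bytes": st.st_size,
        "last_access": st.st_atime,
        "last_modification": st.st_mtime,
        # On Unix this is the time of the last inode change, not creation.
        "last_status_change": st.st_ctime,
    }
```

Note that the file name is absent: names live in directory entries, not in the inode.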
Data Structures for Typical File System Process control block (with its open file pointer array); open file table (systemwide); in-memory inode; disk inode.
Open-file Table Information File Pointer Current file position pointer File Open Count Counter which tracks the number of file opens and closes. Why? Disk Location Information needed to locate the file on disk (in inode).
Opening A File
fd = open(FileName, access)
Steps: look up the file name and check access rights; copy the file metadata into the in-memory data structure if it is not already there; create an entry in the system-wide open file table if there isn't one; create an entry in the PCB; link up the data structures; return the file descriptor to the user.
[Figure: open() passes through the PCB, the open file table, the metadata, and the file system on disk]
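The bookkeeping behind open and close can be sketched as a small simulation. Everything here is hypothetical: the classes `Process`, `SystemOpenFile`, and `Kernel`, the dict standing in for the disk, and the choice to keep the current position in the per-process slot (systems differ on where it lives). Permission checking is omitted.

```python
class Process:
    def __init__(self):
        self.fds = []   # the "open file pointer array" kept in the PCB

class SystemOpenFile:
    """One system-wide entry per open file: in-memory metadata plus an open count."""
    def __init__(self, name, disk_inode):
        self.name = name
        self.inode = dict(disk_inode)  # in-memory copy of the on-disk metadata
        self.open_count = 0

class Kernel:
    def __init__(self, disk):
        self.disk = disk          # hypothetical "disk": file name -> inode dict
        self.open_files = {}      # system-wide open file table

    def open(self, process, name):
        if name not in self.disk:             # file name lookup
            raise FileNotFoundError(name)
        entry = self.open_files.get(name)
        if entry is None:                     # first open: load the metadata
            entry = SystemOpenFile(name, self.disk[name])
            self.open_files[name] = entry
        entry.open_count += 1
        # Per-process slot: a position plus a link to the system-wide entry.
        slot = {"entry": entry, "position": 0}
        # Hand out the lowest free descriptor number.
        for fd, existing in enumerate(process.fds):
            if existing is None:
                process.fds[fd] = slot
                return fd
        process.fds.append(slot)
        return len(process.fds) - 1

    def close(self, process, fd):
        entry = process.fds[fd]["entry"]
        process.fds[fd] = None
        entry.open_count -= 1
        if entry.open_count == 0:             # last close: drop the in-memory copy
            del self.open_files[entry.name]
```

The open count is what tells the system when the in-memory metadata can be released: only when the last process holding the file closes it.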
Reading And Writing What happens when you… Read 10 bytes from a file? Write 10 bytes into an existing file? Write 4096 bytes into a file? Disk works on blocks (sectors)
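Reading A Block
read(fd, userBuf, size): follow the PCB entry to the open file table and the file's metadata, translate the logical block to a physical block, then issue read(device, phyBlock, size) through the buffer cache and the disk device driver. The physical block is read into sysBuf and the requested bytes are copied to userBuf.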
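Because the disk transfers whole blocks, even a 10-byte read fetches at least one full block, and a partial-block write generally has to read the block before modifying it. The sketch below works out which blocks a byte range touches and copies out just the requested bytes. The 4096-byte block size, the dict standing in for the disk, and the function names are assumptions for illustration.

```python
BLOCK_SIZE = 4096  # assumed block size; real file systems vary

def blocks_touched(offset, size, block_size=BLOCK_SIZE):
    """Return the logical block numbers covered by bytes [offset, offset + size)."""
    if size <= 0:
        return []
    first = offset // block_size
    last = (offset + size - 1) // block_size
    return list(range(first, last + 1))

def read_bytes(disk_blocks, block_map, offset, size, block_size=BLOCK_SIZE):
    """Read `size` bytes by fetching whole blocks and copying out the requested part.

    disk_blocks: physical block number -> bytes of length block_size
    block_map:   list mapping logical block number -> physical block number
    """
    out = bytearray()
    for logical in blocks_touched(offset, size, block_size):
        sys_buf = disk_blocks[block_map[logical]]   # whole-block transfer
        base = logical * block_size
        start = max(offset, base) - base
        end = min(offset + size, base + block_size) - base
        out += sys_buf[start:end]                   # copy into the user buffer
    return bytes(out)
```

With these definitions, `blocks_touched(0, 10)` covers one block, while `blocks_touched(4090, 10)` straddles two even though only 10 bytes are requested.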
Disk Layout A possible file system layout
A Disk Layout for A File System Superblock defines a file system Size of the file system Size of the file descriptor area Free list pointer, or pointer to bitmap Location of the file descriptor of the root directory Other metadata such as permission and various times For reliability, replicate the superblock
Effects of Corruption
Inode: the file gets "damaged".
Directory: files or directories are "lost"; deleted files might become readable.
Free space bitmap: two files may be allocated the same block; some blocks never get used.
Superblock: nothing can be figured out. This is why we replicate the superblock.
How do you check for possible corruption?
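One way to approach the closing question, for the free-space case, is to cross-check what the inodes claim against what the bitmap says. This is a simplified sketch in the spirit of a consistency checker, not a description of any particular tool; the data layout and the name `check_blocks` are hypothetical.

```python
from collections import Counter

def check_blocks(inodes, free_bitmap):
    """Cross-check block ownership against the free-space bitmap.

    inodes:      inode number -> list of block numbers the file claims
    free_bitmap: list of booleans, True meaning "block is free"
    """
    owners = Counter(b for blocks in inodes.values() for b in blocks)
    problems = {"shared": [], "in_use_but_marked_free": [], "lost": []}
    for block, is_free in enumerate(free_bitmap):
        count = owners.get(block, 0)
        if count > 1:
            problems["shared"].append(block)                  # two files claim it
        if count >= 1 and is_free:
            problems["in_use_but_marked_free"].append(block)  # may be handed out again
        if count == 0 and not is_free:
            problems["lost"].append(block)                    # never gets used
    return problems
```

A block marked free while a file still claims it is the precursor to two files sharing a block; a block marked used that no file claims is space that is never reused.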
File Allocation in Disk Space Low-level access methods depend upon disk allocation scheme used to store file data Contiguous allocation Linked list allocation Indexed allocation
Contiguous Allocation (1)
Contiguous Allocation (2)
The size of the file must be requested in advance. Search the bitmap or linked list of free space to locate a large enough run: best fit, first fit, etc.
File header: first sector in file; number of sectors.
Pros: fast sequential access; easy random access; easy to recover in case of a crash.
Cons: external fragmentation; hard to grow files.
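A first-fit search over a free-space bitmap can be sketched as follows. The bitmap representation (a list of booleans) and the function names are assumptions made for the example; `n_blocks` is expected to be at least 1.

```python
def first_fit(bitmap, n_blocks):
    """Return the start of the first run of n_blocks free blocks, or None.

    bitmap: list of booleans, True meaning "block is free".
    """
    run_start, run_len = 0, 0
    for i, is_free in enumerate(bitmap):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == n_blocks:
                return run_start
        else:
            run_len = 0   # the run is broken; start counting again
    return None

def allocate(bitmap, n_blocks):
    """Allocate a contiguous file; the 'file header' is just (first block, length)."""
    start = first_fit(bitmap, n_blocks)
    if start is None:
        return None
    for i in range(start, start + n_blocks):
        bitmap[i] = False
    return (start, n_blocks)
```

External fragmentation shows up directly: with the bitmap `[True, False, True, False, True]` three blocks are free, yet a request for two contiguous blocks fails.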
Linked Allocation
Linked Files
The file header points to the first block on disk; each block points to the next; the last block's pointer is null. Example: FAT (MS-DOS).
Pros: can grow files dynamically; space efficient, little fragmentation.
Cons: random/direct access is horrible; unreliable, since losing a block means losing the rest; some bytes are needed to store pointers.
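The cost of direct access under linked allocation can be made concrete by counting the links followed. The sketch below keeps the next-block pointers in a table, in the style of a FAT; the end marker `END_OF_CHAIN` and the function name are hypothetical, and real FAT variants use their own reserved values.

```python
END_OF_CHAIN = -1  # hypothetical end marker for this sketch

def block_for_index(fat, first_block, index):
    """Follow the chain to find the physical block holding logical block `index`.

    fat: list where fat[b] is the block that follows b in its file.
    Returns (physical_block, links_followed).
    """
    block, hops = first_block, 0
    while hops < index:
        block = fat[block]
        if block == END_OF_CHAIN:
            raise IndexError("logical block past end of file")
        hops += 1
    return block, hops
```

Reaching logical block k takes k pointer lookups, so access time grows linearly with the position in the file, unlike contiguous allocation where the address is computed in one step.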
Indexed Allocation
Indexed Allocation Solves external fragmentation. Supports sequential, direct, and indexed access. Reaching a data block requires at most one extra access, to the index block, which can be cached in main memory. A file can be extended by rewriting a few blocks and the index block. Requires extra space for the index block, some of which may be wasted. Extending to big files raises issues, since a single index block holds a limited number of pointers.
Other Forms of Indexed File Linked: link full index blocks together, using the last entry of each to point to the next.
An Example of Indexed Allocation UNIX inode
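A multi-level index in the style of the UNIX inode can be sketched by working out which level of indirection reaches a given logical block. The parameters are assumptions, not taken from the slides: 12 direct pointers and 1024 pointers per index block (for example, 4 KiB blocks with 4-byte pointers), followed by single, double, and triple indirect pointers. The function names are hypothetical.

```python
def locate_block(logical, n_direct=12, ptrs_per_block=1024):
    """Classify a logical block number by the index level that reaches it.

    Returns (level, indices): level 0 is a direct pointer, level 1 single
    indirect, and so on; indices are the slots to follow at each step.
    """
    if logical < n_direct:
        return 0, (logical,)
    logical -= n_direct
    span = ptrs_per_block          # blocks reachable through this level
    level = 1
    while level <= 3:
        if logical < span:
            indices = []
            for _ in range(level):
                span //= ptrs_per_block
                indices.append(logical // span)   # slot within this index block
                logical %= span
            return level, tuple(indices)
        logical -= span
        span *= ptrs_per_block
        level += 1
    raise ValueError("block number exceeds what triple indirection can address")

def max_file_blocks(n_direct=12, ptrs_per_block=1024):
    """Largest file, in blocks, addressable under the assumed layout."""
    return n_direct + ptrs_per_block + ptrs_per_block**2 + ptrs_per_block**3
```

Small files are reached through direct pointers with no extra disk access, while only very large files pay for two or three levels of index blocks.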
Summary Files Directories File Operations File System Disk Layout File Allocation