Outline
  - File Management
  - Structured files
  - Low-level file implementations
Operating System Components 11/29/2018 COP4610
Why Programmers Need Files
  - Persistent storage
  - Shared device
  - Structured information
  - Can be read by any application
    - Accessibility
    - Protocol
  Figure labels: HTML Editor, Web Browser, File Manager, foo.html (<head> … </head> <body> </body>)
File system context
Fig 13-2: The External View of the File Manager
  - Application Program
  - Windows calls: CreateFile(), ReadFile(), WriteFile(), SetFilePointer(), CloseHandle()
  - UNIX calls: mount(), open(), close(), read(), write(), lseek()
  - OS components on each system: File Mgr, Device Mgr, Memory Mgr, Process Mgr
  - Hardware
Levels in a file system
Information Structure
Logical structures in a file
Low-Level Files
File systems
  - File system: a data structure on a disk that holds files
    - more precisely, a file system lives in a disk partition
    - this is a technical term, distinct from “file system” meaning the part of the OS that implements files
  - File systems in different OSs have different internal structures
A file system layout
File system descriptor
  - The data structure that defines the file system
  - Typical fields
    - size of the file system (in blocks)
    - size of the file descriptor area
    - first block in the free block list
    - location of the file descriptor of the root directory of the file system
    - times the file system was created, last modified, and last used
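The typical fields listed above could be gathered into a record like the one below. This is only a sketch: the class name, field names, and sample values are hypothetical, and a real file system stores this structure in its own on-disk format.

```python
from dataclasses import dataclass

@dataclass
class FileSystemDescriptor:
    """Hypothetical in-memory view of a file system descriptor."""
    size_in_blocks: int            # size of the file system, in blocks
    descriptor_area_blocks: int    # size of the file descriptor area
    first_free_block: int          # head of the free block list
    root_descriptor_location: int  # where the root directory's descriptor lives
    created: float                 # timestamps, here as seconds since the epoch
    last_modified: float
    last_used: float

# Sample values are made up for illustration only.
fs = FileSystemDescriptor(
    size_in_blocks=262144,
    descriptor_area_blocks=1024,
    first_free_block=1025,
    root_descriptor_location=1,
    created=0.0,
    last_modified=0.0,
    last_used=0.0,
)
print(fs.size_in_blocks)
```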
File system layout variations
  - MS-DOS uses a FAT (file allocation table) file system
    - so does the Macintosh OS (although the MacOS layout is different)
  - New UNIX file systems use cylinder groups (mini-file systems) to achieve better locality of file data
Locating file data
  - The logical file is divided into logical blocks
  - Each logical block is mapped to a physical disk block
  - The file descriptor contains data on how to perform this mapping
    - there are many methods for performing this mapping
    - we will look at several of them
Dividing a file into blocks
Contiguous Allocation
  - Each file occupies a set of contiguous blocks on the disk
  - Simple: only starting location and length are required
  - Random access
  - Wasteful of space (dynamic storage-allocation problem)
  - Files cannot grow
  - Mapping from logical to physical: divide the logical address LA by the block size, 512
    - Q = quotient of LA / 512, R = remainder of LA / 512
    - Block to be accessed = Q + starting address
    - Displacement into block = R
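The logical-to-physical mapping for contiguous allocation can be sketched as follows. The function name and the sample starting block are assumptions made for illustration; the 512-byte block size comes from the slide.

```python
def contiguous_map(logical_address, start_block, block_size=512):
    """Map a byte offset in the file to (physical block, displacement)."""
    # Q is the logical block number, R the offset inside that block.
    q, r = divmod(logical_address, block_size)
    return start_block + q, r

# Hypothetical file starting at disk block 20.
print(contiguous_map(1300, 20))  # (22, 276)
```

Because only a division and an addition are needed, any byte can be located without reading other blocks first, which is why this scheme supports random access.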
A contiguous file
A contiguous file – cont.
Keeping a file in pieces
  - We need a block pointer for each logical block: an array of block pointers
    - block mapping indexes into this array
  - Each file is a linked list of disk blocks
  - But where do we keep this array?
    - usually it is not kept as a contiguous array
    - the array of disk pointers is like a second related file (that is 1/1024 as big)
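A minimal sketch of block mapping through an array of block pointers follows. The 4096-byte block and 4-byte pointer are assumptions, chosen because they are one combination that gives the 1/1024 ratio mentioned above; the pointer values are made up.

```python
BLOCK_SIZE = 4096    # assumed block size in bytes
POINTER_SIZE = 4     # assumed size of one block pointer in bytes

def locate(block_pointers, offset, block_size=BLOCK_SIZE):
    """Return (physical block, displacement) for a byte offset in the file."""
    logical_block, displacement = divmod(offset, block_size)
    # The logical block number indexes into the pointer array.
    return block_pointers[logical_block], displacement

# Hypothetical file whose three logical blocks are scattered on disk.
pointers = [91, 17, 330]
print(locate(pointers, 5000))        # (17, 904)

# One pointer describes one data block, so under these assumptions the
# pointer array is POINTER_SIZE / BLOCK_SIZE of the file's size.
print(BLOCK_SIZE // POINTER_SIZE)    # 1024
```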
Block pointers in the file descriptor
Block pointers in contiguous disk blocks
Block pointers in the blocks
Block pointers in the blocks – cont.
Block pointers in an index block
Block pointers in an index block – cont.
Chained index blocks
Two-level index blocks
Two-level index blocks – cont.
  Figure labels: primary index, secondary index table, data blocks
The UNIX hybrid method
The UNIX hybrid method – cont.
Inverted disk block index (FAT)
DOS FAT Files
  Figure labels: File Descriptor, Disk Block, File Allocation Table (FAT), block numbers 43, 254, 107
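Following a file through a FAT can be sketched as below. The table contents, the order of the chain, and the end-of-chain marker are all assumptions made for illustration; only the block numbers 43, 254, and 107 come from the figure.

```python
END_OF_FILE = -1   # hypothetical end-of-chain marker

def file_blocks(fat, first_block):
    """Collect a file's disk blocks by following its chain in the FAT."""
    blocks = []
    block = first_block
    while block != END_OF_FILE:
        blocks.append(block)
        block = fat[block]   # the FAT entry for a block names the next block
    return blocks

# Assumed chain: the descriptor names block 43, then 254, then 107.
fat = {43: 254, 254: 107, 107: END_OF_FILE}
print(file_blocks(fat, 43))  # [43, 254, 107]
```

Keeping the links in one table rather than inside the data blocks means the chain can be followed without reading the data blocks themselves.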
Free-Space Management
  - Bit vector (n blocks), bits numbered 0, 1, 2, …, n-1
    - bit[i] = 1: block[i] free
    - bit[i] = 0: block[i] occupied
  - First free block number = (number of bits per word) * (number of 0-value words) + offset of first 1 bit
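The first-free-block formula can be sketched as follows. The 8-bit word size and the choice to count the offset from the most significant bit are assumptions; the slide does not fix either.

```python
BITS_PER_WORD = 8   # assumed word size for the example

def first_free_block(words, bits_per_word=BITS_PER_WORD):
    """Return the number of the first free block, or None if the disk is full."""
    for zero_words, word in enumerate(words):
        if word != 0:
            # Offset of the first 1 bit, counted from the most significant bit.
            offset = bits_per_word - word.bit_length()
            return bits_per_word * zero_words + offset
    return None

# Two all-zero (fully occupied) words, then a word whose first 1 bit is at offset 3.
words = [0b00000000, 0b00000000, 0b00010110]
print(first_free_block(words))  # 19
```

Skipping whole zero words is what makes the scan cheap: most of the work is a word comparison, not a bit-by-bit search.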
Free-Space Management - cont.
  - Bit map requires extra space. Example:
    - block size = 2^12 bytes
    - disk size = 2^30 bytes (1 gigabyte)
    - n = 2^30 / 2^12 = 2^18 bits (or 32K bytes)
  - Easy to get contiguous files
  - Linked list (free list)
    - Cannot get contiguous space easily
    - No waste of space
Free list organization
Free-Space Management - cont.
  - Need to protect:
    - Pointer to free list
    - Bit map
      - Must be kept on disk
      - Copy in memory and disk may differ
      - Cannot allow block[i] to be in a state where bit[i] = 0 in memory and bit[i] = 1 on disk
    - Solution:
      - Set bit[i] = 0 on disk
      - Allocate block[i]
      - Set bit[i] = 0 in memory
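The ordering in the solution above can be sketched as follows. The class is hypothetical and uses two Python lists as stand-ins for the on-disk and in-memory copies of the bit map; a real file manager would issue a disk write for the first step.

```python
class FreeBitmap:
    """Hypothetical bit map with a disk copy and a memory copy (1 = free)."""

    def __init__(self, n_blocks):
        self.on_disk = [1] * n_blocks     # stand-in for the persistent copy
        self.in_memory = [1] * n_blocks   # cached copy used for fast lookups

    def allocate(self, i):
        if self.in_memory[i] == 0:
            raise ValueError("block %d is already occupied" % i)
        self.on_disk[i] = 0      # 1. mark the block occupied on disk first
        # 2. hand block[i] to the file (omitted in this sketch)
        self.in_memory[i] = 0    # 3. only then update the memory copy
        return i

bitmap = FreeBitmap(4)
bitmap.allocate(2)
print(bitmap.on_disk, bitmap.in_memory)  # [1, 1, 0, 1] [1, 1, 0, 1]
```

If a crash happens between steps 1 and 3, the disk copy marks the block occupied even though it may not be in use, which loses a block but never hands the same block out twice.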
Implementing Low Level Files
  - Secondary storage device contains:
    - Volume directory (sometimes a root directory for a file system)
    - External file descriptor for each file
    - The file contents
  - Manages blocks
    - Assigns blocks to files (descriptor keeps track)
    - Keeps track of available blocks
  - Maps to/from byte stream
Disk Organization
  Figure labels: Boot Sector; Volume Directory; Blk 0, Blk 1, …, Blk k-1 (Track 0, Cylinder 0); Blk k, Blk k+1, …, Blk 2k-1 (Track 0, Cylinder 1); further blocks on Track 1, Cylinder 0 through Track N-1, Cylinder 0 and Track N-1, Cylinder M-1
Low-level File System Architecture
  Figure labels: Block 0; blocks b0, b1, b2, b3, …, bn-1; Sequential Device; Randomly Accessed Device
File Descriptors
  - External name
  - Current state
  - Sharable
  - Owner
  - User
  - Locks
  - Protection settings
  - Length
  - Time of creation
  - Time of last modification
  - Time of last access
  - Reference count
  - Storage device details
An open() Operation
  - Locate the on-device (external) file descriptor
  - Extract info needed to read/write file
  - Authenticate that process can access the file
  - Create an internal file descriptor in primary memory
  - Create an entry in a “per process” open file status table
  - Allocate resources, e.g., buffers, to support file usage
File Manager Data Structures
  1. Copy info from external to the open file descriptor
  2. Keep the state of the process-file session
  3. Return a reference to the data structure
  Figure labels: External File Descriptor, Open File Descriptor, Process-File Session
Opening a UNIX File
  fid = open(“fileA”, flags); … read(fid, buffer, len);
  Figure labels: Open File Table (0 stdin, 1 stdout, 2 stderr, 3 ...), File structure, inode, Internal File Descriptor, On-Device File Descriptor
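The open/read sequence in the figure can be tried from Python, whose os module wraps the corresponding system calls. This is a sketch only: the temporary directory and the file contents are assumptions made so the example is self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fileA")
    with open(path, "wb") as f:
        f.write(b"hello, file manager")

    # os.open returns a small integer: an index into the process's open file table.
    fid = os.open(path, os.O_RDONLY)
    try:
        first = os.read(fid, 5)           # read 5 bytes from the current position
        os.lseek(fid, 7, os.SEEK_SET)     # move the file position to byte 7
        after_seek = os.read(fid, 4)
    finally:
        os.close(fid)                     # release the table entry

print(first, after_seek)  # b'hello' b'file'
```

The integer returned by os.open plays the role of fid in the figure: the process never sees the inode or the internal file descriptor, only this handle.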
Reading and Writing the Byte Stream
  - Two stages
    - Reading bytes into or writing bytes out of the memory copy of the block
    - Reading the physical blocks into memory from storage devices, or writing them out of memory to storage devices
  - The unpacking (unmarshalling) procedure converts secondary storage blocks into a byte stream
  - The packing (marshalling) procedure converts a byte stream into blocks
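The two conversions described above, byte stream to blocks and blocks to byte stream, can be sketched as follows. The 8-byte block size, the zero padding of the last block, and the function names are assumptions made for illustration.

```python
BLOCK_SIZE = 8   # assumed, unrealistically small so the output is easy to read

def stream_to_blocks(data, block_size=BLOCK_SIZE):
    """Split a byte stream into fixed-size blocks, zero-padding the last one."""
    blocks = []
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        blocks.append(chunk.ljust(block_size, b"\0"))
    return blocks

def blocks_to_stream(blocks, length):
    """Rebuild the byte stream; the file length says where the padding begins."""
    return b"".join(blocks)[:length]

data = b"persistent storage"          # 18 bytes
blocks = stream_to_blocks(data)
print(len(blocks))                    # 3
print(blocks_to_stream(blocks, len(data)) == data)  # True
```

The length argument is needed because the blocks alone cannot say how many bytes of the last block belong to the file; this is one reason the file descriptor records the file length.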
Marshalling the Byte Stream
  - Must read at least one buffer ahead on input
  - Must write at least one buffer behind on output
  - Seek: flushing the current buffer and finding the correct one to load into memory
  - Inserting/deleting bytes in the interior of the stream
Full Block Buffering
  - Storage devices use block I/O
  - Files place an explicit order on the bytes
  - Therefore, it is possible to predict what is likely to be read after a byte
  - When file is opened, manager reads as many blocks ahead as feasible
  - After a block is logically written, it is queued for writing behind, whenever the disk is available
  - Buffer pool: usually variably sized, depending on virtual memory needs
    - Interaction with the device manager and memory manager
Supporting Other Storage Abstractions
  - Low-level file systems avoid encoding record-level functionality
  - If applications use very large or very small records, a generic file manager may not be efficient
  - Some operating systems provide a higher-layer file system to support applications with large or small files
    - Database management systems and multimedia documents are examples
Structured Files
Record-Oriented Sequential Files
Electronic Mail Example
Indexed Sequential Files
Database Management Systems
  - A database is a very highly structured set of information
    - Stored across different files
    - Optimized to minimize access time
  - DBMS implementation
    - Some DBMSs use the normal files provided by the OS for generic use
    - Some manage their own storage device blocks
Disk compaction
Memory-mapped Files
  - A file’s contents are mapped directly into the virtual address space
  - Files can be read from or written to by referencing the corresponding virtual addresses
  - Memory-mapped files are very useful when a file is shared or accessed repeatedly
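A short sketch of a memory-mapped file using Python's standard mmap module follows. The temporary file and its contents are assumptions made so the example is self-contained.

```python
import mmap
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mapped.bin")
    with open(path, "wb") as f:
        f.write(b"hello world")

    with open(path, "r+b") as f:
        # Length 0 maps the whole file into the process's address space.
        with mmap.mmap(f.fileno(), 0) as view:
            first = view[:5]        # reading is just indexing memory
            view[0:5] = b"HELLO"    # writing is just assigning to memory
            view.flush()            # push the change back to the file

    with open(path, "rb") as f:
        contents = f.read()

print(first, contents)  # b'hello' b'HELLO world'
```

No explicit read or write call touches the data inside the mapped region; the file is changed by ordinary memory references, as the slide describes.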
Memory-mapped Files – cont.
Directories
  - A directory is a set of logically associated files and other directories of files
  - Directories are the mechanism we use to organize files
  - The file manager provides a set of commands to manage directories
    - Traverse a directory
    - Enumerate a list of all files and nested directories
Directory Structures
  - How should files be organized within a directory?
  - Flat name space
    - All files appear in a single directory
  - Hierarchical name space
    - Directory contains files and subdirectories
    - Each file/directory appears as an entry in exactly one other directory: a tree
    - Popular variant: all directories form a tree, but a file can have multiple parents
Directory Structures
Directory Structures – cont.
A directory tree
Directory Implementation
  - Device Directory
    - A device can contain a collection of files
    - Easier to manage if there is a root for every file on the device: the device root directory
  - File Directory
    - Typical implementations have directories implemented as a file with a special format
    - Entries in a file directory are handles for other files (which can be files or subdirectories)
Directory Implementation
  - Linear list of file names with pointers to the data blocks
    - simple to program
    - time-consuming to execute
  - Hash table: linear list with a hash data structure
    - decreases directory search time
    - collisions: situations where two file names hash to the same location
    - fixed size
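A hashed directory with collision handling can be sketched as below. The hash function, the table size, and the use of per-slot lists (chaining) to resolve collisions are all assumptions; the slide names the problem but not a particular remedy.

```python
TABLE_SIZE = 8   # fixed table size, assumed for the example

def slot(name, table_size=TABLE_SIZE):
    """Toy hash: sum of the name's bytes modulo the table size."""
    return sum(name.encode()) % table_size

class HashedDirectory:
    """Hypothetical directory mapping file names to first data blocks."""

    def __init__(self, table_size=TABLE_SIZE):
        self.table = [[] for _ in range(table_size)]

    def add(self, name, first_block):
        self.table[slot(name, len(self.table))].append((name, first_block))

    def lookup(self, name):
        # Only the entries in one slot are searched, not the whole directory.
        for entry_name, first_block in self.table[slot(name, len(self.table))]:
            if entry_name == name:
                return first_block
        return None

directory = HashedDirectory()
directory.add("ab", 10)
directory.add("ba", 20)            # same byte sum as "ab": a collision
print(slot("ab") == slot("ba"))    # True
print(directory.lookup("ba"))      # 20
```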
Mounting file systems
  - Each file system has a root directory
  - We can combine file systems by mounting
    - that is, link a directory in one file system to the root directory of another file system
  - This allows us to build a single tree out of several file systems
  - This can also be done across a network, mounting file systems on other machines
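One way to picture mounting is a table that maps mount points to file systems, consulted during path lookup. The sketch below is an illustration only: the mount table, the file system names, and the longest-prefix rule are assumptions, not a description of how any particular kernel resolves paths.

```python
def resolve(path, mounts):
    """Return (file system, path within that file system) for an absolute path."""
    best = "/"
    for point in mounts:
        prefix = point.rstrip("/") + "/"
        # Choose the longest mount point that contains the path.
        if (path == point or path.startswith(prefix)) and len(point) > len(best):
            best = point
    rest = path[len(best.rstrip("/")):] or "/"
    return mounts[best], rest

# Hypothetical table: a second file system is mounted at /foo.
mounts = {"/": "root_fs", "/foo": "other_fs"}
print(resolve("/foo/abc/xyz", mounts))  # ('other_fs', '/abc/xyz')
print(resolve("/usr/bill", mounts))     # ('root_fs', '/usr/bill')
```

The directory /foo in the first file system becomes the root of the second, so users see a single tree even though two file systems are involved.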
UNIX mount Command
  Figure labels: directories /, bin, usr, etc, foo, bill, nutt, abc, blah, cde, xyz; operation: mount FS at foo
Mounting a file system
VFS-based File Manager
  - File System Independent Part of File Manager
    - Exports OS-specific API
  - Virtual File System Switch
  - File-system-dependent parts: MS-DOS Part of File Manager, ISO 9660 Part of File Manager, ext2 Part of File Manager, …
NFS Architecture
Summary of File Storage Methods
  - Contiguous files
  - Interleaved files
  - File pointers in the file descriptor
  - Contiguous file pointers
  - Chained data blocks
  - Chained single index blocks
  - Double index blocks
  - Triple index blocks
  - Hybrid solutions