File Systems.


Outline
Overview of file systems; file system design; sharing files; the Unix file system; consistency and crash recovery; journaling file systems; log-structured file systems.

What is a File System?
A file system provides an abstraction for storing, organizing, and accessing persistent data, i.e., data that survives after the process that created it has terminated and after the machine crashes or reboots. This data is stored on disks, tapes, solid-state drives (SSDs), and similar devices. File-system data is organized as objects called files. We need a way to find files, so files have names and are organized into directories. Files are accessed via system calls and can be accessed concurrently by different processes.

File Types
The OS typically treats a file as an unstructured sequence of bytes, so programs can impose any format on their files. E.g., application programs may look for a specific file extension to indicate the file's type. However, the OS needs to understand the format of executable files in order to execute programs.

File Metadata
Files have various attributes associated with them: name, owner, creation time, access permissions, size, etc. These attributes are called file metadata. The file system maintains file metadata in per-file data structures on disk.

Basic File-Related Calls
Open: start using a file and set the position to the beginning of the file. Read, Write: read/write n bytes from/to the current position, then update the position. Seek: move to a new position, which allows random access (mainly for disks, not tape). Close: stop using the file. Other calls: Create, Rename, Delete, Get/Set attributes.
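The calls listed above map closely onto the low-level functions in Python's `os` module. The following is a minimal sketch, assuming a POSIX-like system; the scratch file name is hypothetical and exists only for this demonstration.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Create + Open: the position starts at the beginning of the file.
fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

# Write n bytes at the current position; the position advances by n.
written = os.write(fd, b"hello world")

# Seek: move to a new position (random access).
os.lseek(fd, 6, os.SEEK_SET)

# Read up to 5 bytes from the current position.
data = os.read(fd, 5)

# Get attributes (metadata) of the open file.
size = os.fstat(fd).st_size

# Close: stop using the file.
os.close(fd)

# Rename and Delete operate on names, not on open descriptors.
new_path = path + ".renamed"
os.rename(path, new_path)
os.remove(new_path)
```

Note that the read returns the bytes starting at the position set by the seek, not at the start of the file.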

Directories
Directories provide a method for naming and locating a file. A directory stores a list of directory entries that point to files. Modern systems use hierarchical directories: a directory contains files or sub-directories. E.g., B contains entries for D, j and E. Files are accessed with pathnames. An absolute pathname starts at the root, e.g., cat /B/D/n. A relative pathname uses the current directory, e.g., cd /B/D; cat n. Directory metadata is similar to file metadata.
[Figure: example directory tree rooted at /, with nodes A, B, C, i, D, j, E, F, k, l, m, n, G, H, o, p, q, labeled as interior nodes and leaf nodes.]

Basic Directory-Related Calls
Open. Readdir: read entries in a directory; each entry points to a file or sub-directory. Seekdir: simulated at user level. Close. Create, Rename, Delete, Get/Set attributes. There is no Writedir! Link, Unlink: add/remove a name for an existing file (more later).
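A short sketch of the directory calls above, using Python's `os` module on a hypothetical scratch directory. It illustrates that directory contents are read with a readdir-style iterator but changed only indirectly, through create, rename, and delete.

```python
import os
import tempfile

d = tempfile.mkdtemp()                            # hypothetical scratch directory
os.mkdir(os.path.join(d, "sub"))                  # Create a sub-directory
open(os.path.join(d, "file.txt"), "w").close()    # Create a file

# Open + Readdir + Close: scandir yields one entry per file or sub-directory.
with os.scandir(d) as it:
    entries = sorted((e.name, e.is_dir()) for e in it)

# Rename and Delete change directory entries; no call writes raw
# directory contents.
os.rename(os.path.join(d, "file.txt"), os.path.join(d, "renamed.txt"))
os.rmdir(os.path.join(d, "sub"))
remaining = sorted(os.listdir(d))
```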


File System Design
The OS needs to store and retrieve files and directories, so it needs to maintain information about where they are stored. It needs to store files durably, i.e., ensure that files exist after a machine reboot. It also needs to handle machine crashes. On a crash, the OS stops suddenly, perhaps in the middle of a file system operation. On restart, the file system should be able to recover data and bring itself back to a good (consistent) state.

Disk Blocks
Disks are accessed at the granularity of sectors, typically 512 bytes. A file system allocates data in chunks called blocks and treats the disk as an array of blocks. A block contains 2^n contiguous sectors, which reduces the overhead of managing individual bytes. Large blocks improve throughput but increase internal fragmentation.
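To make the fragmentation trade-off concrete, here is a small sketch that computes how many blocks a file needs and how many bytes are wasted in its last block. The function names and the 5000-byte example file are illustrative assumptions, not from the slides.

```python
SECTOR_SIZE = 512  # bytes, the typical sector size mentioned above


def blocks_needed(file_size, block_size):
    """Number of blocks required to hold file_size bytes (rounded up)."""
    return -(-file_size // block_size)


def internal_fragmentation(file_size, block_size):
    """Bytes allocated to the file but not used by its data."""
    return blocks_needed(file_size, block_size) * block_size - file_size


# A hypothetical 5000-byte file with 1-sector blocks vs. 8-sector (4 KB) blocks.
small = internal_fragmentation(5000, 1 * SECTOR_SIZE)
large = internal_fragmentation(5000, 8 * SECTOR_SIZE)
```

With larger blocks the file needs fewer blocks (less bookkeeping, longer contiguous transfers) but wastes more space at the end.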

File System Tasks
A file system performs four main tasks. Free block management: allocates blocks to a file and manages free blocks, using bitmaps, linked lists, or B-trees; the issues are similar to memory and swap management. Block allocation and placement: maps (potentially non-contiguous) blocks to the file; the issues are similar to virtual memory, but placement is unique to disks. Directory management: maps file names to the location of the starting block of the file. Buffer cache management: caches disk blocks in memory to minimize I/O.

Free Block Management - Bitmaps
Keep a bitmap in a separate area on disk, with 1 bit per disk block. Suppose block size = 4 KB and disk size = 160 GB. Then the number of blocks = 40 M, so we need 40 Mbits = 5 MB of disk space, i.e., 1280 bitmap blocks. Advantages: allows allocating contiguous blocks to a file easily; only one bitmap block needs to be in memory at a time. Disadvantage: needs extra space for the bitmap.
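The arithmetic above can be checked directly, and the same idea extends to a toy allocator. This is only a sketch: the class name `BlockBitmap`, its methods, and the convention that a 1 bit means "allocated" are assumptions made for illustration.

```python
BLOCK_SIZE = 4 * 1024          # 4 KB, as on the slide
DISK_SIZE = 160 * 1024**3      # 160 GB, as on the slide

num_blocks = DISK_SIZE // BLOCK_SIZE          # one bit per block
bitmap_bytes = num_blocks // 8
bitmap_blocks = bitmap_bytes // BLOCK_SIZE


class BlockBitmap:
    """Toy free-block bitmap: bit i is 1 when block i is allocated."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)

    def is_used(self, i):
        return bool(self.bits[i // 8] & (1 << (i % 8)))

    def set_used(self, i, used=True):
        if used:
            self.bits[i // 8] |= 1 << (i % 8)
        else:
            self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF

    def alloc_contiguous(self, count):
        """Return the first block of a free run of `count` blocks, or None."""
        run_start, run_len = 0, 0
        for i in range(self.n_blocks):
            if self.is_used(i):
                run_start, run_len = i + 1, 0   # run broken; restart after i
                continue
            run_len += 1
            if run_len == count:
                for b in range(run_start, run_start + count):
                    self.set_used(b)
                return run_start
        return None
```

Finding a contiguous run is a linear scan over bits, which is why bitmaps make contiguous allocation easy.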

Block Allocation and Placement
Maps (potentially non-contiguous) blocks to the file. Options: contiguous allocation, linked list allocation, file allocation table (FAT), and i-node based allocation.

Contiguous Allocation
All blocks in a file are contiguous on the disk.
[Figure: disk layout with contiguously allocated files, shown again after deleting files D and F.]

When to Use Contiguous Allocation
Advantage: performance is good for sequential reading. Disadvantages: file growth requires copying; the disk becomes fragmented after deletions and will need periodic compaction. Contiguous allocation is good for CD-ROMs, where all file sizes are known in advance and files are never deleted.

Linked List Allocation
Each file is a linked list of blocks. The first word in a block contains the number of the next block. Disadvantage: random access is slow.

File Allocation Table (FAT)
Keep the linked list information in memory. FAT uses an index table with one entry per disk block; each entry contains the address of the next block, and a special value marks the end of a file. Advantage: random access needs only an in-memory search (fast). Disadvantage: the entire table is stored in memory, which doesn't scale to large file systems.
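A minimal sketch of how a FAT-style table is walked. The table contents, the `EOF` value, and the function names are hypothetical; real FAT variants use specific reserved values for end-of-file.

```python
EOF = -1  # assumed end-of-file marker for this sketch

# Hypothetical table: fat[b] is the block that follows block b in its file.
fat = {4: 7, 7: 2, 2: 10, 10: EOF,     # file A: 4 -> 7 -> 2 -> 10
       6: 3, 3: EOF}                   # file B: 6 -> 3


def file_blocks(fat, first_block):
    """Follow the chain from first_block; return the file's blocks in order."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, first_block, n):
    """Random access: find the n-th block (0-based) by walking n table entries."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == EOF:
            raise IndexError("offset is past the end of the file")
    return block
```

Random access still follows the chain, but every step is a memory lookup rather than a disk read, which is the advantage over plain linked list allocation.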

Inode Based Allocation
Linked list allocation spreads index information across the disk, slowing random access. FAT keeps the linked-list index information in memory, but that limits the size of the file system. Idea: store the index information for locating a file's blocks close together on disk, and cache this information in memory when the file is opened. This approach avoids the problems above. Problem with the idea: the index information may grow as the file grows, so it cannot be stored contiguously.

Inode Based Allocation
Use a tree to store index information. A tree structure allows the index information to grow without spreading it too much. The root of the tree is called the inode (index node). The inode is stored on disk, and there is one inode per file or directory.

Inode Structure
An inode holds twelve direct block pointers, which point directly to file data blocks (called direct or data blocks). It also holds one indirect pointer, which points to an indirect block that contains pointers to direct blocks; one double indirect pointer, which points to a double indirect block that contains pointers to indirect blocks; and one triple indirect pointer, which points to a triple indirect block that contains pointers to double indirect blocks (not shown in the figure).
Why this allocation strategy? It optimizes for small files: small files can be accessed without additional disk I/O for reading indirect blocks.
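The small-file optimization can be illustrated by computing how many index blocks must be read to reach a given file block. This sketch assumes 1024 pointers per index block (4 KB blocks, 4-byte pointers); the function name is hypothetical.

```python
PTRS_PER_BLOCK = 1024   # assumed: 4 KB blocks, 4-byte block pointers
N_DIRECT = 12           # twelve direct pointers in the inode


def indirection_level(block_index):
    """Return how many index blocks must be read to reach a file block."""
    if block_index < N_DIRECT:
        return 0                      # direct pointer in the inode
    block_index -= N_DIRECT
    if block_index < PTRS_PER_BLOCK:
        return 1                      # via the indirect block
    block_index -= PTRS_PER_BLOCK
    if block_index < PTRS_PER_BLOCK ** 2:
        return 2                      # via the double indirect block
    block_index -= PTRS_PER_BLOCK ** 2
    if block_index < PTRS_PER_BLOCK ** 3:
        return 3                      # via the triple indirect block
    raise ValueError("block index exceeds the maximum file size")
```

Under these assumptions, the first 48 KB of any file (12 blocks of 4 KB) need no extra index reads at all.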

Maximum File Size
Say the block size is 4 KB and the block pointer size is 4 bytes, so there are 1024 block pointers per block. The total number of blocks in the file is 12 direct blocks, plus 1024 blocks via the indirect block pointer, plus 1024 * 1024 blocks via the double indirect block pointer, plus 1024 * 1024 * 1024 blocks via the triple indirect block pointer.
Total file size: $(12 + 1024 + 1024^2 + 1024^3) \times 4\,\mathrm{KB} \approx 2^{10 \cdot 3} \times 4 \times 2^{10}\,\mathrm{B} = 2^{42}\,\mathrm{B} = 4\,\mathrm{TB}$.
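The same calculation, parameterized so that other block and pointer sizes can be tried. The function names are illustrative; the default of twelve direct pointers follows the inode structure described earlier.

```python
def max_file_blocks(block_size, pointer_size, n_direct=12):
    """Blocks reachable through direct, single, double, and triple indirection."""
    ptrs = block_size // pointer_size      # pointers per index block
    return n_direct + ptrs + ptrs**2 + ptrs**3


def max_file_size(block_size, pointer_size, n_direct=12):
    """Maximum file size in bytes."""
    return max_file_blocks(block_size, pointer_size, n_direct) * block_size


size_4k = max_file_size(4096, 4)       # the slide's example
size_1k = max_file_size(1024, 4)       # smaller blocks: far smaller limit
```

The triple indirect term dominates, so the limit grows roughly with the fourth power of the block size when the pointer size is fixed.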

Unix File System Layout
Super block: maintains file-system-wide information such as the partition size and where the different regions (e.g., inode bitmap, block bitmap) start and end. Inode bitmap: maintains the allocation status of inodes. Block bitmap: maintains the allocation status of blocks in the "file and directory blocks" area. Inode blocks: maintain an array of inodes; each file or directory is associated with an inode. File and directory blocks: all data and indirect blocks for files and directories are located here.

Block Placement
Block placement is the policy used by the file system for block allocation. The original Unix file system had two placement problems.
First, data blocks are allocated randomly in "aging" file systems. Blocks for a file are allocated sequentially when the file system is new. As the file system fills, blocks are allocated from deleted files, and deleted files may be randomly placed. So, blocks for new files become scattered across the disk.
Second, inodes are allocated far from data blocks. All inodes are at the beginning of the disk, far from the data, so traversing file name paths and manipulating files and directories requires going back and forth between inodes and data blocks.
Both of these problems generate many long seeks.

BSD Fast File System
BSD Unix redesigned the Unix file system; the new file system is called the Fast File System (FFS). The disk is partitioned into groups of cylinders. Recall that a cylinder is the same track across platters; a cylinder group consists of contiguous cylinders. Placement policy: place the inode and data blocks of a file in the same cylinder group, and place the files of a directory in the same cylinder group. If a cylinder group is full, place the data in a nearby group.
A copy of the super block is placed in every cylinder group for reliability. If the first one is damaged by a sector failure, another one can be used. This is important because without the super block, the entire file system cannot be accessed. The super block copies are placed at different offsets so that a head failure across the same region of the cylinders does not damage all of them simultaneously.

Directory Management
A directory contains zero or more entries, one entry per file or sub-directory that resides in the directory. Directory entries are kept in directory data blocks. An entry maps a file name to the location of the file's starting block: it holds the file name, the file attributes, and the block number of the first block of the file.
[Figure: directory data block with entries such as Kernel.C, Kernel.h and os, each with attributes and a block number pointing to data blocks.]

Unix Directories
In Unix, each entry has (file name, inode number). The inode number helps locate the i-node of the file, and the inode contains the file attributes. Note that the inode is located in the inode blocks area.

Unix Directories
Hard links: more than one name for a file. Different directory entries have the same inode number, e.g., /C/F/r points to the same inode as /B/D/n. The inode maintains a reference count. The directory structure becomes a DAG instead of a tree.
Symbolic links (shortcuts): a file that contains data naming another file (a redirect). E.g., the file contents of /C/F/G/s are /B/D/m.
[Figure: the example directory tree rooted at /, extended with the entries r and s.]

File Names
Short, fixed-length names. MS-DOS/Windows: FILE3.BAK (8+3), so a name has 11 bytes. Original Unix: a name has 14 bytes.

File Names
Variable-length names, e.g., Unix. Each name can be 4096 bytes, so the size of a directory entry is variable. Options: (1) Entries are allocated contiguously; each entry holds the length of the entry followed by the file name. Fragmentation occurs when files are removed. (2) Allocate a set of pointers to file names at the beginning of the directory, and use a heap at the end to store the names.
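The first option (contiguous entries, each starting with its own length) can be sketched with Python's `struct` module. The layout below, a 4-byte inode number plus a 2-byte entry length followed by the name bytes, is an assumption chosen for illustration and does not match any particular on-disk format.

```python
import struct

# Assumed header layout: inode number (uint32), entry length (uint16).
HEADER = struct.Struct("<IH")


def pack_entries(entries):
    """Serialize (name, inode) pairs as contiguous variable-length entries."""
    out = bytearray()
    for name, ino in entries:
        raw = name.encode("utf-8")
        out += HEADER.pack(ino, HEADER.size + len(raw)) + raw
    return bytes(out)


def unpack_entries(block):
    """Walk the block, using each entry's length to find the next entry."""
    entries, offset = [], 0
    while offset < len(block):
        ino, length = HEADER.unpack_from(block, offset)
        name = block[offset + HEADER.size: offset + length].decode("utf-8")
        entries.append((name, ino))
        offset += length
    return entries
```

Because each entry records its own length, a reader can skip from entry to entry without knowing name sizes in advance; removing an entry in the middle leaves a hole, which is the fragmentation the slide mentions.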

File Deletion
The directory entry is removed from the directory, and all blocks in the file are returned to the free list.
Hard links: put a "reference count" field in each inode, which counts the number of entries that point to the file. When removing a file from a directory, decrement the count. When the count goes to zero, reclaim all file blocks.
Symbolic links: when the real file is removed, the symbolic link is "broken", similar to a bad URL.
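These deletion rules can be observed from user space. The sketch below assumes a POSIX-like system whose file system supports hard and symbolic links; the file names n, r, and s are hypothetical and live in a scratch directory.

```python
import os
import tempfile

d = tempfile.mkdtemp()                 # hypothetical scratch directory
original = os.path.join(d, "n")
hard = os.path.join(d, "r")
soft = os.path.join(d, "s")

with open(original, "w") as f:
    f.write("data")

os.link(original, hard)        # second directory entry, same inode
os.symlink(original, soft)     # separate file whose content is a path

same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
links_before = os.stat(hard).st_nlink     # reference count kept in the inode

os.unlink(original)            # remove one name; the count is decremented
links_after = os.stat(hard).st_nlink

with open(hard) as f:
    still_readable = f.read()  # blocks are not reclaimed while count > 0

link_target = os.readlink(soft)            # symlink still names the old path
symlink_broken = not os.path.exists(soft)  # exists() follows the link
```

The data stays reachable through the remaining hard link, while the symbolic link now names a path that no longer exists.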


Path Lookup in Unix FS
Say file F, located in directory /D1/D2, has to be read. What blocks need to be read from disk?
First, the super block, which provides the location of the inode blocks area. Normally this block is read when the file system code performs initialization, and it is then cached in memory. After that, in order: (1) the inode of the / directory (from the inode blocks area); (2) the data blocks of the / directory (provides the directory entry for D1); (3) the inode of the D1 directory; (4) the data blocks of the D1 directory (provides the directory entry for D2); (5) the inode of the D2 directory; (6) the data blocks of the D2 directory (provides the directory entry for F); (7) the inode of file F; (8) the data blocks of file F.
Note that at least two blocks are read for each path component: 2 for /, 2 for D1, 2 for D2, 2 for F.
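The lookup sequence above can be sketched as a small simulation that logs every inode and data block it touches. The inode numbers, block numbers, and dictionary layout are all hypothetical, each inode read is counted as one block read, and the cached super block is left out.

```python
# Hypothetical on-disk state: inode number -> inode, block number -> contents.
inodes = {
    2: {"type": "dir", "blocks": [100]},     # /
    5: {"type": "dir", "blocks": [101]},     # /D1
    9: {"type": "dir", "blocks": [102]},     # /D1/D2
    12: {"type": "file", "blocks": [103]},   # /D1/D2/F
}
blocks = {
    100: {"D1": 5},
    101: {"D2": 9},
    102: {"F": 12},
    103: b"contents of F",
}
ROOT_INODE = 2


def read_file(path):
    """Return (data, reads), where reads logs every inode/data block read."""
    reads = []

    def read_inode(ino):
        reads.append(("inode", ino))
        return inodes[ino]

    def read_block(bno):
        reads.append(("block", bno))
        return blocks[bno]

    inode = read_inode(ROOT_INODE)
    for name in path.strip("/").split("/"):
        if inode["type"] != "dir":
            raise NotADirectoryError(path)
        # Search the directory's data blocks for the next path component.
        for bno in inode["blocks"]:
            entries = read_block(bno)
            if name in entries:
                inode = read_inode(entries[name])
                break
        else:
            raise FileNotFoundError(path)
    data = b"".join(read_block(bno) for bno in inode["blocks"])
    return data, reads
```

For /D1/D2/F the log alternates inode, data block, inode, data block, which is the "two blocks per path component" pattern noted above.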

Buffer Cache Management
Notice that each file access requires many block accesses, and file operations often access the same disk block, e.g., the block containing the contents of the root (/) directory. Caching disk blocks in memory can reduce disk I/O. Traditionally, the block cache is called a buffer cache.
Cache operations. Block lookup: if the block is in memory, return data from the buffer. Block miss: read the disk block into a buffer and update the buffer cache. Block flush: if the buffer is modified, write it back to the disk block.

Buffer Cache Organization
Many blocks can be cached in memory. With a 16 GB machine, say 8 GB is used for the buffer cache; with block size = 4 KB, the number of blocks cached = 2 M. Use a hash table to look up a block in memory efficiently.
[Figure: hash table keyed by (device, block #) pointing to the disk blocks held in memory.]

Buffer Cache Write Policy
When an application writes to a file, the corresponding block is updated in the buffer cache. When is the disk block updated? Immediately (synchronously): a write-through cache, which is correct but very slow. Later (asynchronously): a write-back cache, which is fast, but what if the system crashes? The file system can become inconsistent because some blocks in memory are not on disk. We discuss this problem in detail later.

Buffer Cache Issues
The buffer cache typically has limited size, so we need a replacement algorithm; typically, LRU is used. The buffer cache also competes with the virtual memory system: how many frames should be allocated to the buffer cache vs. virtual memory? Some systems use a unified memory cache for buffer cache blocks and virtual memory pages, so that the blocks of the buffer cache and the pages in the page cache are part of a single caching scheme. However, a program that reads a large file can then crowd out memory used by programs that are not accessing files much.
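A toy buffer cache combining the ideas above: a hash table keyed by (device, block number), LRU replacement, and a write-back policy. This is a sketch only; the class name, the dict standing in for the disk, and the counters are assumptions for illustration.

```python
from collections import OrderedDict


class BufferCache:
    """Toy LRU buffer cache keyed by (device, block number), write-back policy."""

    def __init__(self, disk, capacity):
        self.disk = disk              # hypothetical backing store: key -> bytes
        self.capacity = capacity
        self.buffers = OrderedDict()  # key -> [data, dirty]; oldest entry first
        self.disk_reads = 0
        self.disk_writes = 0

    def _load(self, key):
        if key in self.buffers:                  # block lookup: hit
            self.buffers.move_to_end(key)
        else:                                    # block miss: read from disk
            if len(self.buffers) >= self.capacity:
                self._evict()
            self.buffers[key] = [self.disk[key], False]
            self.disk_reads += 1
        return self.buffers[key]

    def _evict(self):
        key, (data, dirty) = self.buffers.popitem(last=False)  # LRU entry
        if dirty:                                # block flush on eviction
            self.disk[key] = data
            self.disk_writes += 1

    def read(self, dev, bno):
        return self._load((dev, bno))[0]

    def write(self, dev, bno, data):
        buf = self._load((dev, bno))
        buf[0], buf[1] = data, True              # modify in memory only

    def flush(self):
        """Write every dirty buffer back to disk."""
        for key, buf in self.buffers.items():
            if buf[1]:
                self.disk[key] = buf[0]
                buf[1] = False
                self.disk_writes += 1
```

Between a `write` and the next eviction or `flush`, the disk holds stale data; a crash in that window is the inconsistency risk of write-back caching.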

Read Ahead
Applications often read files sequentially, so the file system can predict that a process will request the file block after the one it is currently requesting. The file system prefetches the next block from disk; this is also called read ahead. Note that the next block may not be allocated sequentially on disk. If the process requests the next block, it will already be in the cache. This allows overlapping I/O with execution.


Sharing Files
Files need to be shared across processes. Issues: Concurrent access: what happens when threads read and write a file simultaneously? Protection: how should the OS ensure that only an authorized user can access a file?

Concurrent Access
The OS ensures sequential consistency. A read() call sees the data from the most recent finished write() call (even if the write occurred on another processor). All processors see the same order of writes, i.e., if a file block has value 1, followed by 2, then no processor will read 2 and then 1. Applications still have to ensure that read is called after the write has finished; otherwise, concurrent accesses may see a mix of old and new data.

Protection
Who (subject) can access a file (object)? How can they access it (action)? A protection system dictates whether a given action performed by a given subject on an object is allowed. Actions include read, write, execute, append, change protection, delete, etc. There are two mechanisms for enforcing protection: access control lists (ACLs) and capabilities.

Access Control Lists (ACL)
For each object, maintain a list of subjects and their permitted actions. ACLs are easier to manage: it is easy to grant and revoke access. There is a problem when objects are heavily shared, because ACLs become large; the solution is to use groups.

Capabilities
For each subject, maintain a list of objects and their permitted actions. Capabilities are easier to transfer: like keys, they can be handed off, and they do not depend on the subject. Revoking a capability is challenging, because we need to keep track of all subjects that hold the capability.

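ACLs vs. Capabilities
[Figure: access matrix with objects /one, /two, /three as columns and subjects Alice, Bob, Charlie as rows, with entries such as rw, w, r and -. A row of the matrix is a subject's capability list; a column is an object's ACL.]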
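The two mechanisms are two views of the same access matrix, which a short sketch can make explicit. The matrix entries below are hypothetical and are not the values from the original figure; the function names are illustrative.

```python
# Hypothetical access matrix: (subject, object) -> set of permitted actions.
matrix = {
    ("alice", "/one"): {"r", "w"},
    ("bob", "/one"): {"w"},
    ("bob", "/two"): {"r"},
}


def acl(matrix, obj):
    """Column view: for one object, the subjects and their permitted actions."""
    return {s: acts for (s, o), acts in matrix.items() if o == obj}


def capabilities(matrix, subj):
    """Row view: for one subject, the objects and their permitted actions."""
    return {o: acts for (s, o), acts in matrix.items() if s == subj}


def allowed(matrix, subj, obj, action):
    """Protection check: is this action by this subject on this object allowed?"""
    return action in matrix.get((subj, obj), set())
```

Revoking everyone's access to /one means clearing one ACL, whereas with capabilities it means finding every subject whose list mentions /one, which is the revocation difficulty described earlier.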


The UNIX File-System Data Structures
[Figure: UNIX file-system data structures, divided into those kept in memory and those stored on disk and cached in memory.]

Example: Unix File-Related System Calls
fd = open(name, mode): perform path lookup on name to find the inode of the file; cache the inode for the file in the buffer cache; check permissions; set up an entry in the open file table; set up an entry in the file descriptor table; return fd.
byte_count = read(fd, buffer, buffer_size): figure out the data (and indirect) block(s) to read; read them from disk into the buffer cache; copy the data to the user buffer; update the file position; return the number of bytes read.

Example: Unix File-Related System Calls
byte_count = write(fd, buffer, num_bytes): figure out the data (and indirect) block(s) to write; read them from disk into the buffer cache if a block is being partially updated; copy data from the user buffer into the buffer cache blocks; update the i-node; mark modified buffers (such as inode, free maps, indirect, and data blocks) as dirty; schedule writing the dirty buffers to disk; update the file position; return the number of bytes written.
close(fd): reclaim resources.

Summary
File systems are designed to store data durably and reliably in files. A file is an abstraction for a disk: an application thinks of a file as a contiguous byte array, and the file system maps the file to non-contiguous blocks. A file system performs four main tasks: free block management, block allocation and placement, directory management, and buffer cache management.

Think Time
Q: What is the purpose of directories in a file system? A: They are used for naming files and sub-directories.
Q: What operations update directories? A: Creating a file or directory, renaming a file or directory, deleting a file or directory, and changing the attributes of a directory.
Q: In Unix, the directory hierarchy forms an acyclic graph. Explain how. A: It is an acyclic graph (rather than a tree) because of hard links.
Q: How are cycles not allowed in the graph? A: A hard link can only be created to files, which are leaves in the file system tree. A hard link cannot be created to a directory, and hence cycles cannot form.
Q: Why are they not allowed? A: Recall that directories need to be empty before they can be removed. If cycles were allowed, it would not be possible to remove a directory that was part of a cycle, unless an expensive cycle detection check was performed to find out that all the directories in the cycle were empty and could be removed.
Q: What are the benefits/drawbacks of using inodes in a Unix file system vs. the FAT file system? A: Benefits: inodes allow scaling to much larger file systems and allow hard links. Drawback: FAT is more efficient for smaller file systems.
Q: Describe the difference between hard and symbolic links in Unix. A: A hard link allows multiple directory entries to point to the same inode. A symbolic link is a directory entry that points to a special file that contains the path of another file.

Think Time
Q: What were the problems in the Unix file system that led to the FFS design? A: Files became fragmented and were too spread out on disk, and inodes and file data were far apart.
Q: Describe the operations needed to write the string "xyz" to an existing file "/a/b" in Unix. A: (1) Read the super block (if not in memory). (2) Read the inode of /, then the directory block of /. (3) Read the inode of a, then the directory block of a. (4) Read the inode of b. (5) If the first block of b exists (it may not exist if the file is zero sized), read the block into the buffer cache; else allocate a block on disk (update the block bitmap) and allocate a block in the buffer cache. (6) Update the block in the buffer cache with the string "xyz" and mark it dirty. (7) Update the file position. (8) Update the inode of b with the new timestamp, file size, etc. (9) Schedule writing of the file block, inode, and block bitmap to disk.

Mounting File Systems
We say that a container holds a set of items. A namespace is the set of unique names for all items in a container, so a file system namespace is the set of names for all files in a file system. File system mounting glues a file system namespace into the namespace of another file system, which provides a unified namespace.

Mounting File Systems - Example
Each device/partition stores a single file system. The device or partition can be local or remote. A mount point is an empty directory, say A, in the existing namespace. After mounting, the new file system is available at A.
[Figure: file system FS 2 mounted at directory A under mnt in file system FS 1, shown before and after mounting.]