CS 3204 Operating Systems Godmar Back Lecture 25.

Slides:



Advertisements
Similar presentations
4/8/14CS161 Spring FFS Recovery: Soft Updates Learning Objectives Explain how to enforce write-ordering without synchronous writes. Identify and.
Advertisements

More on File Management
Chapter 4 : File Systems What is a file system?
The google file system Cs 595 Lecture 9.
Free Space and Allocation Issues
File Systems.
File Systems Examples.
COS 318: Operating Systems File Layout and Directories
Chapter 11: File System Implementation
CS 333 Introduction to Operating Systems Class 18 - File System Performance Jonathan Walpole Computer Science Portland State University.
Ceng Operating Systems
Operating Systems File Systems (Select parts of Ch 6)
6/24/2015B.RamamurthyPage 1 File System B. Ramamurthy.
Memory Management 1 CS502 Spring 2006 Memory Management CS-502 Spring 2006.
CS-3013 & CS-502, Summer 2006 Memory Management1 CS-3013 & CS-502 Summer 2006.
Chapter 12: File System Implementation
Crash recovery All-or-nothing atomicity & logging.
Lecture 17 FS APIs and vsfs. File and File Name What is a File? Array of bytes. Ranges of bytes can be read/written. File system consists of many files,
Operating Systems File Systems (Select parts of Ch 11-12)
7/15/2015B.RamamurthyPage 1 File System B. Ramamurthy.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
The Design and Implementation of a Log-Structured File System Presented by Carl Yao.
Transactions and Reliability. File system components Disk management Naming Reliability  What are the reliability issues in file systems? Security.
File Systems (1). Readings r Silbershatz et al: 10.1,10.2,
Rensselaer Polytechnic Institute CSCI-4210 – Operating Systems David Goldschmidt, Ph.D.
File Implementation. File System Abstraction How to Organize Files on Disk Goals: –Maximize sequential performance –Easy random access to file –Easy.
1Fall 2008, Chapter 11 Disk Hardware Arm can move in and out Read / write head can access a ring of data as the disk rotates Disk consists of one or more.
Chapter 11 Indexing & Hashing. 2 n Sophisticated database access methods n Basic concerns: access/insertion/deletion time, space overhead n Indexing 
File System Implementation Chapter 12. File system Organization Application programs Application programs Logical file system Logical file system manages.
CS 149: Operating Systems April 9 Class Meeting Department of Computer Science San Jose State University Spring 2015 Instructor: Ron Mak
CSCI-375 Operating Systems Lecture Note: Many slides and/or pictures in the following are adapted from: slides ©2005 Silberschatz, Galvin, and Gagne Some.
CS 4284 Systems Capstone Godmar Back Disks & File Systems.
1 File Systems: Consistency Issues. 2 File Systems: Consistency Issues File systems maintains many data structures  Free list/bit vector  Directories.
1 Comp 104: Operating Systems Concepts Files and Filestore Allocation.
1 Shared Files Sharing files among team members A shared file appearing simultaneously in different directories Share file by link File system becomes.
CS 153 Design of Operating Systems Spring 2015 Lecture 22: File system optimizations.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 11: File System Implementation.
By Teacher Asma Aleisa Year 1433 H.   Goals of memory management  To provide a convenient abstraction for programming.  To allocate scarce memory.
CS 4284 Systems Capstone Godmar Back Disks & File Systems.
Project 6 Unix File System. Administrative No Design Review – A design document instead 2-3 pages max No collaboration with peers – Piazza is for clarifications.
Linux+ Guide to Linux Certification, Third Edition
Disk & File System Management Disk Allocation Free Space Management Directory Structure Naming Disk Scheduling Protection CSE 331 Operating Systems Design.
File Systems Operating Systems 1 Computer Science Dept Va Tech August 2007 ©2007 Back File Systems & Fault Tolerance Failure Model – Define acceptable.
CS333 Intro to Operating Systems Jonathan Walpole.
UNIX File System (UFS) Chapter Five.
File Systems. 2 What is a file? A repository for data Is long lasting (until explicitly deleted).
I MPLEMENTING FILES. Contiguous Allocation:  The simplest allocation scheme is to store each file as a contiguous run of disk blocks (a 50-KB file would.
CS 3204 Operating Systems Godmar Back Lecture 21.
File Systems 2. 2 File 1 File 2 Disk Blocks File-Allocation Table (FAT)
Lecture 20 FSCK & Journaling. FFS Review A few contributions: hybrid block size groups smart allocation.
Lecture Topics: 11/29 File System Interface –Files and Directories –Access Methods –Protection –Consistency.
Review CS File Systems - Partitions What is a hard disk partition?
Lecture Topics: 12/1 File System Implementation –Space allocation –Free Space –Directory implementation –Caching Disk Scheduling File System/Disk Interaction.
File System Performance CSE451 Andrew Whitaker. Ways to Improve Performance Access the disk less  Caching! Be smarter about accessing the disk  Turn.
W4118 Operating Systems Instructor: Junfeng Yang.
Chapter 39 File and Directory Chien-Chung Shen CIS/UD
Computer Science Lecture 19, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
Jonathan Walpole Computer Science Portland State University
CS 5204 Operating Systems Disks & File Systems Godmar Back.
CS703 - Advanced Operating Systems
CS 5204 Operating Systems Disks & File Systems Godmar Back.
Filesystems.
Journaling File Systems
File System B. Ramamurthy B.Ramamurthy 11/27/2018.
Printed on Monday, December 31, 2018 at 2:03 PM.
File System Implementation
CSE 451 Fall 2003 Section 11/20/2003.
CS703 - Advanced Operating Systems
CS703 - Advanced Operating Systems
Internal Representation of Files
Presentation transcript:

CS 3204 Operating Systems Godmar Back Lecture 25

3/16/2016CS 3204 Spring Announcements Office Hours moved to one of McB 133, 116, or 124 Please see revised grading policy posted on websiterevised grading policy –Will provide standing grade by May 2 –Accept project 4 until May 7, 23:59pm Reading assignment: Ch 10, 11, 12

Storing Inodes Unix v7, BSD 4.3 FFS (BSD 4.4) Cylindergroups have superblock+bitmap+inode list+file space Try to allocate file & inode in same cylinder group to improve access locality I 0 I 1 I 2 I 3 I 4 ….. SuperblockRest of disk for files & directories I 0 I 1 … SB1Files …I 3 I 4 ….. Files …I 8 I 9 ….. Files …SB2SB3 CG i

Positioning Inodes Putting inodes in fixed place makes finding inodes easier –Can refer to them simply by inode number –After crash, there is no ambiguity as to what are inodes vs. what are regular files Disadvantage: limits the number of files per filesystem at creation time –Use “df –ih” on Linux to see how many inodes are used/free

Directories Need to find file descriptor (inode), given a name Approaches: –Single directory (old PCs), Two-level approaches with 1 directory per user Now exclusively hierarchical approaches: –File system forms a tree (or DAG) How to tell regular file from directory? –Set a bit in the inode Data Structures –Linear list of (inode, name) pairs –B-Trees that map name -> inode –Combinations thereof

Using Linear Lists Advantage: (relatively) simple to implement Disadvantages: –Scan makes lookup (& delete!) really slow for large directories –Could cause fragmentation (though not a problem in practice) 23multi-oom15sample.txt offset 0 inode #

Using B-Trees Advantages: –Scalable to large number of files: in growth, in lookup time Disadvantage: –Complex –Overhead for small directories

Absolute Paths How to resolve a path name such as “/usr/bin/ls”? –Split into tokens using “/” separator –Find inode corresponding to root directory (how? Use fixed inode # for root) –(*) Look up “usr” in root directory, find inode –If not last component in path, check that inode is a directory. Go to (*), looking for next comp –If last component in path, check inode is of desired type, return

Name Resolution Must have a way to scan an entire directory without other processes interfering -> need a “lock” function –But don’t need to hold lock on /usr when scanning /usr/bin Directories can only be removed if they’re empty –Requires synchronization also Most OS cache translations in “namei” cache – maps absolute pathnames to inode –Must keep namei cache consistent if files are deleted

Current Directory Relative pathnames are resolved relative to current directory –Provides default context –Every process has one in Unix/Pintos chdir(2) changes current directory –cd tmp; ls; pwd vs (cd tmp; ls); pwd lookup algorithm the same, except starts from current dir –process should keep current directory open –current directory inherited from parent

Hard & Soft Links Provides aliases (different names) for a file Hard links: (Unix: ln) –Two independent directory entries have the same inode number, refer to same file –Inode contains a reference count –Disadvantage: alias only possible with same filesystem Soft links: (Unix: ln –s) –Special type of file (noted in inode); content of file is absolute or relative pathname – stored inside inode instead of direct block list Windows: “junctions” & “shortcuts”

Filesystems Consistency & Logging

Filesystems & Fault Tolerance Failure Model –Define acceptable failures (disk head hits dust particle, scratches disk – you will lose some data) –Define which failure outcomes are unacceptable Define recovery procedure to deal with unacceptable failures: –Recovery moves from an incorrect state A to correct state B –Must understand possible incorrect states A after crash! –A is like “snapshot of the past” –Anticipating all states A is difficult

Methods to Recover from Failure On failure, retry entire computation –Not a good model for persistent filesystems Use atomic changes –Problem: how to construct larger atomic changes from the small atomic units available (i.e., single sector writes) Use reconstruction –Ensure that changes are so ordered that if crash occurs after every step, a recovery program can either undo change or complete it –proactive to avoid unacceptable failures –reactive to fix up state after acceptable failures

Sensible Invariants In a Unix-style filesystem, want that: –File & directory names are unique within parent directory –Free list/map accounts for all free objects all objects on free list are really free –All data blocks belong to exactly one file (only one pointer to them) –Inode’s ref count reflects exact number of directory entries pointing to it –Don’t show old data to applications Q.: How do we deal with possible violations of these invariants after a crash?

Crash Recovery (fsck) After crash, fsck runs and performs the equivalent of mark-and-sweep garbage collection Follow, from root directory, directory entries –Count how many entries point to inode, adjust ref count Recover unreferenced inodes: –Scan inode array and check that all inodes marked as used are referenced by dir entry –Move others to /lost+found Recomputes free list: –Follow direct blocks+single+double+triple indirect blocks, mark all blocks so reached as used – free list/map is the complement In following discussion, keep in mind what fsck could and could not fix!

Example 1: file create On create(“foo”), have to 1.Scan current working dir for entry “foo” (fail if found); else find empty slot in directory for new entry 2.Allocate an inode #in 3.Insert pointer to #in in directory: (#in, “foo”) 4.Write a) inode & b) directory back What happens if crash after 1, 2, 3, or 4a), 4b)? Does order of inode vs directory write back matter? Rule: never write persistent pointer to object that’s not (yet) persistent

Example 2: file unlink To unlink(“foo”), must 1.Find entry “foo” in directory 2.Remove entry “foo” in directory 3.Find inode #in corresponding to it, decrement #ref count 4.If #ref count == 0, free all blocks of file 5.Write back inode & directory Q.: what’s the correct order in which to write back inode & directory? Q.: what can happen if free blocks are reused before inode’s written back? Rule: first persistently nullify pointer to any object before freeing it (object=freed blocks & inode)

Example 3: file rename To rename(“foo”, “bar”), must 1.Find entry (#in, “foo”) in directory 2.Check that “bar” doesn’t already exist 3.Remove entry (#in, “foo”) 4.Add entry (#in, “bar”) This does not work, because?

Example 3a: file rename To rename(“foo”, “bar”), conservatively 1.Find entry (#i, “foo”) in directory 2.Check that “bar” doesn’t already exist 3.Increment ref count of #i 4.Add entry (#i, “bar”) to directory 5.Remove entry (#i, “foo”) from directory 6.Decrement ref count of #i Worst case: have old & new names to refer to file Rule: never nullify pointer before setting a new pointer

Example 4: file growth Suppose file_write() is called. –First, find block at offset Case 1: metadata already exists for block (file is not grown) –Simply write data block Case 2: must allocate block, must update metadata (direct block pointer, or indirect block pointer) –Must write changed metadata (inode or index block) & data Both writeback orders can lead to acceptable failures: –File data first, metadata next – may lose some data on crash –Metadata first, file data next – may see previous user’s deleted data after crash (very expensive to avoid – would require writing all data synchronously)

FFS’s Consistency Berkeley FFS (Fast File System) formalized rules for filesystem consistency FFS acceptable failures: –May lose some data on crash –May see someone else’s previously deleted data Applications must zero data out if they wish to avoid this + fsync –May have to spend time to reconstruct free list –May find unattached inodes  lost+found Unacceptable failures: –After crash, get active access to someone else’s data Either by pointing at reused inode or reused blocks FFS uses 2 synchronous writes on each metadata operation that creates/destroy inodes or directory entries, e.g., creat(), unlink(), mkdir(), rmdir() –Updates proceed at disk speed rather than CPU/memory speed