CPS110: I/O and file systems Landon Cox April 8, 2008.

Slides:



Advertisements
Similar presentations
Chapter 12: File System Implementation
Advertisements

Chapter 4 : File Systems What is a file system?
File Systems.
CS-3013 & CS-502, Summer 2006 More on File Systems1 More on Disks and File Systems CS-3013 & CS-502 Operating Systems.
COS 318: Operating Systems File Layout and Directories
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture.
Chapter 11: File System Implementation
File System Implementation
File System Implementation
Crash recovery All-or-nothing atomicity & logging.
G Robert Grimm New York University SGI’s XFS or Cool Pet Tricks with B+ Trees.
CS 333 Introduction to Operating Systems Class 18 - File System Performance Jonathan Walpole Computer Science Portland State University.
Ceng Operating Systems
Chapter 12: File System Implementation
Crash recovery All-or-nothing atomicity & logging.
The Google File System.
Secondary Storage Management Hank Levy. 8/7/20152 Secondary Storage • Secondary Storage is usually: –anything outside of “primary memory” –storage that.
Transactions and Reliability. File system components Disk management Naming Reliability  What are the reliability issues in file systems? Security.
File Systems and Disk Management. File system Interface between applications and the mass storage/devices Provide abstraction for the mass storage and.
CS 346 – Chapter 12 File systems –Structure –Information to maintain –How to access a file –Directory implementation –Disk allocation methods  efficient.
CS 162 Section Lecture 8. What happens when you issue a read() or write() request?
Distributed File Systems
File System Implementation Chapter 12. File system Organization Application programs Application programs Logical file system Logical file system manages.
Distributed File Systems Overview  A file system is an abstract data type – an abstraction of a storage device.  A distributed file system is available.
OSes: 11. FS Impl. 1 Operating Systems v Objectives –discuss file storage and access on secondary storage (a hard disk) Certificate Program in Software.
10/28/20151 Operating Systems Design (CS 423) Elsa L Gunter 2112 SC, UIUC Based on slides by Roy Campbell, Sam.
1 File Systems: Consistency Issues. 2 File Systems: Consistency Issues File systems maintains many data structures  Free list/bit vector  Directories.
1 Shared Files Sharing files among team members A shared file appearing simultaneously in different directories Share file by link File system becomes.
Silberschatz, Galvin and Gagne  Operating System Concepts Chapter 12: File System Implementation File System Structure File System Implementation.
File System Implementation
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 11: File System Implementation.
Module 4.0: File Systems File is a contiguous logical address space.
Computer Science Lecture 19, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
Chapter 11: File System Implementation Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Chapter 11: File System Implementation Chapter.
Disk & File System Management Disk Allocation Free Space Management Directory Structure Naming Disk Scheduling Protection CSE 331 Operating Systems Design.
CE Operating Systems Lecture 17 File systems – interface and implementation.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition File System Implementation.
Chapter 11 – File-System Implementation (Pgs )
CS333 Intro to Operating Systems Jonathan Walpole.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 11: File System Implementation.
File Systems 2. 2 File 1 File 2 Disk Blocks File-Allocation Table (FAT)
Operating Systems 1 K. Salah Module 4.0: File Systems  File is a contiguous logical address space (of related records)  Access Methods  Directory Structure.
11.1 Silberschatz, Galvin and Gagne ©2005 Operating System Principles 11.5 Free-Space Management Bit vector (n blocks) … 012n-1 bit[i] =  1  block[i]
CPS110: I/O and file systems Landon Cox April 15, 2009.
Transactions and Reliability Andy Wang Operating Systems COP 4610 / CGS 5765.
Lecture 20 FSCK & Journaling. FFS Review A few contributions: hybrid block size groups smart allocation.
Review CS File Systems - Partitions What is a hard disk partition?
CPS110: I/O and file systems Landon Cox April 3, 2008.
File Systems May 12, 2000 Instructor: Gary Kimura.
W4118 Operating Systems Instructor: Junfeng Yang.
Computer Science Lecture 19, page 1 CS677: Distributed OS Last Class: Fault tolerance Reliable communication –One-one communication –One-many communication.
File Systems and Disk Management
File-System Management
File System Consistency
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung
Free Transactions with Rio Vista
Jonathan Walpole Computer Science Portland State University
Transactions and Reliability
Chapter 11: File System Implementation
Local file systems Landon Cox March 31, 2017.
Journaling File Systems
Introduction to Operating Systems
Chapter 11: File System Implementation
Overview Continuation from Monday (File system implementation)
Free Transactions with Rio Vista
Printed on Monday, December 31, 2018 at 2:03 PM.
Overview: File system implementation (cont)
CSE 542: Operating Systems
Presentation transcript:

CPS110: I/O and file systems Landon Cox April 8, 2008

Virtual/physical interfaces Hardware OS Applications

Multiple updates and reliability  Reliability is only an issue in file systems  Don’t care about losing address space after crash  Your files shouldn’t disappear after a crash  Files should be permanent  Multi-step updates cause problems  Can crash in the middle

Multi-step updates  Transfer $100 from Melissa’s account to mine 1.Deduct $100 from Melissa’s account 2.Add $100 to my account  Crash between 1 and 2, we lose $100

Multiple updates and reliability  Seems obvious, right?  No modern OS would make this mistake, right?  Video evidence suggests otherwise  Directory with 3 files  Want to move them to external drive  Drive “fails” during move  Don’t want to lose data due to failure  Roll film …

The lesson? 1.Building an OS is hard. 2.OSes are software, not religions (one is not morally superior to another) 3.“Fanboys are stupid, but you are not.” (Anil Dash, dashes.com/anil)

Multi-step updates  Back to business …  Move file from one directory to another 1.Delete from old directory 2.Add to new directory  Crash between 1 and 2, we lose a file “/home/lpcox/names”“/home/chase/names”

Multi-step updates  Create an empty new file 1.Point directory to new file header 2.Create new file header  What happens if we crash between 1 and 2?  Directory will point to uninitialized header  Kernel will crash if you try to access it  How do we fix this?  Re-order the writes

Multi-step updates  Create an empty new file 1.Create new file header 2.Point directory to new file header  What happens if we crash between 1 and 2?  File doesn’t exist  File system won’t point to garbage

Multi-step updates  What if we also have to update a map of free blocks? 1.Create new file header 2.Point directory to new file header 3.Update the free block map  Does this work?  Bad if crash between 2 and 3  Free block map will still think new file header is free

Multi-step updates  What if we also have to update a map of free blocks? 1.Create new file header 2.Update the free block map 3.Point directory to new file header  Does this work?  Better, but still bad if crash between 2 and 3  Leads to a disk block leak  Could scan the disk after a crash to recompute free map  Older versions of Unix and Windows do this  (now we have journaling file systems …)

Careful ordering is limited  Transfer $100 from Melissa’s account to mine 1.Deduct $100 from Melissa’s account 2.Add $100 to my account  Crash between 1 and 2, we lose $100  Could reverse the ordering 1.Add $100 to my account 2.Deduct $100 from Melissa’s account  Crash between 1 and 2, we gain $100  What does this remind you of?

Atomic actions  A lot like pre-emptions in a critical section  Race conditions  Allow threads to see data in inconsistent state  Crashes  Allow OS to see file system in inconsistent state  Want to make actions atomic  All operations are applied or none are

Atomic actions  With threads  Built larger atomic operations  (lock, wait, signal)  Using atomic hardware operations  (test and set, interrupt enable/disable)  Same idea for persistent storage  Transactions

Transactions  Fundamental to databases  (except MySQL, until recently)  Several important properties  “ACID” (atomicity, consistent, isolated, durable)  We only care about atomicity (all or nothing) BEGIN disk write 1 … disk write n END Called “committing” the transaction

Transactions  Basic atomic unit provided by the hardware  Writing to a single disk sector/block  (not actually atomic, but close)  How to make sequences of updates atomic?  Two styles  Shadowing  Logging

Transactions: shadowing  Easy to explain, not widely used 1.Update a copy version 2.Switch the pointer to the new version  Each individual step is atomic  Step 2 commits the transaction  Why doesn’t anyone do this?  Double the storage overhead

Transactions: logging 1.Begin transaction 2.Append info about modifications to a log 3.Append “commit” to log to end x-action 4.Write new data to normal database  Single-sector write commits x-action (3) Invariant: append new data to log before applying to DB Called “write-ahead logging” Begin Write1 … … WriteN Commit Transaction Complete

Transactions: logging 1.Begin transaction 2.Append info about modifications to a log 3.Append “commit” to log to end x-action 4.Write new data to normal database  Single-sector write commits x-action (3) What if we crash here (between 3,4)? On reboot, reapply committed updates in log order. Begin Write1 … … WriteN Commit

Transactions: logging 1.Begin transaction 2.Append info about modifications to a log 3.Append “commit” to log to end x-action 4.Write new data to normal database  Single-sector write commits x-action (3) What if we crash here? On reboot, discard uncommitted updates. Begin Write1 … … WriteN

Transactions  Most file systems  Use transactions to modify meta-data  Why not use them for data?  Related updates are program-specific  Would have to modify programs  OS doesn’t know how to group updates

Virtual/physical interfaces Hardware OS Applications

Naming and directories  How to locate a file?  Need to tell OS which file header  Use a symbolic name or click on an icon  Could also describe contents  Like Google desktop and Mac Spotlight  Naming in databases works this way

Name translation  User-specified file  on-disk location  Lots of possible data structures  (hash table, list of pairs, tree, etc)  Once you have the header, the rest is easy “/home/lpcox/” FS translation data (directory) FS translation data (directory)

Directories  Directory  Contains a mapping for a set of files  Name  file header’s disk block #  Often just table  pairs

Directories  Stored on disk along with actual files  Disk stores lots of kinds of data  End-user data (i.e. data blocks)  Meta-data that describes end-user data  Inodes, directories, indirect blocks  Can often treat files and directories the same  Both can use the same storage structure  E.g. multi-level indexed files  Allows directories to be larger than one block

Directories  Directories can point to each other  Name  file’s disk header block #  Name  directory’s disk header block #  Can users read/write directories like files?  OS has no control over content of data blocks  OS must control content of meta-data blocks  Why?  OS interprets meta-data, it must be well-formatted

Directories  Users still have to modify directories  E.g. create and delete files  How do we control these updates?  Where else have we seen this issue?  Use a narrow interface (e.g. createfile ())  Like using system calls to modify kernel

Directories  Typically a hierarchical structure  Directory A has mapping to files and directories  /lpcox/cps110/names  / is the root directory  /lpcox is a directory within the / directory  /lpcox/cps110 is a directory within the /lpcox directory

How many disk I/Os?  Read first byte of /lpcox/cps110/names ? 1.File header for / (at a fixed spot on disk) 2.Read first data block for / 3.Read file header for /lpcox 4.Read first data block for /lpcox 5.Read file header for /lpcox/cps110 6.Read first data block for /lpcox/cps110 7.Read file header for /lpcox/cps110/names 8.Read first data block for /lpcox/cps110/names

How many disk I/Os?  Caching is only way to make this efficient  If file header block # of /lpcox/cps110 is cached  Can skip steps 1-4  Current working directory  Allows users to specify file names instead of full paths  Allows system to avoid walking from root on each access  Eliminates steps 1-4

Mounting multiple disks  Can easily combine multiple disks  Basic idea:  Each disk has a file system (a / directory)  Entry in one directory can point to root of another FS  This is called a “mount point”  For example, on my machine crocus  /bin is part of one local disk  /tmp is part of another local disk  /afs is part of the distributed file system AFS

Mounting multiple disks  Requires a new mapping type  Name  file’s disk header block #  Name  directory’s disk header block #  Name  device name, file system type  Use drivers to walk multiple file systems  Windows: disks visible under MyComputer/{C,D,E}  Unix: disks visible anywhere in tree

Course administration  Project 3: out today, due in two weeks  Amre is the expert on these  I can help, but he is the one in charge  Discussion sections  Will focus on Project 3  Will be hard to do P3 without

Virtual/physical interfaces Hardware OS Applications

File caching  File systems have lots of data structures  Data blocks  Meta-data blocks  Directories  File headers  Indirect blocks  Bitmap of free blocks  Accessing all of these is really slow

File caching  Caching is the main thing that improves performance  Random  sequential I/O  Accessing disk  accessing memory  Should the file cache be in virtual or physical memory?  Should put in physical memory  Could put in virtual memory, but might get paged out  Worst-case: each file is on disk twice  Could also use memory-mapped files  This is what Windows does

Memory-mapped files  Basic idea: u se VM system to cache files  Map file content into virtual address space  Set the backing store of region to file  Can now access the file using load/store  When memory is paged out  Updates go back to file instead of swap space

File Read in Windows NT Application Kernel mode User mode I/O Manager Disk Driver Cache Manager map cp A->b Cache Manager map cp A->b VM Manager lookup cp file->A VM Manager lookup cp file->A Disk Read(file,b) Page Fault (A) FS Driver Disk I/O IRP(file,b) Read(file,b) NonCachedRead(file)

File caching issues  Normal design considerations for caches  Cache size, block size, replacement, etc.  Write-back or write-through?  Write-through: writes to disk immediately  Slow, loss of power won’t lose updates  Write-back: delay for a period  Fast, loss of power can lose updates  Most systems use a 30 sec write-back delay

Distributed file systems  Distributed file system  Data stored on many different computers  One, unified view of all data  Same interface to access  Local files  Remote files

Distributed file systems  Examples: AFS, NFS, Samba, web(?)  Why use them?  Easier to share data among people  Uniform view of files from different machines  Easier to manage (backup, etc)

Basic implementation  Classic client/server model ClientsServer Going over the network for data makes performance bad. Cache at the client to improve.

Caching  I’m sitting at crocus and want to access  /usr/project/courses/cps110/bin/submit110  Which is stored at linux.cs.duke.edu  What should happen to the file?  Transfer sole copy from server to client?  Make a copy of the file instead (replication)

Caching  What happens if I modify the file?  Other clients’ copy will be out of date  All copies must be kept consistent 1.Invalidate all other copies 2.Update other copies  Exam analogy: all exams must be the same  Could have everyone update their copy (broadcast)  Could make everyone throw out their copy (invalidation)

Caching  When do I give up my copy?  (invalidation)  Only when someone else modifies it

Cached file states Someone else writes the file I write the file Someone else writes the file I write the file Invalid I read the file Shared Exclusive I read the file What else from 110 does this remind you of? Reader-writer locks: allow parallel reads of shared data, one writer