CPS110: I/O and file systems Landon Cox April 3, 2008.

Slides:



Advertisements
Similar presentations
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Disks and RAID.
Advertisements

File Systems.
Allocation Methods - Contiguous
CMPT 300: Final Review Chapters 8 – Memory Management: Ch. 8, 9 Address spaces Logical (virtual): generated by the CPU Physical: seen by the memory.
1 File Systems Chapter Files 6.2 Directories 6.3 File system implementation 6.4 Example file systems.
Lecture 17 I/O Optimization. Disk Organization Tracks: concentric rings around disk surface Sectors: arc of track, minimum unit of transfer Cylinder:
File System Implementation CSCI 444/544 Operating Systems Fall 2008.
Disk Drivers May 10, 2000 Instructor: Gary Kimura.
CS 104 Introduction to Computer Science and Graphics Problems Operating Systems (4) File Management & Input/Out Systems 10/14/2008 Yang Song (Prepared.
CMPT 300: Final Review Chapters 8 – Memory Management: Ch. 8, 9 Address spaces Logical (virtual): generated by the CPU Physical: seen by the memory.
1 Outline File Systems Implementation How disks work How to organize data (files) on disks Data structures Placement of files on disk.
Disks.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Operating Systems CMPSCI 377 Lecture.
12: IO Systems1 I/O SYSTEMS This Chapter is About Processor I/O Interfaces: Connections exist between processors, memory, and IO devices. The OS must manage.
Secondary Storage Management Hank Levy. 8/7/20152 Secondary Storage • Secondary Storage is usually: –anything outside of “primary memory” –storage that.
File Systems. Main Points File layout Directory layout.
Disk and I/O Management
1 File System Implementation Operating Systems Hebrew University Spring 2010.
File Systems and Disk Management. File system Interface between applications and the mass storage/devices Provide abstraction for the mass storage and.
File Implementation. File System Abstraction How to Organize Files on Disk Goals: –Maximize sequential performance –Easy random access to file –Easy.
Disk Access. DISK STRUCTURE Sector: Smallest unit of data transfer from/to disk; 512B 2/4/8 adjacent sectors transferred together: Blocks Read/write heads.
Operating Systems CMPSC 473 I/O Management (4) December 09, Lecture 25 Instructor: Bhuvan Urgaonkar.
1Fall 2008, Chapter 11 Disk Hardware Arm can move in and out Read / write head can access a ring of data as the disk rotates Disk consists of one or more.
CS 153 Design of Operating Systems Spring 2015 Final Review.
CS 162 Section Lecture 8. What happens when you issue a read() or write() request?
CPS110: I/O and file systems Landon Cox April 8, 2008.
File System Implementation Chapter 12. File system Organization Application programs Application programs Logical file system Logical file system manages.
1Fall 2008, Chapter 12 Disk Hardware Arm can move in and out Read / write head can access a ring of data as the disk rotates Disk consists of one or more.
Chapter 8 – Main Memory (Pgs ). Overview  Everything to do with memory is complicated by the fact that more than 1 program can be in memory.
CSC 322 Operating Systems Concepts Lecture - 20: by Ahmed Mumtaz Mustehsan Special Thanks To: Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall,
OSes: 11. FS Impl. 1 Operating Systems v Objectives –discuss file storage and access on secondary storage (a hard disk) Certificate Program in Software.
CSCI-375 Operating Systems Lecture Note: Many slides and/or pictures in the following are adapted from: slides ©2005 Silberschatz, Galvin, and Gagne Some.
10/28/20151 Operating Systems Design (CS 423) Elsa L Gunter 2112 SC, UIUC Based on slides by Roy Campbell, Sam.
Lecture 3 Page 1 CS 111 Online Disk Drives An especially important and complex form of I/O device Still the primary method of providing stable storage.
Disk & File System Management Disk Allocation Free Space Management Directory Structure Naming Disk Scheduling Protection CSE 331 Operating Systems Design.
12/18/20151 Operating Systems Design (CS 423) Elsa L Gunter 2112 SC, UIUC Based on slides by Roy Campbell, Sam.
Local Filesystems (part 1) CPS210 Spring Papers  The Design and Implementation of a Log- Structured File System  Mendel Rosenblum  File System.
CPS110: Intro to memory Landon Cox February 14, 2008.
File Systems Topics Design criteria History of file systems Berkeley Fast File System Effect of file systems on programs fs.ppt CS 105 “Tour of the Black.
CPS110: I/O and file systems Landon Cox April 15, 2009.
CS399 New Beginnings Jonathan Walpole. Disk Technology & Secondary Storage Management.
Lecture Topics: 11/22 HW 7 File systems –block allocation Unix and NT –disk scheduling –file caches –RAID.
W4118 Operating Systems Instructor: Junfeng Yang.
Journaling versus soft updates: asynchronous meta-data protection in file systems Landon Cox April 8, 2016.
Silberschatz, Galvin and Gagne ©2013 Operating System Concepts – 9 th Edition Chapter 12: File System Implementation.
File Systems and Disk Management
Jonathan Walpole Computer Science Portland State University
Chapter 11: File System Implementation
FileSystems.
CS703 - Advanced Operating Systems
Operating Systems (CS 340 D)
Local file systems Landon Cox March 31, 2017.
Operating Systems (CS 340 D)
Filesystems.
Mass-Storage Structure
Lecture 45 Syed Mansoor Sarwar
Chapter 11: File System Implementation
File Systems and Disk Management
Introduction to Operating Systems
File Systems and Disk Management
File Systems and Disk Management
File Systems and Disk Management
Secondary Storage Management Brian Bershad
Persistence: hard disk drive
File Systems and Disk Management
Chapter 14: File-System Implementation
File Systems and Disk Management
Secondary Storage Management Hank Levy
CS333 Intro to Operating Systems
File Systems and Disk Management
Presentation transcript:

CPS110: I/O and file systems Landon Cox April 3, 2008

Virtual/physical interfaces Hardware OS Applications

Device independence  Problem: many different disks  Also interfaces, controllers, etc  Each has their own custom interface  How do we manage this diversity?  Add an abstraction inside the OS

Device drivers Rest of OS (can deal with device categories like network card, video card, keyboard, mouse, etc.) Rest of OS (can deal with device categories like network card, video card, keyboard, mouse, etc.) Device drivers Millions of specific devices Make life easier for the OS (still run in the kernel) OS file systems don’t have to know about specific devices

Device drivers Rest of OS (can deal with device categories like network card, video card, keyboard, mouse, etc.) Rest of OS (can deal with device categories like network card, video card, keyboard, mouse, etc.) Device drivers Millions of specific devices 70% of Linux kernel code 120,000 driver versions for XP 85% of XP failures Drivers in Windows vs Mac? In Vista, many drivers run in user space. Why?

Crashes in Vista

Other abstractions we’ve seen  Abstractions simplify things for their users  Threads  Programmers don’t have to worry about sharing the CPU  Address spaces  Applications don’t have to worry about sharing phys mem  TCP  Network code needn’t worry about unreliable network  Device drivers  Parts of OS don’t have to worry about device variations

Virtual/physical interfaces Hardware OS Applications

Block-oriented vs byte-oriented  Disks are accessed in terms of blocks  Also called sectors  Similar idea to memory pages  E.g. 4KB chunks of data  Programs deal with bytes  E.g. change ‘J’ in “Jello world” to ‘H’

Block-oriented vs byte-oriented  How to read less than a block?  Read entire block  Return the right portion  How to write less than a block?  Read entire block  Modify the right portion  Write out entire block  Nothing analogous to byte-grained load/store  Flash devices are even more complicated

Disk geometry

Surface organized into tracks

Parallel tracks form cylinders

Tracks broken up into sectors

Disk head position

Rotation is counter-clockwise

About to read a sector

After reading blue sector

Red request scheduled next After BLUE read

Seek to red’s track SEEK After BLUE readSeek for RED

Wait for red sector to reach head ROTATESEEK After BLUE readSeek for REDRotational latency

Read red sector After BLUE readSeek for REDRotational latencyAfter RED read SEEKROTATE

To access a disk 1.Queue (wait for disk to be free)  0-infinity ms 2.Position disk head and arm  Seek + rotation  0-12 ms  Pure overhead 3.Access disk data  Size/transfer rate (e.g. 49 MB/s)  Useful work

Goals for optimizing performance  Disk is really slow  In the time to do a disk seek (~6 ms)  3GHz processor executes 18 million cycles!  Efficiency = transfer time/(positioning+transfer time) a)Best option is to eliminate disk I/O (caching) b)Then try to minimize positioning time (seeks) c)Then try to amortize positioning overhead

Goals for optimizing performance  For each re-positioning, transfer a lot of data  True of any storage device  Particularly true for disks  Fast transfer rate, long positioning time  For Seagate Barracuda LW  Can get 50% efficiency  Transfer 6*49 = 294 KB after 6 ms seek

How to reduce positioning time  Can optimize when writing to disk  Try to reduce future positioning time  Put items together that will be accessed at the same time  How can we know this?  Use general usage patterns  Blocks in same files usually accessed together  Files in a directory tend to be accessed together  Files tend to be accessed with the containing directory  Learn from past accesses to specific data  Can also optimize when reading from disk

Disk scheduling  Idea  Take a bunch of accesses  Re-order them to reduce seek time  Kind of like traveling salesman problem  Can do this either in OS or on disk  OS knows more about the requests  Disk knows more about disk geometry

First-come, first-served (FCFS)  E.g. start at track 53  Other queued requests  98, 183, 37, 122, 14, 124, 65, 67  Total head movement  640 tracks  Average 80 tracks per seek

Shortest seek time first (SSTF)  E.g. start at track 53  Other requests  53, 65, 67, 37, 14, 98, 122, 124, 183  Total head movement  236 tracks  Average 30 tracks per seek Any problems with this? Starvation

Course administration  Project 3  Out on Tuesday (April 8 th )  Due two weeks later (April 22 nd )  Other upcoming dates  Final lecture on April 22 nd  Last day of classes is April 23 rd  Final exam will be Friday, May 2, 7p-10p  (Amre will proctor)

Virtual/physical interfaces Hardware OS Applications

File systems  What is a file system?  OS abstraction that makes disks easy to use  File system issues 1.How to map file space onto disk space  Structure and allocation: like mem management 2.How to use names instead of sectors  Naming and directories: not like memory  But very similar to DNS

Intro to file system structure  Overall question:  How do we organize things on disk?  Really a data structure question  What data structures do we use on disk?

Intro to file system structure  Need an initial object to get things going  In file systems, this is a file header  Unix: this is called an inode (indexed node)  File header contains info about the file  Size, owner, access permissions  Last modification date  Many ways to organize headers on disk  Use actual usage patterns to make good decisions

File system usage patterns 1.80% of file accesses are reads (20% writes)  Ok to save on reads, even if it hurts writes 2.Most file accesses are sequential and full  Form of spatial locality  Put sequential blocks next to each other  Can pre-fetch blocks next to each other 3.Most files are small 4.Most bytes are consumed by large files

1) Contiguous allocation  Store a file in one contiguous segment  Sometimes called an “extent”  Reserve space in advance of writing it  User could declare in advance  If grows larger, move it to a place that fits

1) Contiguous allocation  File header contains  Starting location (block #) of file  File size (# of blocks)  Other info (modification times, permissions)  Exactly like base and bounds memory

1) Contiguous allocation  Pros  Fast sequential access  Easy random access  Cons  External fragmentation  Internal fragmentation  Hard to grow files

2) Indexed files  File header  Looks a lot like a page table File block #Disk block #

2) Indexed files  Pros  Easy to grow (don’t have to reserve in advance)  Easy random access  Cons  How to grow beyond index size?  Sequential access may be slow. Why?  May have to seek after each block read Why wasn’t sequential access a problem with page tables? Memory doesn’t have seek times.

2) Indexed files  Pros  Easy to grow (don’t have to reserve in advance)  Easy random access  Cons  How to grow beyond index size?  Lots of seeks for sequential access How to reduce seeks for sequential access? Don’t want to pre-allocate blocks.

2) Indexed files  Pros  Easy to grow (don’t have to reserve in advance)  Easy random access  Cons  How to grow beyond index size?  Lots of seeks for sequential access How to reduce seeks for sequential access? When you allocate a new block, choose one near the one that precedes it. E.g. blocks in the same cylinder.

What about large files?  Could just assume it will be really large  Problem?  Wastes space in header if file is small  Max file size is 4GB  File block is 4KB  1 million pointers  4 MB header for 4 byte pointers  Remember most files are small  10,000 small files  40GB of headers

What about large files?  Could use a larger block size  Problem?  Internal fragmentation (most files are small)  Solution  Use a more sophisticated data structure

3) Multi-level indexed files  Think of indexed files as a shallow tree  Instead could have a multi-level tree  Level 1 points to level 2 nodes  Level 2 points to level 3 nodes  Gives us big files without wasteful headers

3) Multi-level indexed files Level 1 Level 2 Level 3 (data) How many access to read one block of data? 3 (one for each level)

3) Multi-level indexed files Level 1 Level 2 Level 3 (data) How to improve performance? Caching.

3) Multi-level indexed files  To reduce number of disk accesses  Cache level 1 and level 2 nodes  Often a useful combination  Indirection for flexibility  Caching to speed up indirection  Can cache lots of small pointers  Where else have we seen this strategy?  DNS

3) Multi-level indexed files Level 1 Level 2 Level 3 What about small files (i.e. most files)?

3) Multi-level indexed files Use a non-uniform tree

3) Multi-level indexed files  Pros  Simple  Files can easily expand  Small files don’t pay the full overhead  Cons  Large files need lots of indirect blocks (slow)  Could have lots of seeks for sequential access

Virtual/physical interfaces Hardware OS Applications

Multiple updates and reliability  Reliability is only an issue in file systems  Don’t care about losing address space after crash  Your files shouldn’t disappear after a crash  Files should be permanent  Multi-step updates cause problems  Can crash in the middle

Multi-step updates  Transfer $100 from Melissa’s account to mine 1.Deduct $100 from Melissa’s account 2.Add $100 to my account  Crash between 1 and 2, we lose $100

Multiple updates and reliability  This is a well known, undergrad OS-level problem  No modern OS would make this mistake, right?  Video evidence suggests otherwise  Directory with 3 files  Want to move them to external drive  Drive “fails” during move  Don’t want to lose data due to failure  Roll film …

The lesson? 1.Building an OS is hard. 2.OSes are software, not religions (one is not morally superior to another) 3.“Fanboys are stupid, but you are not.” (Anil Dash, dashes.com/anil)