More on Disks and File Systems 1 CS502 Spring 2006 More on Disks and File Systems CS-502 Operating Systems Spring 2006.

Presentation transcript:

Slide 1: More on Disks and File Systems
CS-502 Operating Systems, Spring 2006

Slide 2: Review – Disks
• Implementer of the File abstraction
  – Storage of large amounts of data for very long times
  – Persistence, reliability
• Controlled like I/O devices, but an integral part of the information storage subsystem
• Rapidly increasing capacities, dropping prices
  – $0.5–$6.0 per gigabyte
• Slowly improving transfer rates, seek performance
  – Only a factor of 5-10 in three decades!

Slide 3: Review – Disks (continued)
• Organized into cylinders, tracks, sectors
• Random access
  – Any sector can be read and written independently of any other
  – Very high bandwidth for consecutive reads or writes
  – Seek time is (often) the dominating factor in performance
• Bad blocks are a fact of life
  – Most detected during formatting
  – Some occur during operation
  – Controller or OS must step around them
• Seek optimization algorithms
  – Popular study topics, less popular in real systems
  – Long seek queues ⇒ system is out of balance

Slide 4: Review – File Systems
• Fundamental abstraction for persistent storage
  – Usually organized as a linear array of bytes
  – Any sequence of bytes may be read or overwritten
• Extreme performance demands
  – Many small files vs. a few humongous files
• Fundamental ambiguity
  – Is the file the “information” or the “container”?
  – OS sees the container; users focus on the information
• Many attributes
  – Stored in file metadata associated with the file

Slide 5: Review – File Systems (continued)
• Operations
  – Open, Close; Read, Write, Truncate; Seek, Tell
  – Create; Destroy
• Access methods
  – Sequential
  – Random
  – Indexed (not used very much any more)
• Structure imposed by applications
  – Databases, libraries, executable images

Slide 6: Review – Directories
• Special kind of file
  – Tool for users to organize files
  – Tool for system to find file containers
• Organization
  – Single level, two level, hierarchical
• Directory operations
  – Create, Destroy; Add entry, Remove entry
  – Find, List, Rename; Link, Unlink
• Links
  – Soft (symbolic) links in Unix, Windows
  – Hard links in Unix (reference counted in metadata)

Slide 7: Review – File System Implementation
• Contiguous (with optional extents)
  – Very efficient for large files (e.g., databases)
  – Prone to space fragmentation for many small files
  – Bad blocks must be concealed by OS or controller
• Linked
  – No space fragmentation; lots of seek fragmentation
  – Sequential access only
  – FAT (File Allocation Table) ⇒ pseudo-random access
• Indexed
  – i-node (index block) points to every block of file
  – Fast random access
  – Scales easily from small to large
  – No space fragmentation; lots of seek fragmentation
• Defragmentation
  – Remapping linked, FAT, or indexed files to minimize seek time
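
Editor's sketch for the FAT item on the slide above: the table holds the "next block" pointers, so locating a file's blocks means walking table entries instead of reading the data blocks themselves. The names `EOF`, `fat`, and `file_blocks`, and the toy table contents, are hypothetical and chosen only for illustration.

```python
EOF = -1  # hypothetical end-of-chain marker, not a real FAT constant


def file_blocks(fat, first_block):
    """Return the block numbers of a file by following its chain in the table."""
    blocks = []
    block = first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]  # each entry names the next block of the same file
    return blocks


# Toy table: a file that starts at block 4 and continues 4 -> 7 -> 2 -> 10.
fat = {4: 7, 7: 2, 2: 10, 10: EOF}
print(file_blocks(fat, 4))
```

Reaching the k-th block still takes k table lookups, but the table can be held in memory, which is why the slide calls the access pattern pseudo-random.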

Slide 8: Additional Topics
• Implementation of Directories
• CD-ROM devices and file systems
• RAID – Redundant Array of Inexpensive Disks
• Stable Storage
• Log Structured File Systems

Slide 9: Implementation of Directories
• A list of [name, information] pairs
  – Must be scalable from very few entries to very many
• Name:
  – User-friendly, variable length, any language
  – Fast access by name
• Information:
  – File metadata (itself)
  – Pointer to file metadata block (or i-node) on disk
  – Pointer to first & last blocks of file
  – Pointer to extent block(s)
  – …

Slide 10: Very Simple Directory
• Short, fixed length names
• Attribute & disk addresses contained in directory
• MS-DOS, etc.
[Figure: directory entries, each holding a name next to its attributes: name1 | attributes, name2 | attributes, name3 | attributes, name4 | attributes, …]

Slide 11: Simple Directory
• Short, fixed length names
• Attributes in separate blocks (e.g., i-nodes)
• Attribute pointers are disk addresses (or i-node numbers)
• Older Unix versions
[Figure: directory entries name1, name2, name3, name4, … each pointing to an i-node; the i-nodes are the data structures containing the attributes]

Slide 12: More Interesting Directory
• Variable length file names
  – Stored in heap at end
• Modern Unix, Windows
• Linear or logarithmic search for name
• Compaction needed after
  – Deletion, Rename
[Figure: attribute entries followed by a heap of names: name1, longer_name3, very_long_name4, name2, …]

Slide 13: Very Large Directories
• Hash-table implementation
• Each hash chain is like a small directory with variable-length names
• Must be sorted for listing
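
Editor's sketch of the hash-table idea on the slide above: each bucket is a short chain searched linearly, and a listing has to gather and sort all chains because hash order is meaningless to users. The class `HashDirectory`, its bucket count, and the i-node numbers are hypothetical illustration, not any particular file system's layout.

```python
class HashDirectory:
    """Toy directory mapping names to i-node numbers, bucketed by name hash."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _chain(self, name):
        # Each chain behaves like a small directory of (name, inode) pairs.
        return self.buckets[hash(name) % len(self.buckets)]

    def add(self, name, inode):
        chain = self._chain(name)
        for i, (existing, _) in enumerate(chain):
            if existing == name:
                chain[i] = (name, inode)  # replace an existing entry
                return
        chain.append((name, inode))

    def lookup(self, name):
        for existing, inode in self._chain(name):
            if existing == name:
                return inode
        return None

    def listing(self):
        # Names must be collected from every chain and sorted for display.
        return sorted(name for chain in self.buckets for name, _ in chain)


d = HashDirectory()
d.add("b.txt", 12)
d.add("a.txt", 7)
print(d.lookup("a.txt"), d.listing())
```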

Slide 14: File System Implementation – Free Space Management
• Bitmap
  – Very compact on disk
  – Expensive to search
• Free list
  – Linked list of free blocks
  – Only head of list needs to be cached in memory
  – Larger than bitmap: consumes 1/n of free space
  – List grows and shrinks inversely with allocating or freeing blocks
  – Very fast to search and allocate
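
Editor's sketch contrasting the two allocators on the slide above. The bitmap version has to scan for a free bit, while the free-list version only touches the head of the list. The function names and the convention that a 0 bit means "free" are assumptions made for this example.

```python
def bitmap_alloc(bitmap):
    """Scan for the first free block (0 = free here, an assumed convention)."""
    for i, used in enumerate(bitmap):
        if not used:
            bitmap[i] = 1  # mark the block as allocated
            return i
    return None  # no free block left


def freelist_alloc(free_list):
    """Take the block at the head of the list (the list's last element here)."""
    return free_list.pop() if free_list else None


bitmap = [1, 1, 0, 1, 0]
free_list = [9, 5, 3]
print(bitmap_alloc(bitmap), freelist_alloc(free_list))
```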

Slide 15: CD-ROMs
• See Tanenbaum, pp
• Audio CD
  – Molded polycarbonate
  – 120 mm diameter with 15 mm hole
  – One single spiral track: starts in center, spirals outward; 22,188 revolutions, approx. 5.6 kilometers long
  – Constant linear velocity under read head: audio playback at 120 cm/sec; variable speed motor, 200–530 rpm
• ISO standard IS 10149, aka the Red Book

Slide 16: CD-ROM (continued)
• Problem for adapting to data usage
  – No bad block recovery capability!
• ISO standard for data: Yellow Book
  – Three levels of error-correcting schemes: Symbol, Frame, Sector
  – ~7200 bytes to record a 2048-byte payload per sector
  – Mode 2: less error correction in exchange for more data rate (audio and video data)
  – Sectors linearly numbered from center to edge
• Read speed
  – 1x ~ 153,000 bytes/sec
  – 40x ~ 5.9 megabytes/sec
• ISO standard for multi-media: Green Book
  – Interleaved audio, video, data in same sector
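
Editor's note on the read-speed figures above: they follow from the sector payload if one assumes 75 sectors per second at 1x. That rate is not stated on the slide; it is the standard CD playback rate and is labeled as an assumption in the code. Reading "megabytes" as units of 2^20 bytes is also an assumption that happens to reproduce the 5.9 figure.

```python
SECTORS_PER_SEC_1X = 75  # assumption: standard 1x CD rate, not given on the slide
PAYLOAD_BYTES = 2048     # per-sector payload, from the slide


def read_rate(speed):
    """Payload bytes per second delivered at the given speed multiple."""
    return speed * SECTORS_PER_SEC_1X * PAYLOAD_BYTES


print(read_rate(1))            # bytes/sec at 1x
print(read_rate(40) / 2**20)   # units of 2**20 bytes per second at 40x
```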

Slide 17: CD-ROM File System
• ISO 9660 (High Sierra)
• Write once ⇒ contiguous file allocation
• Variable length directories
• Variable length directory entries
  – Points to first sector of file
  – File size and metadata stored in directory entry
  – Variable length names
• Several extensions to standard for additional features

Slide 18: Break

Slide 19: Problem
• Question:
  – If mean time to failure of a disk drive is 100,000 hours,
  – and if your system has 100 identical disks,
  – what is the mean time between drive replacements?
• Answer:
  – 1000 hours (i.e., about 42 days ≈ 6 weeks)
• I.e.:
  – You lose 1% of your data every 6 weeks!
• But don’t worry – you can restore most of it from backup!
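
Editor's worked version of the arithmetic on the slide above, assuming independent failures so that an array of identical disks fails n times as often as one disk. The function name is hypothetical.

```python
def array_mtbf_hours(disk_mtbf_hours, n_disks):
    """Mean time between failures somewhere in an array of identical disks."""
    return disk_mtbf_hours / n_disks


hours = array_mtbf_hours(100_000, 100)
days = hours / 24
weeks = days / 7
print(hours, round(days, 1), round(weeks, 1))
```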

Slide 20: Can we do better?
• Yes, mirrored
  – Write every block twice, on two separate disks
  – Mean time between simultaneous failure of both disks is 57,000 years
• Can we do even better?
  – E.g., use fewer extra disks?
  – E.g., get more performance?
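
Editor's sketch of where a figure like 57,000 years can come from. The slide does not show the derivation; the code below uses the common mirrored-pair estimate MTBF² / (2 × MTTR) and assumes a 10-hour mean time to repair, a value chosen because it reproduces the slide's number. Both the formula choice and the repair time are assumptions, as is the function name.

```python
HOURS_PER_YEAR = 24 * 365


def mirrored_mttdl_hours(disk_mtbf_hours, mttr_hours):
    """Estimated mean time to data loss for a mirrored pair of disks.

    Data is lost only if the second disk fails while the first is being
    repaired, giving roughly MTBF**2 / (2 * MTTR).
    """
    return disk_mtbf_hours ** 2 / (2 * mttr_hours)


# 10 hours to repair is an assumption, not a value from the slide.
years = mirrored_mttdl_hours(100_000, 10) / HOURS_PER_YEAR
print(round(years))
```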

Slide 21: RAID – Redundant Array of Inexpensive Disks
• Distribute a file system intelligently across multiple disks to
  – Maintain high reliability and availability
  – Enable fast recovery from failure
  – Increase performance

Slide 22: “Levels” of RAID
• Level 0 – non-redundant striping of blocks across disks
• Level 1 – simple mirroring
• Level 2 – striping of bytes or bits with ECC
• Level 3 – Level 2 with parity, not ECC
• Level 4 – Level 0 with parity block
• Level 5 – Level 4 with distributed parity blocks

Slide 23: RAID Level 0 – Simple Striping
• Each stripe is one or a group of contiguous blocks
• Block/group i is on disk (i mod n)
• Advantage
  – Read/write n blocks in parallel; n times bandwidth
• Disadvantage
  – No redundancy at all. System MTBF is 1/n of disk MTBF!
[Figure: four disks holding stripes 0, 4, 8 | stripes 1, 5, 9 | stripes 2, 6, 10 | stripes 3, 7, 11]
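
Editor's sketch of the striping rule on the slide above ("block/group i is on disk i mod n"). The position of the stripe within its disk, i // n, is inferred from the slide's figure rather than stated in its text; the function name is hypothetical.

```python
def raid0_location(stripe, n_disks):
    """Map a stripe number to (disk index, position on that disk)."""
    return stripe % n_disks, stripe // n_disks


# With 4 disks, the first disk ends up holding stripes 0, 4, 8.
on_first_disk = [s for s in range(12) if raid0_location(s, 4)[0] == 0]
print(on_first_disk)
```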

Slide 24: RAID Level 1 – Striping and Mirroring
• Each stripe is written twice
  – Two separate, identical disks
• Block/group i is on disks (i mod 2n) & ((i+n) mod 2n)
• Advantages
  – Read/write n blocks in parallel; n times bandwidth
  – Redundancy: System MTBF = (Disk MTBF)^2 at twice the cost
  – Failed disk can be replaced by copying
• Disadvantage
  – A lot of extra disks for much more reliability than we need
[Figure: two identical sets of four disks, each set holding stripes 0, 4, 8 | stripes 1, 5, 9 | stripes 2, 6, 10 | stripes 3, 7, 11]

Slide 25: RAID Levels 2 & 3
• Bit- or byte-level striping
• Requires synchronized disks
  – Highly impractical
• Requires fancy electronics
  – For ECC calculations
• Not used; academic interest only
• See Silberschatz, § (pp )

Slide 26: Observation
• When a disk or stripe is read incorrectly, we know which one failed!
• Conclusion:
  – A simple parity disk can provide very high reliability (unlike simple parity in memory)

Slide 27: RAID Level 4 – Parity Disk
• parity 0-3 = stripe 0 xor stripe 1 xor stripe 2 xor stripe 3
• n stripes plus parity are written/read in parallel
• If any disk/stripe fails, it can be reconstructed from the others
  – E.g., stripe 1 = stripe 0 xor stripe 2 xor stripe 3 xor parity 0-3
• Advantages
  – n times read/write bandwidth
  – System MTBF = (Disk MTBF)^2 at 1/n additional cost
  – Failed disk can be reconstructed “on-the-fly” (hot swap)
  – Hot expansion: simply add n + 1 disks all initialized to zeros
[Figure: four data disks holding stripes 0, 4, 8 | stripes 1, 5, 9 | stripes 2, 6, 10 | stripes 3, 7, 11, plus one parity disk holding parity 0-3, parity 4-7, parity 8-11]
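
Editor's sketch of the parity rule on the slide above: the parity block is the byte-wise XOR of the data stripes, and any one missing stripe is the XOR of everything that survives. The helper name `xor_blocks` and the two-byte stripe contents are hypothetical.

```python
from functools import reduce


def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


stripes = [b"\x0f\xa0", b"\x33\x01", b"\x55\xff", b"\xc3\x10"]
parity = xor_blocks(*stripes)

# Pretend the disk holding stripe 1 failed; rebuild it from the survivors.
rebuilt = xor_blocks(stripes[0], stripes[2], stripes[3], parity)
print(rebuilt == stripes[1])
```

This only works because the failed disk is known, which is the point of the slide 26 observation: parity alone cannot say which block is wrong, but it can regenerate a block already known to be missing.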

Slide 28: RAID Level 5 – Distributed Parity
• Parity calculation is same as RAID Level 4
• Advantages & Disadvantages
  – Same as RAID Level 4
• Additional advantage: avoids beating up on parity disk
• Writing individual stripes (RAID 4 & 5)
  – Read existing stripe and existing parity
  – Recompute parity
  – Write new stripe and new parity
[Figure: five disks with the parity block rotated across them: stripes 0, 4, 8, 12 | stripes 1, 5, 9, parity (for stripes 12-15) | stripes 2, 6, parity 8-11, stripe 13 | stripe 3, parity 4-7, stripes 10, 14 | parity 0-3, stripes 7, 11, 15]
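
Editor's sketch of the single-stripe write on the slide above. Because XOR is its own inverse, the new parity can be computed from the old stripe, the old parity, and the new stripe without reading the other data disks. The `parity_disk` rotation matches the layout in the slide's figure (parity on the last disk for the first row, moving one disk to the left per row); real arrays use several different rotations, so treat it as one possible layout. All names are hypothetical.

```python
def xor(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data, old_parity, new_data):
    """New parity after overwriting one stripe: remove old data, add new."""
    return xor(xor(old_parity, old_data), new_data)


def parity_disk(row, n_disks):
    """Disk holding the parity block of a given row (assumed rotation)."""
    return (n_disks - 1 - row) % n_disks


print([parity_disk(row, 5) for row in range(4)])
```

The cost is two reads and two writes per small update, which is why small writes are the weak spot of RAID 4 and 5.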

Slide 29: New Topic
• Problem – how to protect against disk write operations that don’t complete
  – Power or CPU failure in the middle of a block
  – Related series of writes interrupted in middle
• Examples:
  – Database update of charge and credit
  – RAID 1, 4, 5 failure between redundant writes

Slide 30: Solution (part 1) – Stable Storage
• Write everything twice (separate disks)
  – Be sure 1st write does not invalidate previous 2nd copy
  – RAID 1 is okay; RAID 4/5 not okay!
  – Read blocks back to validate; then report completion
• Reading both copies
  – If 1st copy okay, use it, i.e., newest value
  – If 2nd copy different, update it with 1st copy
  – If 1st copy error, use 2nd copy, i.e., old value

Slide 31: Stable Storage (continued)
• Crash recovery
  – Scan disks, compare corresponding blocks
  – If one is bad, replace with good one
  – If both good but different, replace 2nd with 1st copy
• Result:
  – If 1st block is good, it contains latest value
  – If not, 2nd block still contains previous value
• An abstraction of an atomic disk write of a single block
  – Uninterruptible by power failure, etc.

Slide 32: What about more complex disk operations?
• E.g., file create operation involves
  – Allocating free blocks
  – Constructing and writing i-node (possibly multiple i-node blocks)
  – Reading and updating directory
• What if system crashes with the sequence only partly completed?
• Answer: inconsistent data structures on disk

Slide 33: Solution (Part 2) – Log-Structured File System
• Make changes to cached copies in memory
• Collect together all changed blocks
• Write to log file
  – A circular buffer on disk
  – Fast, contiguous write
• Update log file pointer in stable storage
• Offline: play back log file to actually update directories, i-nodes, free list, etc.
  – Update playback pointer in stable storage
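
Editor's sketch of the log-then-play-back sequence on the slide above. A group of changed blocks counts as committed only once the log pointer has been advanced, so a record that was appended but never pointed to is ignored after a crash. The class `LoggedDisk`, the plain Python list standing in for the circular buffer, and the two integer pointers standing in for stable storage are all simplifying assumptions.

```python
class LoggedDisk:
    def __init__(self):
        self.blocks = {}   # home locations of directories, i-nodes, free list
        self.log = []      # stands in for the circular log buffer on disk
        self.log_end = 0   # log pointer (kept in stable storage on the slide)
        self.played = 0    # playback pointer (also in stable storage)

    def commit(self, changes):
        """Write all changed blocks as one log record, then move the pointer."""
        self.log.append(dict(changes))  # one contiguous write
        self.log_end = len(self.log)    # the record is committed only now

    def play_back(self):
        """Apply committed records to their home locations, in order."""
        while self.played < self.log_end:
            self.blocks.update(self.log[self.played])
            self.played += 1


disk = LoggedDisk()
disk.commit({"inode 7": "v1", "dir /": "a -> 7"})
disk.play_back()
print(disk.blocks)
```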

Slide 34: Transaction Data Base Systems
• Similar techniques
  – Every transaction is recorded in log before recording on disk
  – Stable storage techniques for managing log pointers
  – Once the log entry is confirmed, disk can be updated in place
  – After crash, replay log to redo disk operations

Slide 35: Unix LFS
• Tanenbaum, §6.3.8, pp
• Everything is written to log
  – i-nodes point to updated blocks in log
  – i-node cache in memory updated whenever i-node is written
  – Cleaner daemon follows behind to compact log
• Advantages:
  – LFS is always consistent
  – LFS performance: much better than Unix FS for small writes; at least as good for reads and large writes

Slide 36: Break