
More on Disks and File Systems
CS-502 Operating Systems, Fall 2006
(Slides include materials from Operating System Concepts, 7th ed., by Silberschatz, Galvin, & Gagne, and from Modern Operating Systems, 2nd ed., by Tanenbaum)

Additional Topics
- Mounting a file system
- Mapping files to virtual memory
- RAID: Redundant Array of Inexpensive Disks
- Stable storage
- Log-structured file systems
- Linux Virtual File System

Summary of Reading Assignments in Silberschatz
- Disks (general): §12.1 to §12.6
- File systems (general): Chapter 11 (ignore §11.9 for now!)
- RAID: §12.7
- Stable storage: §12.8
- Log-structured file systems: §11.8 & §6.9

Mounting
mount -t type device pathname
- Attaches device (which contains a file system of type type) to the directory at pathname
- The file system implementation for type gets loaded and connected to the device
- Anything previously below pathname becomes hidden until the device is unmounted again
- The root of the file system on device is now accessed as pathname
- E.g., mount -t iso9660 /dev/cdrom /myCD
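For concreteness, the same mount can be issued from a C program through the Linux mount(2) system call. A minimal sketch, assuming /dev/cdrom and an existing /myCD directory, and that the program runs with root privileges:

```c
/* Minimal sketch: mount an ISO-9660 CD-ROM from C.
 * Assumes /dev/cdrom exists, /myCD exists, and we run as root. */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
    /* mount(source, target, fstype, flags, fs-specific data) */
    if (mount("/dev/cdrom", "/myCD", "iso9660", MS_RDONLY, NULL) != 0) {
        perror("mount");
        return 1;
    }
    return 0;
}
```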

Mounting (continued)
- The OS automatically mounts the devices in its mount table at initialization time (/etc/fstab in Linux)
- The type may be implicit in the device
- Users or applications may mount devices at run time, explicitly or implicitly; e.g., inserting a floppy disk or plugging in a USB flash drive

Linux Virtual File System (VFS)
A generic file system interface provided by the kernel. Common object framework:
- superblock: a specific, mounted file system
- i-node object: a specific file in storage
- d-entry object: a directory entry
- file object: an open file associated with a process

Linux Virtual File System (continued)
VFS operations:
- super_operations: read_inode, sync_fs, etc.
- inode_operations: create, link, etc.
- d_entry_operations: d_compare, d_delete, etc.
- file_operations: read, write, seek, etc.
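As a sketch of how a concrete file system or driver plugs into this framework, here is a minimal file_operations table in the style of the 2.6-era kernel API; exact field names and signatures vary between kernel versions, and demo_read is just an illustrative stub, not a real driver:

```c
/* Sketch of a VFS hookup (2.6-era Linux kernel module fragment). */
#include <linux/fs.h>
#include <linux/module.h>

/* Illustrative stub: deliver file data to user space. A real
 * implementation copies up to 'count' bytes, advances *pos, and
 * returns the number of bytes delivered. */
static ssize_t demo_read(struct file *filp, char __user *buf,
                         size_t count, loff_t *pos)
{
    return 0;  /* 0 = end of file in this stub */
}

/* The VFS calls through this table for every open file that
 * belongs to our (hypothetical) file system. */
static const struct file_operations demo_fops = {
    .owner  = THIS_MODULE,
    .read   = demo_read,
    .llseek = default_llseek,  /* reuse the generic seek helper */
};
```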

Linux Virtual File System (continued)
- Individual file system implementations conform to this architecture
- They may be linked into the kernel or loaded as modules
- Linux supports over 50 file systems in the official kernel, e.g., minix, ext, ext2, ext3, iso9660, msdos, nfs, smb, …

Linux Virtual File System (continued)
A special file type: proc
- Mounted as /proc
- Provides access to kernel-internal data structures as if those structures were files!
- E.g., /proc/kmsg (the kernel message buffer read by dmesg)
There are several other special file types:
- They vary from one version/vendor to another
- See Silberschatz
- Love, Linux Kernel Development, Chapter 12
- SUSE Linux Administrator Guide, Chapter 20
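Since proc entries read like ordinary files, a few lines of standard I/O are enough to inspect one. A small sketch that prints the first lines of /proc/meminfo, a standard Linux proc file:

```c
/* /proc entries behave like ordinary read-only text files. */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL) { perror("fopen"); return 1; }

    char line[256];
    for (int i = 0; i < 3 && fgets(line, sizeof line, f) != NULL; i++)
        fputs(line, stdout);   /* kernel data rendered as text */

    fclose(f);
    return 0;
}
```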

Questions?

Mapping Files to Virtual Memory
- Instead of "reading" from disk into virtual memory, why not simply use the file as the backing storage for certain VM pages?
- This is called mapping
- Page tables in the kernel then point to disk blocks of the file

Memory-Mapped Files
- Memory-mapped file I/O allows file I/O to be treated as routine memory access by mapping a disk block to a page in memory
- A file is initially "read" using demand paging: a page-sized portion of the file is read from the file system into a physical page
- Subsequent reads/writes to/from the file are treated as ordinary memory accesses
- Simplifies file access by allowing the application to simply access memory rather than being forced to use read() & write() calls to the file system
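A minimal user-space sketch of this through the POSIX mmap(2) interface; the file name data.bin is only an example, and the file is assumed to exist and be non-empty:

```c
/* Sketch: reads and writes on a mapped file become loads and stores. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDWR);   /* example file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat sb;
    fstat(fd, &sb);

    /* Map the whole file; pages are faulted in on demand. */
    char *p = mmap(NULL, sb.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    p[0] = 'X';                 /* "write()" becomes a store instruction */
    char c = p[sb.st_size - 1]; /* "read()" becomes a load instruction   */
    (void)c;

    munmap(p, sb.st_size);      /* flush modified pages and unmap */
    close(fd);
    return 0;
}
```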

Memory-Mapped Files (continued)
A tantalizingly attractive notion, but…
- Cannot use C/C++ pointers within the mapped data structure (they are only valid at the address where the file happens to be mapped)
- Corrupted data structures are likely to persist in the file
- Recovery after a crash is more difficult
- Doesn't really save anything in terms of:
  - Programming energy
  - Thought processes
  - Storage space & efficiency

Memory-Mapped Files (continued)
Nevertheless, the idea has its uses:
1. Simpler implementation of file operations
   - read(), write() become memory-to-memory operations
   - seek() is simply changing a pointer, etc.
   - Called memory-mapped I/O
2. Shared virtual memory among processes

Shared Virtual Memory (diagram)

Shared Virtual Memory (continued)
Supported in:
- Windows XP
- Apollo DOMAIN
- Linux??
Synchronization is the responsibility of the sharing applications:
- The OS retains no knowledge
- Few (if any) synchronization primitives exist between processes in separate address spaces
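On modern Linux (answering the "Linux??" above), one common form is an anonymous MAP_SHARED mapping inherited across fork(). A minimal sketch; the wait() call is deliberately crude synchronization, since, as the slide notes, the OS provides little help here:

```c
/* Sketch: parent and child share one page through the same mapping. */
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int *shared = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED) { perror("mmap"); return 1; }

    *shared = 0;
    if (fork() == 0) {     /* child writes through the shared page */
        *shared = 42;
        _exit(0);
    }
    wait(NULL);            /* crude "synchronization" for the demo */
    printf("parent sees %d\n", *shared);  /* prints 42 */
    return 0;
}
```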

Questions?

Problem
Question:
- If the mean time to failure of a disk drive is 100,000 hours,
- and your system has 100 identical disks,
- what is the mean time between drive replacements?
Answer:
- 100,000 hours / 100 disks = 1,000 hours (i.e., about 42 days, roughly 6 weeks)
- I.e., you lose 1% of your data every 6 weeks!
- But don't worry, you can restore most of it from backup!

Can we do better?
Yes: mirroring
- Write every block twice, on two separate disks
- Mean time between simultaneous failures of both disks is 57,000 years
Can we do even better?
- E.g., use fewer extra disks?
- E.g., get more performance?

RAID: Redundant Array of Inexpensive Disks
Distribute a file system intelligently across multiple disks to:
- Maintain high reliability and availability
- Enable fast recovery from failure
- Increase performance

"Levels" of RAID
- Level 0: non-redundant striping of blocks across disks
- Level 1: simple mirroring
- Level 2: striping of bytes or bits, with ECC
- Level 3: Level 2 with parity instead of ECC
- Level 4: Level 0 with a parity block
- Level 5: Level 4 with distributed parity blocks

RAID Level 0: Simple Striping
- Each stripe is one or a group of contiguous blocks
- Block/group i is on disk (i mod n)
- Advantage: read/write n blocks in parallel, for n times the bandwidth
- Disadvantage: no redundancy at all; system MTBF is 1/n of the per-disk MTBF!

  Disk 0: stripe 0, stripe 4, stripe 8
  Disk 1: stripe 1, stripe 5, stripe 9
  Disk 2: stripe 2, stripe 6, stripe 10
  Disk 3: stripe 3, stripe 7, stripe 11
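The (i mod n) mapping is easy to see in code. A toy sketch, not tied to any real driver, that reproduces the four-disk layout above:

```c
/* Toy sketch of RAID-0 address mapping:
 * logical stripe i lands on disk (i mod n), at row (i / n). */
#include <stdio.h>

struct location { int disk; int row; };

static struct location raid0_map(int stripe, int ndisks)
{
    struct location loc = { stripe % ndisks, stripe / ndisks };
    return loc;
}

int main(void)
{
    for (int i = 0; i < 12; i++) {
        struct location loc = raid0_map(i, 4);
        printf("stripe %2d -> disk %d, row %d\n", i, loc.disk, loc.row);
    }
    return 0;
}
```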

RAID Level 1: Striping and Mirroring
- Each stripe is written twice, to two separate, identical disks
- Block/group i is on disks (i mod 2n) & ((i+n) mod 2n)
- Advantages:
  - Read/write n blocks in parallel, for n times the bandwidth
  - Redundancy: system MTBF = (disk MTBF)², at twice the cost
  - A failed disk can be replaced by copying
- Disadvantage: a lot of extra disks for much more reliability than we need

  Disks 0-3: stripes 0-11, laid out as in RAID Level 0
  Disks 4-7: an identical mirror copy of the same stripes

RAID Levels 2 & 3
- Bit- or byte-level striping
- Requires synchronized disks; highly impractical
- Requires fancy electronics for the ECC calculations
- Not used; of academic interest only
- See Silberschatz

Observation
- When a disk or stripe is read incorrectly, we know which one failed!
- Conclusion: a simple parity disk can provide very high reliability (unlike simple parity in memory)

RAID Level 4: Parity Disk
- parity 0-3 = stripe 0 xor stripe 1 xor stripe 2 xor stripe 3
- n stripes plus parity are written/read in parallel
- If any disk/stripe fails, it can be reconstructed from the others
  - E.g., stripe 1 = stripe 0 xor stripe 2 xor stripe 3 xor parity 0-3
- Advantages:
  - n times the read bandwidth
  - System MTBF = (disk MTBF)², at 1/n additional cost
  - A failed disk can be reconstructed "on the fly" (hot swap)
  - Hot expansion: simply add an (n+1)st disk initialized to all zeros
- However: writing requires a read-modify-write of the parity stripe, so only 1× write bandwidth

  Disk 0: stripe 0, stripe 4, stripe 8
  Disk 1: stripe 1, stripe 5, stripe 9
  Disk 2: stripe 2, stripe 6, stripe 10
  Disk 3: stripe 3, stripe 7, stripe 11
  Parity disk: parity 0-3, parity 4-7, parity 8-11
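The parity arithmetic is plain XOR, which is why reconstruction is cheap. A toy sketch, with illustrative names and a deliberately tiny block size, that computes parity over four data blocks and rebuilds a lost one:

```c
/* Toy sketch of RAID-4 parity: build a parity block over n data
 * blocks, then reconstruct a lost block from survivors + parity. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NDISKS 4
#define BLK    16   /* tiny block size, for illustration only */

static void xor_parity(uint8_t out[BLK], uint8_t data[][BLK], int n)
{
    memset(out, 0, BLK);
    for (int d = 0; d < n; d++)
        for (int i = 0; i < BLK; i++)
            out[i] ^= data[d][i];
}

int main(void)
{
    uint8_t data[NDISKS][BLK], parity[BLK], rebuilt[BLK];
    for (int d = 0; d < NDISKS; d++)
        memset(data[d], 'A' + d, BLK);         /* fake block contents */

    xor_parity(parity, data, NDISKS);          /* parity = d0^d1^d2^d3 */

    /* Pretend disk 1 died: XOR the survivors into the parity block. */
    memcpy(rebuilt, parity, BLK);
    for (int d = 0; d < NDISKS; d++) {
        if (d == 1) continue;
        for (int i = 0; i < BLK; i++)
            rebuilt[i] ^= data[d][i];
    }
    assert(memcmp(rebuilt, data[1], BLK) == 0); /* lost block recovered */
    return 0;
}
```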

RAID Level 5: Distributed Parity
- Parity calculation is the same as in RAID Level 4
- Advantages & disadvantages: same as RAID Level 4
- Additional advantages:
  - Avoids beating up on the parity disk
  - Some writes can proceed in parallel
- Writing an individual stripe (RAID 4 & 5):
  - Read the existing stripe and the existing parity
  - Recompute the parity
  - Write the new stripe and the new parity

  Disk 0: stripe 0, stripe 4, stripe 8, stripe 12
  Disk 1: stripe 1, stripe 5, stripe 9, parity 12-15
  Disk 2: stripe 2, stripe 6, parity 8-11, stripe 13
  Disk 3: stripe 3, parity 4-7, stripe 10, stripe 14
  Disk 4: parity 0-3, stripe 7, stripe 11, stripe 15

RAID 4 & 5
- Very popular in data centers: corporate and academic servers
- Built-in support in Windows XP and Linux:
  - Connect a group of disks to a fast SCSI port (320 MB/sec bandwidth)
  - The OS's RAID support does the rest!

New Topic

Incomplete Operations
Problem: how to protect against disk write operations that don't finish?
- Power or CPU failure in the middle of a block
- A related series of writes interrupted before all are completed
Examples:
- A database update of a charge and the matching credit
- RAID 1, 4, 5: failure between the redundant writes

Solution (Part 1): Stable Storage
- Write everything twice, to separate disks
  - Be sure the 1st write does not invalidate the previous 2nd copy
  - RAID 1 is okay; RAID 4/5 is not!
- Read the blocks back to validate, then report completion
- When reading both copies:
  - If the 1st copy is okay, use it (i.e., the newest value)
  - If the 2nd copy differs, update it with the 1st copy
  - If the 1st copy is bad, use the 2nd copy (i.e., the old value)

Stable Storage (continued)
- Crash recovery: scan the disks, comparing corresponding blocks
  - If one is bad, replace it with the good one
  - If both are good but different, replace the 2nd with the 1st copy
- Result:
  - If the 1st block is good, it contains the latest value
  - If not, the 2nd block still contains the previous value
- This is an abstraction of an atomic disk write of a single block, uninterruptible by power failure, etc.
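A schematic, in-memory sketch of this write-then-repair protocol; the two arrays stand in for separate disks, and block_ok() stands in for a real per-block checksum, so all names here are illustrative:

```c
/* Schematic stable storage: two "disks" as arrays, toy checksum. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 512
#define NBLOCKS    16

static char disk0[NBLOCKS][BLOCK_SIZE];
static char disk1[NBLOCKS][BLOCK_SIZE];

static bool block_ok(const char *blk) { return blk[0] != 0; } /* toy check */

static void stable_write(int blkno, const char *buf)
{
    /* Write and verify the 1st copy before touching the 2nd, so a
     * crash can never leave both copies invalid at the same time. */
    do { memcpy(disk0[blkno], buf, BLOCK_SIZE); }
    while (!block_ok(disk0[blkno]));

    do { memcpy(disk1[blkno], buf, BLOCK_SIZE); }
    while (!block_ok(disk1[blkno]));
}

static const char *stable_read(int blkno)
{
    if (block_ok(disk0[blkno])) {       /* 1st copy good: newest value */
        if (memcmp(disk0[blkno], disk1[blkno], BLOCK_SIZE) != 0)
            memcpy(disk1[blkno], disk0[blkno], BLOCK_SIZE); /* repair 2nd */
        return disk0[blkno];
    }
    return disk1[blkno];                /* fall back to the old value */
}

int main(void)
{
    char buf[BLOCK_SIZE] = "hello, stable storage";
    stable_write(3, buf);
    printf("%s\n", stable_read(3));
    return 0;
}
```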

What about more complex disk operations?
E.g., a file-create operation involves:
- Allocating free blocks
- Constructing and writing the i-node (possibly multiple i-node blocks)
- Reading and updating the directory
What if the system crashes with the sequence only partly completed?
Answer: inconsistent data structures on disk

Solution (Part 2): Log-Structured File System
- Make changes to cached copies in memory
- Collect together all changed blocks, including i-nodes and directory blocks
- Write them to the log file (aka journal file)
  - A circular buffer on disk
  - Fast, contiguous write
- Update the log-file pointer in stable storage
- Offline: play back the log file to actually update the directories, i-nodes, free list, etc.
- Update the playback pointer in stable storage
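A toy sketch of this log discipline: records are appended past the committed head, and a single pointer update publishes the whole batch atomically (in a real system that pointer write would itself use stable storage). All names are illustrative:

```c
/* Toy journal: append changed blocks, then commit with one pointer. */
#include <stdint.h>
#include <string.h>

#define LOG_BLOCKS 1024
#define BLOCK_SIZE 512

struct log_record {
    uint64_t target_blkno;     /* where this block eventually lands */
    char     data[BLOCK_SIZE]; /* new contents of that block */
};

static struct log_record logbuf[LOG_BLOCKS]; /* stands in for the on-disk log */
static uint64_t head;     /* committed head; would live in stable storage */
static uint64_t pending;  /* records written but not yet committed */

static void log_append(uint64_t target_blkno, const char *data)
{
    struct log_record *r = &logbuf[(head + pending) % LOG_BLOCKS];
    r->target_blkno = target_blkno;
    memcpy(r->data, data, BLOCK_SIZE);
    pending++;
}

static void log_commit(void)
{
    /* One pointer update makes the whole batch visible atomically;
     * a crash before this line simply discards the pending records. */
    head += pending;
    pending = 0;
}

int main(void)
{
    char blk[BLOCK_SIZE] = "updated i-node";
    log_append(7, blk);   /* stage an updated block */
    log_commit();         /* atomically publish the batch */
    return 0;
}
```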

Transaction Database Systems
Similar techniques:
- Every transaction is recorded in the log before being recorded on disk
- Stable-storage techniques are used for managing the log pointers
- Once the log entry is confirmed, the disk can be updated in place
- After a crash, replay the log to redo the disk operations

Berkeley LFS: A Slight Variation
- Everything is written to the log
  - i-nodes point to the updated blocks in the log
  - The in-memory i-node cache is updated whenever an i-node is written
- A cleaner daemon follows behind to compact the log
- Advantages:
  - LFS is always consistent
  - LFS performance is much better than the Unix file system for small writes, and at least as good for reads and large writes
- See Tanenbaum, §6.3.8, and Rosenblum & Ousterhout, The Design and Implementation of a Log-Structured File System
- Note: not the same as Linux LFS (large file support)

Example (diagram)
- Before: the i-node points to the old blocks a, b, c; blocks a, b, c have been modified in memory
- After: the new blocks a, b, c and a new i-node have been appended to the log; the old i-node and old blocks are now obsolete

Summary of Reading Assignments in Silberschatz
- Disks (general): §12.1 to §12.6
- File systems (general): Chapter 11 (ignore §11.9 for now!)
- RAID: §12.7
- Stable storage: §12.8
- Log-structured file systems: §11.8 & §6.9