COS 318: Operating Systems File Layout and Directories

Slides:



Advertisements
Similar presentations
Chapter 12: File System Implementation
Advertisements

File Management.
File Systems: Fundamentals.
More on File Management
File Systems.
CS503: Operating Systems Spring 2014 General File Systems
Chapter 11: File System Implementation
File System Implementation
Disks and Files Vivek Pai Princeton University. 2 Why Files Physical reality Block oriented Physical sector #s No protection among users of the system.
File System Implementation CSCI 444/544 Operating Systems Fall 2008.
File Systems Implementation
1 Outline File Systems Implementation How disks work How to organize data (files) on disks Data structures Placement of files on disk.
1 Friday, July 07, 2006 “Vision without action is a daydream, Action without a vision is a nightmare.” - Japanese Proverb.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Operating Systems CMPSCI 377 Lecture.
Chapter 12: File System Implementation
Disks and Files Vivek Pai Princeton University. 2 Gedankyou Imagine the following: A disk scheduling policy says “handle the request that is closest to.
File System Design David E. Culler CS162 – Operating Systems and Systems Programming Lecture 24 October 24, 2014 Reading: A&D 13.3 HW 4 due 10/27 Proj.
File Systems. Main Points File layout Directory layout.
File Systems (1). Readings r Silbershatz et al: 10.1,10.2,
Rensselaer Polytechnic Institute CSCI-4210 – Operating Systems David Goldschmidt, Ph.D.
File Systems and Disk Management. File system Interface between applications and the mass storage/devices Provide abstraction for the mass storage and.
COSC 3407: Operating Systems Lecture 18: Disk Management and File System.
File Implementation. File System Abstraction How to Organize Files on Disk Goals: –Maximize sequential performance –Easy random access to file –Easy.
1Fall 2008, Chapter 11 Disk Hardware Arm can move in and out Read / write head can access a ring of data as the disk rotates Disk consists of one or more.
1 File Systems Chapter Files 6.2 Directories 6.3 File system implementation 6.4 Example file systems.
Files CS Spring Overview Example: FAT File System File Organization File System Organization –File Directories and File Sharing –Record Blocking.
File System Implementation Chapter 12. File system Organization Application programs Application programs Logical file system Logical file system manages.
CSC 322 Operating Systems Concepts Lecture - 20: by Ahmed Mumtaz Mustehsan Special Thanks To: Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall,
CSCI-375 Operating Systems Lecture Note: Many slides and/or pictures in the following are adapted from: slides ©2005 Silberschatz, Galvin, and Gagne Some.
CS333 Intro to Operating Systems Jonathan Walpole.
File System Implementation
Module 4.0: File Systems File is a contiguous logical address space.
File Systems Security File Systems Implementation.
CS 153 Design of Operating Systems Spring 2015 Lecture 21: File Systems.
Disk & File System Management Disk Allocation Free Space Management Directory Structure Naming Disk Scheduling Protection CSE 331 Operating Systems Design.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 11: File System Implementation.
Lecture 10 Page 1 CS 111 Summer 2013 File Systems Control Structures A file is a named collection of information Primary roles of file system: – To store.
© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems File systems.
CS 3204 Operating Systems Godmar Back Lecture 21.
Operating Systems 1 K. Salah Module 4.0: File Systems  File is a contiguous logical address space (of related records)  Access Methods  Directory Structure.
File Systems.  Issues for OS  Organize files  Directories structure  File types based on different accesses  Sequential, indexed sequential, indexed.
Operating Systems Files, Directory and File Systems Operating Systems Files, Directory and File Systems.
Part III Storage Management
1 Lecture 15: File System Interface  file system interface reasons for delegating storage management to OS file definition and operations on a file file.
W4118 Operating Systems Instructor: Junfeng Yang.
COMP 3500 Introduction to Operating Systems Directory Structures Block Management Dr. Xiao Qin Auburn University
The need for File Systems Need to store data and programs in files Must be able to store lots of data Must be nonvolatile and survive crashes and power.
File Systems and Disk Management
Chapter 11: File System Implementation
File System Structure How do I organize a disk into a file system?
Filesystems.
The UNIX File System Jerry Breecher Contains sections on:
Chapter 11: File System Implementation
File Systems and Disk Management
CS510 Operating System Foundations
File Systems: Fundamentals.
Directory Structure A collection of nodes containing information about all files Directory Files F 1 F 2 F 3 F 4 F n Both the directory structure and the.
Introduction to Operating Systems
File Systems and Disk Management
File Systems and Disk Management
File Systems and Disk Management
File System Implementation
File Systems and Disk Management
Chapter 14: File-System Implementation
File Systems and Disk Management
File Systems and Disk Management
SE350: Operating Systems Lecture 12: File Systems.
Chapter 14: File System Implementation
The File Manager Implementation issues
File Systems CSE 2431: Introduction to Operating Systems
Presentation transcript:

COS 318: Operating Systems File Layout and Directories

Topics File system structure Disk allocation and i-nodes Directory and link implementations Physical layout for performance

File System Components Naming File and directory naming Local and remote operations File access Implement read/write and other functionalities Buffer cache Reduce client/server disk I/Os Disk allocation File data layout Mapping files to disk blocks Management Tools for system administrators to manage file systems File naming File access Management Buffer cache Disk allocation Volume manager

Steps to Open A File . File system info Process control block File name lookup and authenticate Copy the file descriptors into the in-memory data structure, if it is not in yet Create an entry in the open file table (system wide) if there isn’t one Create an entry in PCB Link up the data structures Return a pointer to user File system info Process control block Open file table (system-wide) File descriptors (Metadata) File descriptors Directories File data Open file pointer array .

File Read and Write Read 10 bytes from a file starting at byte 2? seek byte 2 fetch the block read 10 bytes Write 10 bytes to a file starting at byte 2? write 10 bytes in memory write out the block

Disk Layout Boot block Super-block defines a file system File descriptors (i-node in Unix) File data blocks Boot block Code to bootstrap the operating system Super-block defines a file system Size of the file system Size of the file descriptor area Free list pointer, or pointer to bitmap Location of the file descriptor of the root directory Other meta-data such as permission and various times Kernel keeps in main memory, and is replicated on disk too File descriptors Each describes a file File data blocks Data for the files, the largest portion on disk

Data Structures for Disk Allocation The goal is to manage the allocation of a volume A file header for each file Disk blocks associated with each file A data structure to represent free space on disk Bit map that uses 1 bit per block (sector) Linked list that chains free blocks together … 11111111111111111000000000000000 00000111111110000000000111111111 … 11000001111000111100000000000000 Free link link addr … addr size size

Contiguous Allocation Request in advance for the size of the file Search bit map or linked list to locate a space File header First block in file Number of blocks Pros Fast sequential access Easy random access Cons External fragmentation (what if file C needs 3 blocks) Hard to grow files: may have to move (large) files on disk May need compaction File A File B File C

Linked Files (Alto) File header points to 1st block on disk A block points to the next Pros Can grow files dynamically Free list is similar to a file No external fragmentation or need to move files Cons Random access: horrible Even sequential access needs one seek per block Unreliable: losing a block means losing the rest File header . . . null

File Allocation Table (FAT) foo 217 Approach Table of “next pointers”, indexed by block Instead of pointer stored with block Directory entry points to 1st block of file Pros No need to traverse list to find a block Cache FAT table and traverse in memory Cons FAT table takes lots of space for large disk Hard to fit in memory; so may need seeks Pointers for all files on whole disk are interspersed in FAT table Need full table in memory, even for one file Solution: indexed files Keep block lists for different files together, and in different parts of disk 217 619 399 EOF 619 399 FAT

Single-Level Indexed Files A user declares max size A file header holds an array of pointers to point to disk blocks Pros Can grow up to a limit Random access is fast Cons Clumsy to grow beyond the limit Still lots of seeks Disk blocks File header

DEMOS (Cray-1) Idea Approach Pros & cons . . . Using contiguous allocation Allow non-contiguous Approach 10 (base,size) pointers Indirect for big files Pros & cons Can grow (max 10GB) fragmentation find free blocks (base,size) . data (base,size) . . data

Multi-Level Indexed Files (Unix) 13 Pointers in a header 10 direct pointers 11: 1-level indirect 12: 2-level indirect 13: 3-level indirect Pros & Cons In favor of small files Can grow Limit is 16G and lots of seek data data 1 2 . . data 11 12 13 . . data . . . data

What’s in Original Unix i-node? Mode: file type, protection bits, setuid, setgid bits Link count: number of directory entries pointing to this Uid: uid of the file owner Gid: gid of the file owner File size Times (access, modify, change) No filename?? 10 pointers to data blocks Single indirect pointer Double indirect pointer Triple indirect pointer

Extents Instead of using a fix-sized block, use a number of blocks XFS uses 8Kbyte block Max extent size is 2M blocks Index nodes need to have Block offset Length Starting block Is this approach better than the Unix i-node approach? Block offset length Starting block . . .

Naming Text name Index (i-node number) Icon Need to map it to index Ask users to specify i-node number Icon Need to map it to index or map it to text then to index

Directory Organization Examples Flat (CP/M) All files are in one directory Hierarchical (Unix) /u/cos318/foo Directory is stored in a file containing (name, i-node) pairs The name can be either a file or a directory Hierarchical (Windows) C:\windows\temp\foo Use the extension to indicate whether the entry is a directory

Mapping File Names to i-nodes Create/delete Create/delete a directory Open/close Open/close a directory for read and write Should this be the same or different from file open/close? Link/unlink Link/unlink a file Rename Rename the directory

Linear List Method Pros Cons /u/jps/ foo bar … veryLongFileName <FileName, i-node> pairs are linearly stored in a file Create a file Append <FileName, i-node> Delete a file Search for FileName Remove its pair from the directory Compact by moving the rest Pros Space efficient Cons Linear search Need to deal with fragmentation /u/jps/ foo bar … veryLongFileName <foo,1234> <bar, 1235> … <very LongFileName, 4567>

Tree Data Structure Method Pros Cons … Store <fileName, i-node> a tree data structure such as B-tree Create/delete/search in the tree data structure Pros Good for a large number of files Cons Inefficient for a small number of files More space Complex …

Hashing Method Pros Cons … Use a hash table to map FileName to i-node Space for name and metadata is variable sized Create/delete will trigger space allocation and free Pros Fast searching and relatively simple Cons Not as efficient as trees for very large directory (wasting space for the hash table) foo 1234 bar 1235 … foobar 4567

Disk I/Os to Read/Write A File Disk I/Os to access a byte of /u/cos318/foo Read the i-node and first data block of “/” Read the i-node and first data block of “u” Read the i-node and first data block of “cos318” Read the i-node and first data block of “foo” Disk I/Os to write a file Read the i-node of the directory and the directory file. Read or create the i-node of the file Read or create the file itself Write back the directory and the file Too many I/Os to traverse the directory Solution is to use Current Working Directory

Links Symbolic (soft) links Hard links Why symbolic or hard links? A symbolic link is just the name of the file Original owner still owns the file, really deleted on rm by owner Use a new i-node for the symbolic link ln –s source target Hard links A link to a file with the same i-node ln source target Delete may or may not remove the target depending on whether it is the last one (link reference count) Why symbolic or hard links? How would you implement them? Symbolic links can be to any file on the network, since it’s just a name Symbolic links are slow Hard links may have accouting issues: original owner may be charged even after they’ve deleted the file Symbolic links are kept as the name, in a file. Hard links via reference counts

Original Unix File System Simple disk layout Block size is sector size (512 bytes) i-nodes are on outermost cylinders Data blocks are on inner cylinders Use linked list for free blocks Issues Index is large Fixed max number of files i-nodes far from data blocks i-nodes for directory not close together Consecutive blocks can be anywhere Poor bandwidth (20Kbytes/sec even for sequential access!) i-node array

BSD FFS (Fast File System) Use a larger block size: 4KB or 8KB Allow large blocks to be chopped into fragments Used for little files and pieces at the ends of files Use bitmap instead of a free list Try to allocate contiguously 10% reserved disk space foo bar

FFS Disk Layout i-node subarray i-nodes are grouped together A portion of the i-node array on each cylinder Do you ever read i-nodes without reading any file blocks? 4 times more often than reading together examples: ls, make Overcome rotational delays Skip sector positioning to avoid the context switch delay Read ahead: read next block right after the first i-node subarray

What Has FFS Achieved? Performance improvements 20-40% of disk bandwidth for large files (10-20x original) Better small file performance (why?) We can still do a lot better Extent based instead of block based Use a pointer and size for all contiguous blocks (XFS, Veritas file system, etc) Synchronous metadata writes hurt small file performance Asynchronous writes with certain ordering (“soft updates”) Logging (talk about this later) Play with semantics (/tmp file systems)

Summary File system structure File metadata Directories Links Boot block, super block, file metadata, file data File metadata Consider efficiency, space and fragmentation Directories Consider the number of files Links Soft vs. hard Physical layout Where to put metadata and data