CSE 153 Design of Operating Systems Spring 2016 Lecture 13: File Systems.


1 CSE 153 Design of Operating Systems Spring 2016 Lecture 13: File Systems

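2 OS Abstractions
[Figure: three layers, with Applications on top of the Operating System on top of Hardware. The OS abstractions Process, File system, and Virtual memory sit above the hardware resources CPU, Disk, and RAM, respectively.]
CSE 153 – Lecture 13 – File Systems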

3 File Systems
- First we’ll discuss properties of physical disks
  - Structure
  - Performance
  - Scheduling
- Then we’ll discuss how we build file systems on them
  - Files
  - Directories
  - Sharing
  - Protection
  - File System Layouts
  - File Buffer Cache
  - Read Ahead

4 Disks and the OS
- Disks are messy physical devices:
  - Errors, bad blocks, missed seeks, etc.
- OS’s job is to hide this mess from higher level software
  - Low-level device control (initiate a disk read, etc.)
  - Higher-level abstractions (files, databases, etc.)

5 What’s Inside A Disk Drive?
[Figure: labeled photo of a disk drive showing the spindle, arm, actuator, platters, electronics (including a processor and memory!), and SCSI connector. Image courtesy of Seagate Technology.]

6 Physical Disk Structure
- Disk components
  - Platters
  - Surfaces
  - Tracks
  - Sectors
  - Cylinders
  - Arm
  - Heads
[Figure: disk diagram labeling the arm, heads, platter, surface, cylinder, track, and sector (512 bytes)]

7 Disk Geometry
- Disks consist of platters, each with two surfaces.
- Each surface consists of concentric rings called tracks.
- Each track consists of sectors separated by gaps.
[Figure: one surface around the spindle, labeling the tracks, track k, sectors, and gaps]

8 Disk Geometry (Multiple-Platter View)
- Aligned tracks form a cylinder.
[Figure: three platters (Platter 0 to Platter 2) on one spindle, with six surfaces (Surface 0 to Surface 5); cylinder k is the set of aligned tracks across all surfaces]

9 Disk Capacity
- Capacity: maximum number of bits that can be stored.
  - Vendors express capacity in units of gigabytes (GB), where 1 GB = 10^9 bytes (Lawsuit pending! Claims deceptive advertising).
- Capacity is determined by these technology factors:
  - Recording density (bits/in): number of bits that can be squeezed into a 1 inch segment of a track.
  - Track density (tracks/in): number of tracks that can be squeezed into a 1 inch radial segment.
  - Areal density (bits/in^2): product of recording and track density.
- Modern disks partition tracks into disjoint subsets called recording zones.
  - Each track in a zone has the same number of sectors, determined by the circumference of the innermost track.
  - Each zone has a different number of sectors/track.

10 Computing Disk Capacity
Capacity = (# bytes/sector) x (avg. # sectors/track) x (# tracks/surface) x (# surfaces/platter) x (# platters/disk)
Example:
  - 512 bytes/sector
  - 300 sectors/track (on average)
  - 20,000 tracks/surface
  - 2 surfaces/platter
  - 5 platters/disk
Capacity = 512 x 300 x 20,000 x 2 x 5 = 30,720,000,000 bytes = 30.72 GB
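
The capacity formula above can be checked with a few lines of Python. This is only a sketch: the function name `disk_capacity_bytes` and its parameter names are made up for illustration, and the numbers are the ones from the slide's example.

```python
def disk_capacity_bytes(bytes_per_sector, avg_sectors_per_track,
                        tracks_per_surface, surfaces_per_platter,
                        platters_per_disk):
    """Capacity as the product of the five geometry factors on the slide."""
    return (bytes_per_sector * avg_sectors_per_track * tracks_per_surface
            * surfaces_per_platter * platters_per_disk)


# The slide's example geometry.
capacity = disk_capacity_bytes(512, 300, 20_000, 2, 5)
print(capacity)           # 30720000000
print(capacity / 10**9)   # 30.72  (vendor gigabytes, 1 GB = 10^9 bytes)
```

Dividing by 2**30 instead of 10**9 gives a smaller number, which is the discrepancy behind the "deceptive advertising" remark on the previous slide.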

11 Disk Operation (Single-Platter View)
- The disk surface spins at a fixed rotational rate around the spindle.
- By moving radially, the arm can position the read/write head over any track.
- The read/write head is attached to the end of the arm and flies over the disk surface on a thin cushion of air.

12 Disk Operation (Multi-Platter View)
- Read/write heads move in unison from cylinder to cylinder.
[Figure: several platters on one spindle with a shared arm]

13 Disk Structure - top view of single platter
- Surface organized into tracks
- Tracks divided into sectors

14 Disk Access
Head in position above a track

15 Disk Access
Rotation is counter-clockwise

16 Disk Access – Read
About to read blue sector

17 Disk Access – Read
After reading blue sector

18 Disk Access – Read
Red request scheduled next

19 Disk Access – Seek
Seek to red’s track

20 Disk Access – Rotational Latency
Wait for red sector to rotate around

21 Disk Access – Read
Complete read of red

22 Disk Access – Service Time Components
[Figure: timeline from "After BLUE read" through "Seek for RED" and "Rotational latency" to "After RED read"; the service time is seek + rotational latency + data transfer]

23 Disk Access Time
- Average time to access some target sector is approximated by:
  - Taccess = Tavg seek + Tavg rotation + Tavg transfer
- Seek time (Tavg seek)
  - Time to position heads over the cylinder containing the target sector.
  - Typical Tavg seek is 3 to 9 ms.
- Rotational latency (Tavg rotation)
  - Time waiting for the first bit of the target sector to pass under the r/w head.
  - Tavg rotation = 1/2 x 1/RPM x 60 sec/1 min
  - Typical rotational rate = 7,200 RPM
- Transfer time (Tavg transfer)
  - Time to read the bits in the target sector.
  - Tavg transfer = 1/RPM x 1/(avg # sectors/track) x 60 sec/1 min

24 Disk Access Time Example
- Given:
  - Rotational rate = 7,200 RPM
  - Average seek time = 9 ms
  - Avg # sectors/track = 400
- Derived:
  - Tavg rotation = 1/2 x (60 sec / 7,200 RPM) x 1000 ms/sec ≈ 4 ms
  - Tavg transfer = (60 sec / 7,200 RPM) x (1 / 400 sectors/track) x 1000 ms/sec ≈ 0.02 ms
  - Taccess = 9 ms + 4 ms + 0.02 ms = 13.02 ms
- Important points:
  - Access time is dominated by seek time and rotational latency.
  - The first bit in a sector is the most expensive; the rest are essentially free.
  - SRAM access time is about 4 ns/doubleword, DRAM about 60 ns
    - For a 512-byte sector, disk is about 40,000 times slower than SRAM,
    - and about 2,500 times slower than DRAM.
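
The derivation above can be reproduced in code. The sketch below assumes the same simple model as the slides (average rotational latency is half a revolution, transfer time is one sector's share of a revolution); the function names are hypothetical. Note that the slide rounds the rotational latency down to 4 ms, so the unrounded total comes out slightly higher than 13.02 ms.

```python
def avg_rotation_ms(rpm):
    """Average rotational latency: half of one full revolution, in ms."""
    return 0.5 * (60.0 / rpm) * 1000.0


def avg_transfer_ms(rpm, avg_sectors_per_track):
    """Time for one sector to pass under the head, in ms."""
    return (60.0 / rpm) / avg_sectors_per_track * 1000.0


def access_time_ms(avg_seek_ms, rpm, avg_sectors_per_track):
    """Taccess = Tavg seek + Tavg rotation + Tavg transfer."""
    return (avg_seek_ms
            + avg_rotation_ms(rpm)
            + avg_transfer_ms(rpm, avg_sectors_per_track))


# The slide's example: 7,200 RPM, 9 ms average seek, 400 sectors/track.
print(f"{avg_rotation_ms(7200):.2f} ms")         # 4.17 ms
print(f"{avg_transfer_ms(7200, 400):.2f} ms")    # 0.02 ms
print(f"{access_time_ms(9, 7200, 400):.2f} ms")  # 13.19 ms
```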

25 Logical Disk Blocks
- Modern disks present a simpler abstract view of the complex sector geometry:
  - The set of available sectors is modeled as a sequence of b-sized logical blocks (0, 1, 2, ...)
- Mapping between logical blocks and actual (physical) sectors
  - Maintained by a hardware/firmware device called the disk controller.
  - Converts requests for logical blocks into (surface, track, sector) triples.
- Allows controller to set aside spare cylinders for each zone.
  - Accounts for the difference between “formatted capacity” and “maximum capacity”.
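
To make the logical-block idea concrete, here is a deliberately idealized sketch of the translation a controller performs. It assumes one logical block per sector, the same number of sectors on every track (no recording zones), and no spare or remapped sectors; real firmware handles all of those, and its actual mapping is not specified in these slides. The names `PhysicalAddress` and `logical_to_physical` are made up for illustration.

```python
from typing import NamedTuple


class PhysicalAddress(NamedTuple):
    surface: int
    track: int
    sector: int


def logical_to_physical(block, sectors_per_track, surfaces):
    """Map a logical block number to a (surface, track, sector) triple.

    Assumption: blocks fill one cylinder (the same track on every surface)
    before moving on to the next track, so consecutive blocks need no seek.
    """
    if block < 0:
        raise ValueError("logical block numbers start at 0")
    sectors_per_cylinder = sectors_per_track * surfaces
    track, within_cylinder = divmod(block, sectors_per_cylinder)
    surface, sector = divmod(within_cylinder, sectors_per_track)
    return PhysicalAddress(surface, track, sector)


# Hypothetical geometry: 400 sectors/track, 10 surfaces.
print(logical_to_physical(0, 400, 10))     # surface 0, track 0, sector 0
print(logical_to_physical(400, 400, 10))   # surface 1, track 0, sector 0
print(logical_to_physical(4123, 400, 10))  # surface 0, track 1, sector 123
```

Because the OS only ever sees the block number, the controller is free to change this mapping, for example to route around a bad sector.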

26 I/O Bus
[Figure: the CPU chip (register file, ALU, bus interface) connects over the system bus to the I/O bridge, which connects over the memory bus to main memory and over the I/O bus to the USB controller (mouse, keyboard), graphics adapter (monitor), disk controller (disk), and expansion slots for other devices such as network adapters.]

27 Reading a Disk Sector (1)
CPU initiates a disk read by writing a command, logical block number, and destination memory address to a port (address) associated with the disk controller.
[Figure: same I/O bus diagram as slide 26]

28 Reading a Disk Sector (2)
Disk controller reads the sector and performs a direct memory access (DMA) transfer into main memory.
[Figure: same I/O bus diagram as slide 26]

29 Reading a Disk Sector (3)
When the DMA transfer completes, the disk controller notifies the CPU with an interrupt (i.e., asserts a special “interrupt” pin on the CPU).
[Figure: same I/O bus diagram as slide 26]

30 Disk Interaction
- Specifying disk requests requires a lot of info:
  - Cylinder #, surface #, track #, sector #, transfer size…
- Older disks required the OS to specify all of this
  - The OS needed to know all disk parameters
- Modern disks are more complicated
  - Not all sectors are the same size, sectors are remapped, etc.
- Current disks provide a higher-level interface (SCSI)
  - The disk exports its data as a logical array of blocks [0…N]
    - Disk maps logical blocks to cylinder/surface/track/sector
  - Only need to specify the logical block # to read/write
  - But now the disk parameters are hidden from the OS

31 Disk Heterogeneity
- Seagate Barracuda 3.5" (workstation)
  - capacity: 250 - 750 GB
  - rotational speed: 7,200 RPM
  - sequential read performance: 78 MB/s (outer) - 44 MB/s (inner)
  - seek time (average): 8.1 ms
- Seagate Cheetah 3.5" (server)
  - capacity: 73 - 300 GB
  - rotational speed: 15,000 RPM
  - sequential read performance: 135 MB/s (outer) - 82 MB/s (inner)
  - seek time (average): 3.8 ms
- Seagate Savvio 2.5" (smaller form factor)
  - capacity: 73 GB
  - rotational speed: 10,000 RPM
  - sequential read performance: 62 MB/s (outer) - 42 MB/s (inner)
  - seek time (average): 4.3 ms

32 Recent: Seagate Enterprise (2014)
- 6 TB! 1 Tb/in^2 (announced July)
- 6 (3.5”) platters, 2 heads each
- Perpendicular recording
- 7,200 RPM, 4.16 ms latency
- 216 MB/sec sustained transfer speed
- 128 MB cache
- Error characteristics:
  - MTBF: 1.4 x 10^6 hours
  - Bit error rate: 10^-15
- Special considerations:
  - Normally needs special “BIOS” (EFI): bigger than easily handled by 32-bit OSes.
  - Seagate provides special “Disk Wizard” software that virtualizes the drive into multiple chunks, which makes it bootable on these OSes.

33 Storage Performance & Price
          Bandwidth (sequential R/W)          Cost/GB        Size
HDD       50-100 MB/s                         $0.05-0.1/GB   2-4 TB
SSD [1]   200-500 MB/s (SATA), 6 GB/s (PCI)   $1.5-5/GB      200 GB-1 TB
DRAM      10-16 GB/s                          $5-10/GB       64-256 GB
- Bandwidth: SSD is up to 10x faster than HDD; DRAM is more than 10x faster than SSD
- Price: HDD is about 30x cheaper than SSD; SSD is about 4x cheaper than DRAM
[1] http://www.fastestssd.com/featured/ssd-rankings-the-fastest-solid-state-drives/

34 Contrarian View
- A lot of file system ideas tied to disk drives are no longer relevant?

35 16 TB SSD announced

36 Disk Scheduling
- Because seeks are so expensive (milliseconds!), the OS schedules requests that are queued waiting for the disk
  - FCFS (do nothing)
    - Reasonable when load is low
    - Does nothing to minimize overhead of seeks
  - SSTF (shortest seek time first)
    - Minimize arm movement (seek time), maximize request rate
    - Favors middle blocks, potential starvation of blocks at ends
  - SCAN (elevator)
    - Service requests in one direction until done, then reverse
    - Long waiting times for blocks at ends
  - C-SCAN
    - Like SCAN, but only go in one direction (typewriter)
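
The four policies above can be compared on a small request queue. The following is a minimal sketch, not how any particular OS implements them: requests are plain cylinder numbers, SCAN and C-SCAN sweep toward higher cylinders first and turn around at the last pending request rather than at the physical edge of the disk, and the C-SCAN total counts the return seek. The function names, the starting head position, and the request queue are all made up for illustration.

```python
def fcfs(head, requests):
    """First come, first served: keep arrival order."""
    return list(requests)


def sstf(head, requests):
    """Shortest seek time first: always pick the closest pending cylinder."""
    pending, order, pos = list(requests), [], head
    while pending:
        nearest = min(pending, key=lambda c: abs(c - pos))
        pending.remove(nearest)
        order.append(nearest)
        pos = nearest
    return order


def scan(head, requests):
    """Elevator: sweep upward, then reverse and sweep downward."""
    up = sorted(c for c in requests if c >= head)
    down = sorted((c for c in requests if c < head), reverse=True)
    return up + down


def c_scan(head, requests):
    """Like SCAN, but after the upward sweep jump back and sweep upward again."""
    up = sorted(c for c in requests if c >= head)
    wrapped = sorted(c for c in requests if c < head)
    return up + wrapped


def total_seek_distance(head, order):
    """Total number of cylinders the arm travels to serve `order`."""
    distance, pos = 0, head
    for cylinder in order:
        distance += abs(cylinder - pos)
        pos = cylinder
    return distance


head = 53
queue = [98, 183, 37, 122, 14, 124, 65, 67]
for policy in (fcfs, sstf, scan, c_scan):
    order = policy(head, queue)
    print(policy.__name__, order, total_seek_distance(head, order))
# fcfs   -> 640 cylinders
# sstf   -> 236 cylinders
# scan   -> 299 cylinders
# c_scan -> 322 cylinders
```

SSTF wins on total arm movement here, but a steady stream of requests near the head could postpone cylinder 183 indefinitely, which is the starvation risk the slide mentions.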

37 Disk Scheduling (2)
- In general, unless there are request queues, disk scheduling does not have much impact
  - Important for servers, less so for PCs
- Modern disks often do the disk scheduling themselves
  - Disks know their layout better than the OS and can optimize better
  - The disk ignores or undoes any scheduling done by the OS

38 File Systems
- File systems
  - Implement an abstraction (files) for secondary storage
  - Organize files logically (directories)
  - Permit sharing of data between processes, people, and machines
  - Protect data from unwanted access (security)

39 Files
- A file is a sequence of bytes with some properties
  - Owner, last read/write time, protection, etc.
- A file can also have a type
  - Understood by the file system
    - Block, character, device, portal, link, etc.
  - Understood by other parts of the OS or runtime libraries
    - Executable, DLL, source, object, text, etc.
- A file’s type can be encoded in its name or contents
  - Windows encodes type in name
    - .com, .exe, .bat, .dll, .jpg, etc.
  - Unix encodes type in contents
    - Magic numbers, initial characters (e.g., #! for shell scripts)

40 Basic File Operations
Unix
- creat(name)
- open(name, how)
- read(fd, buf, len)
- write(fd, buf, len)
- sync(fd)
- seek(fd, pos)
- close(fd)
- unlink(name)
NT
- CreateFile(name, CREATE)
- CreateFile(name, OPEN)
- ReadFile(handle, …)
- WriteFile(handle, …)
- FlushFileBuffers(handle, …)
- SetFilePointer(handle, …)
- CloseHandle(handle, …)
- DeleteFile(name)
- CopyFile(name)
- MoveFile(name)
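
Python's `os` module exposes thin wrappers around the Unix-style calls listed above, so the sequence create, write, sync, seek, read, close, unlink can be tried directly. This is a small sketch: the slide's `sync` and `seek` correspond here to `os.fsync` and `os.lseek`, and the helper name `file_ops_demo` and the file name are made up for illustration.

```python
import os
import tempfile


def file_ops_demo():
    """Create a file, write to it, seek back into it, read, then delete it."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "demo.txt")

    # creat/open: returns a small integer file descriptor.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        os.write(fd, b"hello, disk")   # write(fd, buf, len)
        os.fsync(fd)                   # force buffered data out to the device
        os.lseek(fd, 7, os.SEEK_SET)   # seek(fd, pos): move the file offset
        data = os.read(fd, 4)          # read(fd, buf, len) from offset 7
    finally:
        os.close(fd)                   # close(fd)

    os.unlink(path)                    # unlink(name): remove the directory entry
    os.rmdir(directory)
    return data


print(file_ops_demo())  # b'disk'
```

The explicit seek before the read is what makes this direct (random) access rather than purely sequential access.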

41 File Access Methods
- Different file systems differ in the manner that data in a file can be accessed
  - Sequential access: read bytes one at a time, in order
  - Direct access: random access given block/byte number
  - Record access: file is an array of fixed- or variable-length records, read/written sequentially or randomly by record #
  - Indexed access: file system contains an index to a particular field of each record in a file; reads specify a value for that field and the system finds the record via the index (DBs)
- Older systems provide more complicated methods
- What file access method do Unix, Windows provide?

42 Directories
- Directories serve two purposes
  - For users, they provide a structured way to organize files
  - For the file system, they provide a convenient naming interface that allows the implementation to separate logical file organization from physical file placement on the disk
- Most file systems support multi-level directories
  - Naming hierarchies (/, /usr, /usr/local/, …)
- Most file systems support the notion of a current directory
  - Relative names specified with respect to current directory
  - Absolute names start from the root of directory tree

43 Directory Internals
- A directory is a list of entries
  - <name, location>
  - Name is just the name of the file or directory
  - Location depends upon how the file is represented on disk
- List is usually unordered (effectively random)
  - Entries usually sorted by the program that reads the directory
- Directories typically stored in files
  - Only need to manage one kind of secondary storage unit

44 Basic Directory Operations
Unix
- Directories implemented in files
  - Use file ops to create dirs
- C runtime library provides a higher-level abstraction for reading directories
  - opendir(name)
  - readdir(DIR)
  - seekdir(DIR)
  - closedir(DIR)
Windows
- Explicit dir operations
  - CreateDirectory(name)
  - RemoveDirectory(name)
- Very different method for reading directory entries
  - FindFirstFile(pattern)
  - FindNextFile()

45 Path Name Translation
- Let’s say you want to open “/one/two/three”
- What does the file system do?
  - Open directory “/” (well known, can always find)
  - Search for the entry “one”, get location of “one” (in dir entry)
  - Open directory “one”, search for “two”, get location of “two”
  - Open directory “two”, search for “three”, get location of “three”
  - Open file “three”
- Systems spend a lot of time walking directory paths
  - This is why open is separate from read/write
  - OS will cache prefix lookups for performance
    - /a/b, /a/bb, /a/bbb, etc., all share “/a” prefix
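
The walk described above, together with prefix caching, can be sketched on a toy in-memory "disk". Everything here is hypothetical: locations are small integers, a directory is a dict of name to location, a file is a bytes object, and the cache is a plain dict keyed by absolute path. Real systems use on-disk structures and a more elaborate name cache.

```python
ROOT = 0  # location of "/", assumed to be well known

# Toy disk: location -> directory (name -> location) or file contents.
disk = {
    0: {"one": 1},
    1: {"two": 2},
    2: {"three": 3, "four": 4},
    3: b"contents of three",
    4: b"contents of four",
}


def resolve(path, cache, stats):
    """Translate an absolute path to a location, one component at a time.

    `cache` maps already-resolved prefixes to locations; `stats["searches"]`
    counts how many directories actually had to be searched.
    """
    location = ROOT
    prefix = ""
    for name in (part for part in path.split("/") if part):
        prefix += "/" + name
        if prefix in cache:             # prefix already translated earlier
            location = cache[prefix]
            continue
        directory = disk[location]      # "open" the current directory
        if not isinstance(directory, dict):
            raise NotADirectoryError(prefix)
        if name not in directory:
            raise FileNotFoundError(prefix)
        stats["searches"] += 1          # one directory search per component
        location = directory[name]
        cache[prefix] = location
    return location


cache, stats = {}, {"searches": 0}
print(disk[resolve("/one/two/three", cache, stats)], stats["searches"])
# b'contents of three' 3
print(disk[resolve("/one/two/four", cache, stats)], stats["searches"])
# b'contents of four' 4   (only "four" needed a new search)
```

The second lookup reuses the cached "/one" and "/one/two" prefixes, which is the effect the last bullet describes.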

46 File Sharing
- File sharing is important for getting work done
  - Basis for communication between processes and users
- Two key issues when sharing files
  - Semantics of concurrent access
    - What happens when one process reads while another writes?
    - What happens when two processes open a file for writing?
  - Protection

47 Protection
- File systems implement some kind of protection system
  - Who can access a file
  - How they can access it
- More generally…
  - Objects are “what”, subjects are “who”, actions are “how”
- A protection system dictates whether a given action performed by a given subject on a given object should be allowed
  - You can read and/or write your files, but others cannot
  - You can read “/etc/motd”, but you cannot write to it

48 Representing Protection
Access Control Lists (ACL)
- For each object, maintain a list of subjects and their permitted actions
Capabilities
- For each subject, maintain a list of objects and their permitted actions
[Figure: access matrix with subjects as rows and objects as columns; a column is an ACL, a row is a capability list]
          /one   /two   /three
Alice     r      w      -
Bob       w      -      r
Charlie   w      r      rw

49 ACLs and Capabilities
- The approaches differ only in how the table is represented
  - What approach does Unix use?
- Capabilities are easier to transfer
  - They are like keys: they can be handed off and do not depend on the subject
- In practice, ACLs are easier to manage
  - Object-centric, easy to grant and revoke
  - To revoke capabilities, you have to keep track of all subjects that have the capability, which is a challenging problem
- ACLs have a problem when objects are heavily shared
  - The ACLs become very large
  - Use groups (e.g., Unix)
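
The point that ACLs and capabilities are two representations of the same table can be shown with a small sketch. The access matrix below is hypothetical (the subject and object names follow the slides, but the specific rights are only illustrative), and the helper names are made up. An ACL view groups the matrix by object; a capability view groups it by subject; both answer every access question the same way.

```python
# Hypothetical access matrix: (subject, object) -> string of allowed actions.
access_matrix = {
    ("Alice", "/one"): "r", ("Alice", "/two"): "w",
    ("Bob", "/one"): "w", ("Bob", "/three"): "r",
    ("Charlie", "/one"): "w", ("Charlie", "/two"): "r",
    ("Charlie", "/three"): "rw",
}


def build_acls(matrix):
    """Object-centric view: object -> {subject: rights} (one matrix column)."""
    acls = {}
    for (subject, obj), rights in matrix.items():
        acls.setdefault(obj, {})[subject] = rights
    return acls


def build_capabilities(matrix):
    """Subject-centric view: subject -> {object: rights} (one matrix row)."""
    caps = {}
    for (subject, obj), rights in matrix.items():
        caps.setdefault(subject, {})[obj] = rights
    return caps


def allowed_by_acl(acls, subject, obj, action):
    return action in acls.get(obj, {}).get(subject, "")


def allowed_by_capability(caps, subject, obj, action):
    return action in caps.get(subject, {}).get(obj, "")


acls = build_acls(access_matrix)
caps = build_capabilities(access_matrix)
print(acls["/three"])   # {'Bob': 'r', 'Charlie': 'rw'}
print(caps["Charlie"])  # {'/one': 'w', '/two': 'r', '/three': 'rw'}
print(allowed_by_acl(acls, "Alice", "/three", "r"))         # False
print(allowed_by_capability(caps, "Charlie", "/two", "r"))  # True
```

Revoking everyone's access to "/three" is a single delete in the ACL view (`del acls["/three"]`), while in the capability view every subject's list has to be searched, which mirrors the management trade-off on the slide.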

