Avishai Wool lecture 12 - 1 Introduction to Systems Programming Lecture 12 File Systems.


1 Avishai Wool lecture 12 - 1 Introduction to Systems Programming Lecture 12 File Systems

2 Avishai Wool lecture 12 - 2 Long-term Information Storage: Computers have long-term storage devices (hard disks, floppy disks, tapes, CD-ROM, DVD). The hardware and disk driver expose a block interface: “read/write 512 bytes from block NN”. This is too crude for most applications. Next level of organization: organize the disk into files.
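The block interface described on this slide can be sketched in C on a POSIX system. This is only an illustration, assuming the "disk" is an ordinary file or device opened with open(); the names read_block, write_block and BLOCK_SIZE are hypothetical and do not come from the lecture.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* open() flags for callers */
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 512  /* block size quoted on the slide */

/* Read block number block_no into buf (BLOCK_SIZE bytes).
 * Returns 0 on success, -1 on error or short read. */
int read_block(int fd, uint64_t block_no, unsigned char *buf)
{
    off_t offset = (off_t)(block_no * BLOCK_SIZE);
    return pread(fd, buf, BLOCK_SIZE, offset) == BLOCK_SIZE ? 0 : -1;
}

/* Write buf (BLOCK_SIZE bytes) to block number block_no.
 * Returns 0 on success, -1 on error or short write. */
int write_block(int fd, uint64_t block_no, const unsigned char *buf)
{
    off_t offset = (off_t)(block_no * BLOCK_SIZE);
    return pwrite(fd, buf, BLOCK_SIZE, offset) == BLOCK_SIZE ? 0 : -1;
}
```

Everything a caller can say is "block NN"; names, sizes and growth are left to the layer above, which is the role of the file system.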

3 Avishai Wool lecture 12 - 3 What is a File? A computer file is a collection of information that can be identified and referenced in its entirety by a unique name. [Wikipedia] Unlike the disk, files can get created, deleted, and renamed, and can grow larger and smaller. The OS maintains the mapping between files and storage units. This work is done by an OS component called a “file system”.

4 Avishai Wool lecture 12 - 4 File Names: Unix: long file names (255 bytes), may include special characters, case sensitive (Avishai.txt != avishai.txt). MS-DOS: file name = 8 chars, ‘.’, 3-char extension; case insensitive (Grades.txt, grades.txt, GRADES.txt are all the same file).

5 Avishai Wool lecture 12 - 5 File Structure: In Unix and Windows, a file has no structure that the OS knows about; it is just a sequence of characters (bytes). A file can occupy only part of a disk block. A program can impose its own organization inside a file. The OS does not know where the lines start or end! Conventions for text files: on Unix, lines end with “line feed” (LF, 0x0A); on DOS/Windows, lines end with “carriage return + line feed” (CR+LF, 0x0D 0x0A).

6 Avishai Wool lecture 12 - 6 File Access: Sequential access: read all bytes, in order, from the beginning; cannot jump around (too much), but can rewind or back up; efficient, because disks are optimized for this access pattern. Random access: read bytes in any order; essential for database systems. The OS maintains a “file marker” for each open file, which is the offset of the next byte to be read. The “seek” system call moves the file marker.
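On Unix the "seek" call is lseek(). The following is a minimal sketch of random access through the file marker; the helper names byte_at and file_marker are made up for this illustration.

```c
#include <fcntl.h>      /* open() flags for callers */
#include <sys/types.h>
#include <unistd.h>

/* Return the byte stored at the given offset, or -1 on error or end of file.
 * On success the file marker is left at offset + 1. */
int byte_at(int fd, off_t offset)
{
    unsigned char c;
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;                 /* could not move the marker */
    if (read(fd, &c, 1) != 1)
        return -1;                 /* error, or nothing to read there */
    return c;
}

/* Return the current position of the file marker without moving it. */
off_t file_marker(int fd)
{
    return lseek(fd, 0, SEEK_CUR);
}
```

Each read() advances the marker by the number of bytes read, which is why plain sequential access needs no seek calls at all.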

7 Avishai Wool lecture 12 - 7 File Attributes Windows

8 Avishai Wool lecture 12 - 8 File operations / system calls: Create, Delete, Open, Close, Read, Write, Append, Seek, Get attributes, Set attributes, Rename

9 Avishai Wool lecture 12 - 9 Hierarchical Directory Systems

10 Avishai Wool lecture 12 - 10 Path Names: Absolute path name: on Unix, /usr/home/yash/teaching/isp.txt; on Windows, c:\My Documents\teaching\isp.txt. Relative path name: use the “current directory” as a starting point; “.” (dot) means “this directory”; “..” (dotdot) means “parent of this directory”. If the current directory is /usr/home/yash/research, then ../teaching/isp.txt is the same file as
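How “.” and “..” combine with the current directory can be sketched as a purely lexical resolution in C. This is an illustration only: the function name resolve_path is hypothetical, the 1024-byte work buffer is an arbitrary limit, and a real kernel resolves names through the directory tree (and symbolic links) rather than by string manipulation.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

/* Lexically resolve path against cwd and store an absolute path in out.
 * Returns 0 on success, -1 if a buffer is too small. */
int resolve_path(const char *cwd, const char *path, char *out, size_t out_size)
{
    char work[1024];
    int n = (path[0] == '/')
          ? snprintf(work, sizeof work, "%s", path)          /* already absolute */
          : snprintf(work, sizeof work, "%s/%s", cwd, path); /* start at cwd */
    if (n < 0 || (size_t)n >= sizeof work || out_size < 2)
        return -1;

    size_t len = 0;
    char *save = NULL;
    out[0] = '\0';
    for (char *tok = strtok_r(work, "/", &save); tok != NULL;
         tok = strtok_r(NULL, "/", &save)) {
        if (strcmp(tok, ".") == 0)
            continue;                        /* "this directory": no change */
        if (strcmp(tok, "..") == 0) {        /* "parent": drop last component */
            while (len > 0 && out[len - 1] != '/')
                len--;
            if (len > 0)
                len--;                       /* drop the separating slash too */
            out[len] = '\0';
            continue;
        }
        size_t tl = strlen(tok);
        if (len + 1 + tl + 1 > out_size)
            return -1;
        out[len++] = '/';
        memcpy(out + len, tok, tl + 1);      /* copies the terminating NUL */
        len += tl;
    }
    if (len == 0) {                          /* everything cancelled: root */
        out[0] = '/';
        out[1] = '\0';
    }
    return 0;
}
```

With cwd set to /usr/home/yash/research and path set to ../teaching/isp.txt, this sketch produces /usr/home/yash/teaching/isp.txt.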

11 Avishai Wool lecture 12 - 11 A Unix Program Using File System Calls (1/2) Called by “copyfile abc xyz”

12 Avishai Wool lecture 12 - 12 A Unix Program Using File System Calls (2/2)
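The program on these two slides did not survive in the transcript. As a stand-in, here is a minimal sketch of a file copy built from the open, read, write and close system calls; it is not the lecture's original code, and the names copy_file, BUF_SIZE and OUTPUT_MODE are assumptions made for this illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define BUF_SIZE 4096      /* size of the copy buffer (arbitrary choice) */
#define OUTPUT_MODE 0644   /* permission bits for the new file */

/* Copy the file src to dst. Returns 0 on success, -1 on any error. */
int copy_file(const char *src, const char *dst)
{
    char buf[BUF_SIZE];
    ssize_t n_read;
    int result = 0;

    int in_fd = open(src, O_RDONLY);
    if (in_fd < 0)
        return -1;
    int out_fd = open(dst, O_WRONLY | O_CREAT | O_TRUNC, OUTPUT_MODE);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }

    /* Read a buffer at a time; write() may accept fewer bytes than asked. */
    while (result == 0 && (n_read = read(in_fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n_read) {
            ssize_t n = write(out_fd, buf + done, (size_t)(n_read - done));
            if (n <= 0) {
                result = -1;
                break;
            }
            done += n;
        }
    }
    if (n_read < 0)
        result = -1;               /* read error */

    close(in_fd);
    if (close(out_fd) < 0)
        result = -1;
    return result;
}
```

A main() that checks argc == 3 and calls copy_file(argv[1], argv[2]) would give the “copyfile abc xyz” command line mentioned on the slide.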

13 Avishai Wool lecture 12 - 13 Memory-Mapped Files: Some operating systems allow another interface: char *p = mmap(filename, …) maps the contents of the file to a memory area. Now p[0], p[1], … are the bytes of the file (implicit open and read of the whole file). p[3] = 'a' puts the character 'a' into the file. munmap(…) writes the file back to disk and closes it.

14 Avishai Wool lecture 12 - 14 Under the hood: The mmap() call does not read the file. It makes the file the “backing store” of the pages of the memory area. Accessing the memory area causes page faults that bring the data into or out of memory. munmap() removes the mapping, and the modified pages are flushed to disk (on POSIX systems, msync() can be used to force the flush explicitly).

15 Avishai Wool lecture 12 - 15 Properties of memory-mapped files: Can’t extend the file: on Unix the length is specified in the call to mmap(), and the operating system determines the exact semantics. Can be used to share memory between processes: both processes mmap() the same file and see each other’s changes to the content, so concurrency control is needed (semaphore, mutex, etc.). Dangerous to access a file via mmap() and fread() at the same time: the contents are unpredictable until munmap()!
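The slides write mmap(filename, …) as shorthand. On a POSIX system the real call takes an open file descriptor and a length, so a sketch of the “p[3] = 'a'” example looks roughly like this. The function name poke_byte is hypothetical, and the explicit msync() is a choice made here so the write-back does not depend on when the OS flushes pages.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Overwrite the byte at position index of an existing file through a
 * shared memory mapping. Returns 0 on success, -1 on error. */
int poke_byte(const char *path, off_t index, char value)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || index < 0 || index >= st.st_size) {
        close(fd);
        return -1;          /* a mapping cannot extend the file */
    }
    size_t len = (size_t)st.st_size;

    /* MAP_SHARED makes the file itself the backing store of these pages. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);              /* the mapping remains valid after close() */
    if (p == MAP_FAILED)
        return -1;

    p[index] = value;       /* an ordinary memory store changes the file */

    int rc = msync(p, len, MS_SYNC);   /* force the dirty page to disk */
    if (munmap(p, len) < 0)
        rc = -1;
    return rc;
}
```

The length check before mmap() reflects the first property on the slide: the size of the mapping is fixed when it is created.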

16 Avishai Wool lecture 12 - 16 File System Implementations

17 Avishai Wool lecture 12 - 17 A possible file system layout Structure of disk on all PCs Internal structure of a partition of a Unix file-system

18 Avishai Wool lecture 12 - 18 Basic disk organization: MBR = Master Boot Record: sector 0 of the disk, containing the initial program loaded at power-up. Partition table: divides the physical disk into logical disks (C:, D:, etc.). Each partition is viewed as a sequential array of numbered blocks, modeling the physical sectors.
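The slide does not give the on-disk layout of the partition table. As a sketch, assuming the classic PC MBR layout (four 16-byte entries starting at byte 446 of sector 0, followed by the signature bytes 0x55 0xAA), the table can be decoded as follows. The struct and function names are hypothetical, and only three fields of each entry are extracted.

```c
#include <stdint.h>

#define MBR_SIZE 512
#define PART_TABLE_OFFSET 446   /* assumed: classic PC MBR layout */
#define PART_ENTRY_SIZE 16

struct partition {
    uint8_t  type;          /* partition type code; 0 means unused */
    uint32_t first_sector;  /* starting sector (LBA) */
    uint32_t sector_count;  /* size in sectors */
};

/* Decode a 32-bit little-endian integer. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fill out[0..3] from a 512-byte copy of sector 0.
 * Returns 0 on success, -1 if the boot signature is missing. */
int parse_mbr(const uint8_t *mbr, struct partition out[4])
{
    if (mbr[510] != 0x55 || mbr[511] != 0xAA)
        return -1;
    for (int i = 0; i < 4; i++) {
        const uint8_t *e = mbr + PART_TABLE_OFFSET + i * PART_ENTRY_SIZE;
        out[i].type         = e[4];
        out[i].first_sector = le32(e + 8);
        out[i].sector_count = le32(e + 12);
    }
    return 0;
}
```

Each non-empty entry then defines one logical disk: blocks 0, 1, 2, … of the partition correspond to sectors first_sector, first_sector + 1, … of the physical disk.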

19 Avishai Wool lecture 12 - 19 Files: Contiguous allocation (a) Contiguous allocation of disk space for 7 files (b) State of the disk after files D and E have been removed

20 Avishai Wool lecture 12 - 20 Properties of contiguous allocation: Simple: need to keep only the start position and size. High performance: to read the whole file, just one seek followed by sequential reads. But: over time the disk becomes fragmented. A good design for CD-ROM and DVD: write once, file sizes known in advance.

21 Avishai Wool lecture 12 - 21 Files: linked list The links are on the disk (using block numbers)

22 Avishai Wool lecture 12 - 22 Properties of linked-list files: No fragmentation, but sequential access within a file slows down if the file's blocks are not consecutive, and random access is very slow (how to do “fseek to byte 100,000 in this file”?).

23 Avishai Wool lecture 12 - 23 Linked list: File Allocation Table Keep the list links in a separate table in memory

24 Avishai Wool lecture 12 - 24 Properties of a FAT: Random access is much better: the linked list is in memory, so there is no need to do a disk access for every link. But the whole FAT needs to be in memory to be efficient, and the FAT can be large.
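To make the random-access argument concrete, here is a sketch of following the chain in an in-memory FAT to find the disk block that holds a given byte offset. The table is modeled as an array indexed by block number whose entries hold the next block of the file; the end-of-chain value FAT_EOF and the function name are assumptions of this sketch, not the real on-disk encoding.

```c
#include <stdint.h>

#define FAT_EOF 0xFFFFu   /* hypothetical end-of-chain marker for this sketch */

/* Return the disk block that holds byte `offset` of a file whose first
 * block is first_block, or FAT_EOF if the offset is past the last block. */
uint16_t fat_block_for_offset(const uint16_t *fat, uint16_t first_block,
                              uint32_t block_size, uint32_t offset)
{
    uint16_t block = first_block;
    uint32_t hops = offset / block_size;   /* number of links to follow */

    while (hops > 0 && block != FAT_EOF) {
        block = fat[block];                /* memory access, no disk I/O */
        hops--;
    }
    return block;
}
```

The loop is still linear in the position within the file, but every step is an array lookup in RAM. With the links stored on disk, as in the plain linked-list scheme, each step would cost a disk read.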

25 Avishai Wool lecture 12 - 25 i-nodes

26 Avishai Wool lecture 12 - 26 Properties of i-nodes: Instead of a global file table, each file’s i-node keeps track of its own blocks. Only need to keep an i-node in memory if the file is open. Total memory (RAM) needed is proportional to (size of i-node) x (max number of open files).

27 Avishai Wool lecture 12 - 27 Example file systems

28 Avishai Wool lecture 12 - 28 The MS-DOS file system: File names 8+3 (UPPERCASE). No ownership: all files accessible to the user. Maintained via a File Allocation Table (FAT). Attributes: read-only, hidden, system, archive.

29 Avishai Wool lecture 12 - 29 The MS-DOS directory entry: Time is inaccurate: 2 bytes give only 65,536 distinct values, but there are 86,400 seconds per day, so the time is stored in 2-second units. Date uses 7 bits for the year, starting 1/1/1980, and runs out in 2107. First-block-number: index into the FAT, with 64K entries. 10 bytes (of 32) unused!
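The bit-level packing of the two 16-bit fields is not spelled out on the slide. A sketch of decoding them, assuming the usual FAT layout (date: 7 bits year since 1980, 4 bits month, 5 bits day; time: 5 bits hour, 6 bits minute, 5 bits seconds divided by 2), is shown below. The struct and function names are hypothetical.

```c
#include <stdint.h>

struct dos_datetime {
    int year, month, day;
    int hour, minute, second;
};

/* Unpack the 16-bit date and time fields of an MS-DOS directory entry. */
struct dos_datetime decode_dos_datetime(uint16_t date, uint16_t time)
{
    struct dos_datetime t;
    t.year   = 1980 + (date >> 9);      /* 7 bits: 0..127, so 1980..2107 */
    t.month  = (date >> 5) & 0x0F;      /* 4 bits */
    t.day    = date & 0x1F;             /* 5 bits */
    t.hour   = time >> 11;              /* 5 bits */
    t.minute = (time >> 5) & 0x3F;      /* 6 bits */
    t.second = (time & 0x1F) * 2;       /* 5 bits, counted in 2-second units */
    return t;
}
```

Five bits for the seconds field can hold only 32 values, which is where the 2-second granularity comes from, and the largest year value, 127, gives 1980 + 127 = 2107.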

30 Avishai Wool lecture 12 - 30 FAT-12/16: A block (also called a cluster) is a multiple of 512 bytes. FAT-12: 12-bit block addresses, 512-byte blocks; largest partition: 4096 x 512 = 2MB, OK for a floppy. For disks, MS allowed blocks of 1KB, 2KB, 4KB; largest partition: 4096 x 4KB = 16MB. FAT-16: switch to 16-bit addresses, block size up to 32KB; largest partition: 65536 x 32KB = 2GB.
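The partition limits on this slide all come from the same product: the number of addressable blocks times the block size. A small worked sketch (the function name is made up for illustration, and reserved FAT values are ignored):

```c
#include <stdint.h>

/* Largest partition, in bytes, addressable with address_bits-bit block
 * numbers and blocks of block_size bytes (reserved values ignored). */
uint64_t max_partition_bytes(unsigned address_bits, uint64_t block_size)
{
    return ((uint64_t)1 << address_bits) * block_size;
}
```

For example, max_partition_bytes(12, 512) is 2MB, max_partition_bytes(12, 4096) is 16MB, and max_partition_bytes(16, 32768) is 2GB, matching the three figures above.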

31 Avishai Wool lecture 12 - 31 FAT-32: (a) Win95 2nd Edition / Win98 / Win ME. (b) Really FAT-28: 28-bit block addresses. (c) Potentially 2^28 x 2^15 bytes per partition, but in reality only 2^41 bytes = 2TB. (d) The FAT itself now occupies a lot of RAM: for a 2GB disk with 4KB blocks → 512K blocks → the FAT uses 2MB of RAM.
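The RAM estimate in item (d) is the number of blocks times the size of one FAT entry. A worked sketch, assuming 4-byte entries for FAT-32 (the function name is hypothetical):

```c
#include <stdint.h>

/* Bytes of memory needed to hold the whole FAT for a disk of disk_bytes
 * bytes, with blocks of block_size bytes and entry_bytes per FAT entry. */
uint64_t fat_table_bytes(uint64_t disk_bytes, uint64_t block_size,
                         uint64_t entry_bytes)
{
    uint64_t blocks = disk_bytes / block_size;   /* one FAT entry per block */
    return blocks * entry_bytes;
}
```

For a 2GB disk with 4KB blocks there are 2^31 / 2^12 = 2^19 = 512K blocks, and at 4 bytes per entry the table takes 2^21 bytes = 2MB.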

32 Avishai Wool lecture 12 - 32 File system compatibility: The Win95 2e / Win98 file system added FAT-32 and long file names, but needed to allow older MS-DOS and Win95 systems to read directories (backward compatibility). Result: every file has 2 names (one 8+3, one long), and directory entries needed to be patched. Try “dir /x” in a cmd window to see. Still case-insensitive under the hood.

33 Avishai Wool lecture 12 - 33 The Unix V7 file system

34 Avishai Wool lecture 12 - 34 The Classical Unix file system: Invented with Unix V7 for the PDP-11 (1970s). 14-character names; all ASCII chars allowed except ‘/’ and NUL (0x00). Every file has a 2-byte i-node number, so there are at most 64K files per file system. Allows some weird filenames: “ ” (a single space), “ ” (a “newline” character).

35 Avishai Wool lecture 12 - 35 A UNIX V7 directory entry

36 Avishai Wool lecture 12 - 36 Reminder: Disk organization

37 Avishai Wool lecture 12 - 37 A UNIX i-node: Max file size, with d direct pointers and n pointers per indirect block: Blocksize * (d + n + n^2 + n^3)
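The maximum-file-size formula can be checked with a short sketch. The function names are hypothetical, and the numbers in the usage note (1KB blocks, d = 10, n = 256) are illustrative assumptions rather than values taken from the slide.

```c
#include <stdint.h>

/* Maximum file size in bytes for an i-node with d direct pointers plus one
 * single, one double and one triple indirect block, where each indirect
 * block holds n pointers. */
uint64_t max_file_size(uint64_t block_size, uint64_t d, uint64_t n)
{
    return block_size * (d + n + n * n + n * n * n);
}

/* How many indirect blocks must be traversed to reach file block idx:
 * 0 = direct, 1 = single, 2 = double, 3 = triple, -1 = beyond the maximum. */
int indirection_level(uint64_t idx, uint64_t d, uint64_t n)
{
    if (idx < d) return 0;
    idx -= d;
    if (idx < n) return 1;
    idx -= n;
    if (idx < n * n) return 2;
    idx -= n * n;
    if (idx < n * n * n) return 3;
    return -1;
}
```

With 1KB blocks and 4-byte pointers an indirect block holds n = 256 pointers; with d = 10 the formula gives 1024 x (10 + 256 + 65,536 + 16,777,216) bytes, a little over 16GB. The first 10 blocks of a file are reached directly, so small files never pay for an extra disk access.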

38 Avishai Wool lecture 12 - 38 The steps in looking up /usr/ast/mbox

39 Avishai Wool lecture 12 - 39 BSD Improvements: File names extended to 255 chars. Divide the disk into cylinder groups, and try to keep the i-node and the file close together to avoid long seeks. Use 2 block sizes, one for large files, one for small files. Similar improvements also appear in the Linux file system (ext2).

40 Avishai Wool lecture 12 - 40 The Win2000 (NTFS) file system

41 Avishai Wool lecture 12 - 41 NTFS: Designed from scratch. Not compatible with Win95 / Win98. Usually 4KB blocks (clusters). Blocks are referred to by 64-bit numbers. Main data structure: Master File Table (MFT). Each MFT entry describes a file or directory. MFT entry = 1KB. The MFT is itself a file and can be anywhere on disk.

42 Avishai Wool lecture 12 - 42 Block runs: Idea: the blocks of a file are often sequential on disk. A “run” is a set of consecutive blocks that belong to the same file. No need to keep a pointer to each block: it is enough to keep the start and length of each run.
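A sketch of how a list of (start, length) runs maps a block index within the file to a disk block. The struct layout and the function name are illustrative assumptions and do not reproduce the actual encoding of an MFT record.

```c
#include <stddef.h>
#include <stdint.h>

struct run {
    uint64_t start;    /* first disk block of the run */
    uint64_t length;   /* number of consecutive blocks in the run */
};

/* Map block number file_block (0-based, within the file) to a disk block.
 * Returns -1 if file_block lies beyond the last run. */
int64_t run_lookup(const struct run *runs, size_t n_runs, uint64_t file_block)
{
    for (size_t i = 0; i < n_runs; i++) {
        if (file_block < runs[i].length)
            return (int64_t)(runs[i].start + file_block);
        file_block -= runs[i].length;    /* skip this run entirely */
    }
    return -1;
}
```

For a hypothetical 9-block file stored as the three runs (20, 4), (64, 2), (80, 3), file block 4 is disk block 64 and file block 8 is disk block 82. Three pairs of numbers replace nine separate block pointers.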

43 Avishai Wool lecture 12 - 43 An MFT record for a 3-run, 9-block file

44 Avishai Wool lecture 12 - 44 Concepts for review: File; File name; File structure; Sequential access / Random access; File attributes; Hierarchical directories; Path names; Memory-mapped files; Master boot record (MBR); Partition table; Contiguous block allocation; File allocation table (FAT); i-node; NTFS MFT

