1 System Programming Chapter 3 File I/O
2 Announcement
The first exercise is due today. We will NOT accept late assignments this time.
–The submission site is up.
–Sign up for an account, activate it, and submit your exercise online if possible.
–You can also hand in your exercise to the TAs.
MP1 is out and is due on March 21st.
3 Fun Project Topobo (MIT media lab) A new way of “programming” without “writing programs”?
4 Tips for Writing Good Code – Design to Test
Avoiding errors is much easier than fixing them.
Tips:
–Write accessible code: be prepared to give your code away from the moment you start.
–Document your code.
–Give variables good names, or things can get very confusing, very fast, as the cognitive dissonance interferes with your brain's normal processing.
  InputFromFile is better than buf1/data_2.
  validsize(InputNum) is better than checksize(InputNum).
–Test your code automatically, not manually.
–Write unit tests.
5 Writing Unit Tests
Write a test module for your function. Example: double mySqrt(double num).
What should you test?
–Pass in a negative argument and ensure that it is rejected.
–Pass in an argument of zero and ensure that it is accepted.
–Pass in values between zero and the maximum expressible argument and verify that the difference between the square of the result and the original argument is less than some value epsilon.
6
#include <assert.h>
#include <math.h>

/* mySqrt() and epsilon are defined elsewhere in the test module. */
void testValue(double num, double expected)
{
    double result = mySqrt(num);
    if (num < 0) {
        /* Should be NaN if negative arg */
        assert(isnan(result));
    } else {
        /* Should be within tolerance otherwise */
        assert(fabs(expected - result) < epsilon);
    }
}
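The three test rules listed on the previous slide can be sketched as a small self-checking module. This is only an illustration: the Newton's-method body of mySqrt() and the helper name sqrtTestPasses() are assumptions made here, not code from the course.

```c
#include <math.h>

/* Hypothetical stand-in for the function under test (Newton's method). */
double mySqrt(double num)
{
    if (num < 0.0)
        return NAN;                       /* negative arguments are rejected */
    if (num == 0.0)
        return 0.0;                       /* zero is accepted */

    double x = (num > 1.0) ? num : 1.0;   /* starting guess >= sqrt(num) */
    for (int i = 0; i < 2000; i++) {
        double next = 0.5 * (x + num / x);
        int converged = fabs(next - x) <= 1e-15 * x;
        x = next;
        if (converged)
            break;
    }
    return x;
}

/* Hypothetical checker: returns 1 if mySqrt(num) obeys the slide's rules. */
int sqrtTestPasses(double num, double epsilon)
{
    double result = mySqrt(num);

    if (num < 0.0)
        return isnan(result) != 0;        /* rule 1: negative -> NaN */
    /* rules 2 and 3: the square of the result must be close to num */
    return fabs(result * result - num) < epsilon;
}
```

A test driver would call sqrtTestPasses() for a negative value, for zero, and for a range of positive values, and fail the build if any call returns 0.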
7 Contents 1.Preface/Introduction 2.Standardization and Implementation 3.File I/O 4.Standard I/O Library 5.Files and Directories 6.System Data Files and Information 7.Environment of a Unix Process 8.Process Control 9.Signals 10.Inter-process Communication
8 File I/O Objective of this chapter: –Functions available for file I/O –Atomic operations in multiprogramming environments Unbuffered I/O –Popular functions: open, close, read, write, lseek, dup, fcntl, ioctl –Each read() and write() invokes a system call!
9 File I/O File –A sequence of bytes Directory –A file that includes info on how to find other files.
10 File I/O
Path name
–Absolute path name: starts at the root / of the file system, e.g., /user/john/fileA
–Relative path name: starts at the “current directory”, which is an attribute of the process accessing the path name, e.g., ../dirA/fileB
Links (wait till later chapters)
11 File I/O
File Descriptor
–Non-negative integer returned by open() or creat(): 0 .. OPEN_MAX-1
  Virtually unbounded for SVR4 & 4.3+BSD
–Allocated on a per-process basis
–POSIX.1 – 0: STDIN_FILENO, 1: STDOUT_FILENO, 2: STDERR_FILENO
  Convention employed by the Unix shells and applications
12 File I/O - File Manipulation
Operations
–open, close, read, write, lseek, dup, fcntl, ioctl, trunc, rename, chmod, chown, mkdir, cd, opendir, readdir, closedir, etc.
[Figure: file descriptor, e.g., Read(4, …) → Tables of Opened Files (per process) → System Open File Table → In-core v-node list / i-node → (sync) → data block]
13 File I/O – open()
#include <fcntl.h>
int open(const char *pathname, int oflag, ... /* mode_t mode */);
File/Path Name
–PATH_MAX, NAME_MAX
–_POSIX_NO_TRUNC -> open() fails with ENAMETOOLONG when a name is longer than NAME_MAX or PATH_MAX, instead of the name being silently truncated.
oflag
–Access mode: O_RDONLY, O_WRONLY, O_RDWR
–O_APPEND, O_TRUNC, (O_NOCTTY)
–O_CREAT, O_EXCL -> (atomicity)
–O_NONBLOCK
–O_DSYNC (write data only), O_RSYNC (read waits for write completion), O_SYNC (write data & attr)
14 File I/O – creat() and close()
#include <fcntl.h>
int creat(const char *pathname, mode_t mode);
–Equivalent to open(pathname, O_WRONLY | O_CREAT | O_TRUNC, mode)
–The file is opened for write access only.
#include <unistd.h>
int close(int filedes);
–All open files are automatically closed by the kernel when a process terminates.
15 Why is atomicity needed?
fd = open("foo", O_WRONLY | O_CREAT | O_EXCL, mode);
cannot safely be rewritten as …
if ((fd = open("foo", O_WRONLY)) < 0) {
    /* Multiprogramming can do something bad here: another process may create
       "foo" between this open() and the creat() below. */
    if (errno == ENOENT) {
        if ((fd = creat("foo", mode)) < 0) {
            err_sys("create error");
        }
….
One complex operation with multiple function calls -> either all steps are executed or none.
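A minimal sketch of the atomic form: O_CREAT together with O_EXCL makes the kernel perform the existence test and the creation as one step, so exactly one caller wins. The helper name create_exclusive() is made up for this illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create `path` only if it does not exist yet.
 * Returns a write-only descriptor on success.
 * Returns -1 with errno == EEXIST if some process created it first.
 */
int create_exclusive(const char *path, mode_t mode)
{
    /* The test for existence and the creation happen in one system call. */
    return open(path, O_WRONLY | O_CREAT | O_EXCL, mode);
}
```

This is the usual way to build a simple lock file: the process whose call succeeds holds the lock, and every other process sees EEXIST.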
16 File I/O - lseek
#include <unistd.h>
off_t lseek(int filedes, off_t offset, int whence);
–Changes the current file offset of an open file
–whence: SEEK_SET (from the beginning), SEEK_CUR (from the current offset), SEEK_END (from the end of the file)
Example: fd = open(); lseek(fd, 100, SEEK_SET);
(Say you lose track) how do you find the current offset? currpos = lseek(fd, 0, SEEK_CUR);
off_t: typedef long off_t; /* 2^31 bytes */
–or typedef longlong_t off_t; /* 2^63 bytes */
No I/O takes place until the next read or write.
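The "find the current offset" idiom extends naturally to finding a file's size. The sketch below assumes a seekable descriptor; the helper names current_offset(), file_size() and size_of_path() are invented for this example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Current offset of fd, or (off_t)-1 on error (e.g., fd is a pipe). */
off_t current_offset(int fd)
{
    return lseek(fd, 0, SEEK_CUR);
}

/* Size of the file behind fd; the caller's offset is restored afterwards. */
off_t file_size(int fd)
{
    off_t saved = lseek(fd, 0, SEEK_CUR);
    if (saved == (off_t)-1)
        return (off_t)-1;

    off_t end = lseek(fd, 0, SEEK_END);       /* offset of end == size */
    if (end == (off_t)-1)
        return (off_t)-1;

    if (lseek(fd, saved, SEEK_SET) == (off_t)-1)
        return (off_t)-1;
    return end;
}

/* Convenience wrapper: size of the file named by path, or -1 on error. */
off_t size_of_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return (off_t)-1;
    off_t size = file_size(fd);
    close(fd);
    return size;
}
```

None of these calls performs any I/O on the file data; they only read or change the offset recorded for the open file.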
17 File I/O - lseek
Program 3.1 – Page 52
–Tests whether “standard input” is capable of seeking
–A FIFO or pipe cannot be seeked (lseek fails with ESPIPE):
  cat < /etc/motd | seek
  seek < /var/spool/cron/FIFO
Program 3.2 – Page 53, hole creating!
–od -c file.hole a b c \0 \0 \n
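Both programs can be approximated in a few lines. This is a sketch written for these notes, not the book's Program 3.1 or 3.2; the names is_seekable() and write_with_hole(), and the strings "abc" and "xyz", are assumptions.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if fd can seek, 0 if it is a pipe/FIFO/socket (ESPIPE),
   and -1 for any other error. */
int is_seekable(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return (errno == ESPIPE) ? 0 : -1;
}

/* Writes "abc", skips `gap` bytes without writing them, then writes "xyz".
   The skipped range is a hole that reads back as zero bytes.
   Returns the final file size, or -1 on error. */
off_t write_with_hole(const char *path, off_t gap)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return (off_t)-1;

    off_t end = (off_t)-1;
    if (write(fd, "abc", 3) == 3 &&
        lseek(fd, gap, SEEK_CUR) != (off_t)-1 &&   /* seek past end of file */
        write(fd, "xyz", 3) == 3)
        end = lseek(fd, 0, SEEK_CUR);

    close(fd);
    return end;
}
```

Running od -c on the resulting file shows the bytes in the hole as \0.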
18 File I/O – read and write
#include <unistd.h>
ssize_t read(int filedes, void *buf, size_t nbytes);
–Fewer than nbytes bytes may be read: EOF, terminal device (line input), network buffering, record-oriented devices (e.g., tape)
–The offset is increased by the number of bytes read on every read()
–nbytes should not exceed SSIZE_MAX
#include <unistd.h>
ssize_t write(int filedes, const void *buf, size_t nbytes);
–Write errors are usually caused by a full disk or by exceeding the file size limit.
–When O_APPEND is set, the file offset is set to the end of the file before each write operation.
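Because read() and write() may transfer fewer bytes than requested, callers that need a full transfer usually loop. The following is one common sketch; the names read_full() and write_all() are hypothetical, and retrying on EINTR is an addition that the slide does not discuss.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Keeps calling write() until all nbytes are written.
   Returns nbytes on success, -1 on error. */
ssize_t write_all(int fd, const void *buf, size_t nbytes)
{
    const char *p = buf;
    size_t left = nbytes;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted by a signal: retry */
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }
    return (ssize_t)nbytes;
}

/* Keeps calling read() until nbytes are read or EOF is reached.
   Returns the number of bytes read (less than nbytes only at EOF), -1 on error. */
ssize_t read_full(int fd, void *buf, size_t nbytes)
{
    char *p = buf;
    size_t got = 0;

    while (got < nbytes) {
        ssize_t n = read(fd, p + got, nbytes - got);
        if (n == 0)
            break;                    /* EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```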
19 File I/O - Efficiency
Program 3.3 – Page 56
–No need to open/close standard input/output
–Copies stdin to stdout (> /dev/null)
–Try I/O redirection when reading a 1.4M file

Buffer size   Usr CPU   Sys CPU   Clock     #loops
1             23.8s     397.9s    423.4s    …
…             …         6.6s      7.0s      …
…             …         1.0s      1.1s      …
…             …         0.6s      0.6s      …
…             …         0.3s      0.3s      …
…             …         0.3s      0.3s      12
20 File I/O – Sharing
–Table per process: filedes flags (close-on-exec), a pointer to a system open file table entry
–System open file table: file status flags (open flags), current offset, a v-node pointer
–V-node (V = virtual)
  i-node: owner, file size, residing device, block pointers, ...
[Figure: Read(4, …) → Tables of Opened Files (per process) → System Open File Table → In-core V-node list / i-node → (sync) → data block]
21 File I/O – Sharing
Each “independently opened file” has its own offset.
Examples
–write(): the offset is incremented by the number of bytes written
–O_APPEND: offset = current file size before each write
–lseek() causes no I/O (it only updates the system open file table)
–fork() & dup() cause the sharing of entries in the (system open) file table.
–filedes flags versus file status flags
[Figure: filedes flags in the Tables of Opened Files (per process); file status flags and offset in the System Open File Table; file size in the In-core i-node list]
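The difference between a dup()ed descriptor and an independently opened one can be observed directly through the offset. A small sketch, with helper names (shares_offset, compare_dup_and_reopen) invented for illustration:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if descriptors a and b share one file table entry (and thus one
   offset), 0 if they do not, -1 on error. The offset of a is restored. */
int shares_offset(int a, int b)
{
    off_t saved = lseek(a, 0, SEEK_CUR);
    if (saved == (off_t)-1)
        return -1;

    int shared = 1;
    /* Move a to two different positions; b must follow both times. */
    for (off_t step = 1; step <= 2; step++) {
        if (lseek(a, saved + step, SEEK_SET) == (off_t)-1)
            return -1;
        if (lseek(b, 0, SEEK_CUR) != saved + step)
            shared = 0;
    }
    lseek(a, saved, SEEK_SET);
    return shared;
}

/* Opens path, duplicates the descriptor, opens path a second time, and
   reports which of the two extra descriptors shares the first one's offset. */
int compare_dup_and_reopen(const char *path, int *dup_shares, int *reopen_shares)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    int copy = dup(fd);                 /* same file table entry */
    int again = open(path, O_RDONLY);   /* new file table entry  */
    int ok = (copy >= 0 && again >= 0);

    if (ok) {
        *dup_shares = shares_offset(fd, copy);
        *reopen_shares = shares_offset(fd, again);
    }
    if (copy >= 0)
        close(copy);
    if (again >= 0)
        close(again);
    close(fd);
    return ok ? 0 : -1;
}
```

On an existing file, the expected outcome is that the dup()ed descriptor follows every seek while the reopened one keeps its own offset.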
22 Why Are Atomic Operations Needed?
Atomic Operation
–What if an operation is composed of multiple steps (function calls)?
Say you want to append to the end of a file. What can go wrong with multiple processes?
if (lseek(fd, 0L, SEEK_END) < 0) err_sys("lseek err");
/* another process may append here, so the offset is no longer the end of the file */
if (write(fd, buf, 10) != 10) err_sys("wr err");
Problem solved with O_APPEND:
fd = open(pathname, O_WRONLY | O_APPEND, mode);
write(fd, buf, 10);
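The effect of O_APPEND is easy to check: even after an explicit seek to the beginning, the next write still lands at the end of the file. A minimal sketch, assuming the hypothetical helper names open_for_append() and append_record() and adding O_CREAT so that the file is created when missing:

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens (creating if necessary) a file whose writes always go to the end. */
int open_for_append(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}

/* Appends one record. The kernel moves the offset to the end of the file and
   writes the data as one atomic step, so no lseek() is needed.
   Returns 0 on success, -1 on error or short write. */
int append_record(int fd, const void *rec, size_t len)
{
    return (write(fd, rec, len) == (ssize_t)len) ? 0 : -1;
}
```

Several processes can call append_record() on the same log file without overwriting each other's records.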
23 File I/O – pread and pwrite
#include <unistd.h>
ssize_t pread(int filedes, void *buf, size_t nbytes, off_t offset);
ssize_t pwrite(int filedes, const void *buf, size_t nbytes, off_t offset);
–Read from or write to filedes at a “specified” offset.
Why create them? (= lseek + read/write?)
–They do not update the file offset.
–The seek and the I/O are performed as one atomic operation.
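A typical use is fixed-size record access, where each record is addressed by its index and the shared file offset is never touched. The sketch below is illustrative; open_record_file(), read_record() and write_record() are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L   /* makes pread()/pwrite() visible in strict modes */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens (creating if necessary) a file of fixed-size records. */
int open_record_file(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0644);
}

/* Reads record number `index`; the descriptor's offset is left unchanged. */
ssize_t read_record(int fd, void *buf, size_t recsize, off_t index)
{
    return pread(fd, buf, recsize, index * (off_t)recsize);
}

/* Writes record number `index`; the descriptor's offset is left unchanged. */
ssize_t write_record(int fd, const void *buf, size_t recsize, off_t index)
{
    return pwrite(fd, buf, recsize, index * (off_t)recsize);
}
```

Because no separate lseek() is involved, two threads or processes sharing the descriptor cannot move the offset between each other's seek and I/O.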
24 File I/O – dup and dup2
#include <unistd.h>
int dup(int filedes);
int dup2(int filedes, int newfiledes);
–dup(): duplicates filedes and returns the lowest available descriptor.
–dup2(): duplicates filedes onto newfiledes
  What if newfiledes is already open? It is closed first.
  Like close(newfiledes); fcntl(filedes, F_DUPFD, newfiledes); except that dup2() is atomic
[Figure: Tables of Opened Files (per process) → System Open File Table → In-core i-node list]
25 File I/O: sync(), fsync(), fdatasync()
The kernel maintains a buffer cache between apps & disks.
#include <unistd.h>
int fsync(int filedes); // data + attr sync
int fdatasync(int filedes); // data sync
void sync(void); // queues the modified buffers for writing and returns immediately
[Figure: Apps → write() → Kernel buffer cache → Disk DD queue → Disks]
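The classic use of dup2() is redirection: make an existing descriptor number refer to a different file. A minimal sketch, with the helper name redirect_fd() chosen for this example:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Makes descriptor `target` refer to the file `path` (created or truncated).
   If `target` is open, dup2() closes it and reuses the number atomically.
   Returns 0 on success, -1 on error. */
int redirect_fd(int target, const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (fd != target) {
        if (dup2(fd, target) < 0) {
            int saved = errno;        /* keep the dup2() error across close() */
            close(fd);
            errno = saved;
            return -1;
        }
        close(fd);                    /* target now holds the only reference */
    }
    return 0;
}
```

Calling redirect_fd(STDOUT_FILENO, "out.log") before exec() is roughly what a shell does for `> out.log`; the file name here is only an example.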
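When data must reach the disk before the program continues, write() alone is not enough because it only fills the kernel buffer cache. A hedged sketch of the write-then-fsync pattern, where save_durably() is a hypothetical helper and a short write is simply treated as an error:

```c
#define _POSIX_C_SOURCE 200809L   /* makes fsync() visible in strict modes */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes len bytes to path and waits until the kernel reports that the data
   and the file attributes have been written to disk.
   Returns 0 on success, -1 on error. */
int save_durably(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int rc = 0;
    if (write(fd, buf, len) != (ssize_t)len)
        rc = -1;                      /* error or short write */
    else if (fsync(fd) < 0)
        rc = -1;                      /* data may still be only in the cache */

    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

The price is latency: fsync() does not return until the disk writes complete, which is what the timing tables later in the chapter measure.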
26 File I/O - fcntl
#include <fcntl.h>
int fcntl(int filedes, int cmd, ... /* int arg */);
Changes the properties of opened files
–F_DUPFD: duplicates an existing file descriptor; the new descriptor is the lowest available one >= arg.
  FD_CLOEXEC is cleared on the new descriptor (so it stays open across exec()).
–F_GETFD, F_SETFD: file descriptor flags, e.g., FD_CLOEXEC
–F_GETFL, F_SETFL: file status flags
  O_APPEND, O_NONBLOCK, O_SYNC, O_ASYNC, O_RDONLY, O_WRONLY …
  val = fcntl(fd, F_GETFL, 0); val |= flags; fcntl(fd, F_SETFL, val);
–F_GETOWN, F_SETOWN
27 File I/O - fcntl
Figure 3.10
–Prints the file flags for a specified descriptor
Figure 3.11
–Turns on one or more flags
  val &= ~flags clears the flags instead.
  set_fl(STDOUT_FILENO, O_SYNC);
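The get-modify-set sequence can be wrapped in two small helpers. This is a sketch in the spirit of the book's set_fl(), not a copy of Figure 3.11: here the functions return an error code instead of calling err_sys(), and clr_fl() is the counterpart suggested by val &= ~flags.

```c
#include <fcntl.h>

/* Turns on the given file status flags; all other flags are preserved.
   Returns 0 on success, -1 on error. */
int set_fl(int fd, int flags)
{
    int val = fcntl(fd, F_GETFL, 0);
    if (val < 0)
        return -1;
    val |= flags;                           /* turn on flags */
    return (fcntl(fd, F_SETFL, val) < 0) ? -1 : 0;
}

/* Turns off the given file status flags; all other flags are preserved.
   Returns 0 on success, -1 on error. */
int clr_fl(int fd, int flags)
{
    int val = fcntl(fd, F_GETFL, 0);
    if (val < 0)
        return -1;
    val &= ~flags;                          /* turn flags off */
    return (fcntl(fd, F_SETFL, val) < 0) ? -1 : 0;
}
```

Fetching the current value first matters: calling F_SETFL with only the new flag would silently clear every flag that was already set.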
28 Synchronization (Linux ext2)

Operation                                               Clock time (seconds)
Normal write() to disk                                  6.87
write() to disk file with O_SYNC set                    6.83
write() to disk followed by fdatasync()                 18.28
write() to disk followed by fsync()                     17.95
write() to disk with O_SYNC set followed by fsync()     17.95

29 Synchronization (Mac OS X)

Operation                                               Clock time (seconds)
Normal write() to disk                                  14.40
write() to disk file with O_FSYNC set                   22.48
write() to disk followed by fsync()                     14.12
write() to disk with O_FSYNC set followed by fsync()    22.12
30 File I/O - ioctl
#include <unistd.h> /* System V */
#include <sys/ioctl.h> /* BSD and Linux */
int ioctl(int filedes, int request, ...);
–Catchall for I/O operations (not just disk I/O)
  E.g., setting the size of a terminal’s window.
–The prototype shown is the SVR4 one.
–More headers could be required, depending on the request: disk labels, file I/O, mag tape, socket I/O, terminal I/O
31 File I/O - /dev/fd
/dev/fd/n
–open("/dev/fd/n", mode) duplicates descriptor n (assuming that n is open)
  fd = open("/dev/fd/0", mode) is the same as fd = dup(0)
  The new mode must be a subset of that of the referenced file.
–Uniformity and cleanliness! cat /dev/fd/0
32 What did I learn today? What can you do with File I/O? Why is atomicity needed? What is the impact of sync on I/O performance?