Mgr inż. Marcin Borkowski Files and Directories UNIX.


1 mgr inż. Marcin Borkowski Files and Directories UNIX

2 mgr inż. Marcin Borkowski I/O
– Files, pipes, FIFOs and sockets
– Access by descriptor (low level) or by stream
– Streams are more portable (ISO C standard)
– File position
  ● random access on regular files
  ● positioning on a pipe or a similar object fails with the ESPIPE error
  ● is shared by the parent and the child process if the file was opened by the parent and the descriptor was inherited by the child
  ● in append mode every write goes to the end of the file, while the reading position stays independent
– Except for terminal output, we will concentrate on low-level I/O

3 mgr inż. Marcin Borkowski I/O
– Linked channels
  ● come from a single open call: fork, dup, dup2, fileno, fdopen
  ● share the position in the file
  ● share file status flags, but not descriptor flags
  ● if streams are involved, they must be cleaned up (flushed) before the channels are unlinked or before I/O is done through another linked channel
  ● streams are flushed:
    – by fflush
    – unbuffered: always
    – line buffered: at the end of each line
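The shared file position of linked channels can be observed directly. The sketch below is only an illustration: the function name shared_position_demo is hypothetical, and the scratch file comes from tmpfile. It writes three bytes through one descriptor and then asks a dup copy for the current position.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write through one descriptor, then query the file
 * position through a dup() copy of it.
 * Returns the position seen by the copy, or -1 on failure. */
off_t shared_position_demo(void)
{
    FILE *tmp = tmpfile();              /* scratch file, removed on close */
    if (tmp == NULL)
        return -1;

    int fd = fileno(tmp);
    int copy = dup(fd);                 /* linked channel */
    off_t pos = -1;

    if (copy != -1) {
        if (write(fd, "abc", 3) == 3)
            pos = lseek(copy, 0, SEEK_CUR);   /* position seen by the copy */
        close(copy);
    }
    fclose(tmp);
    return pos;
}
```

Because both descriptors refer to the same open file, the position reported through the copy moves with the write made through the original.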

4 mgr inż. Marcin Borkowski I/O
– Independent channels
  ● come from multiple open calls
  ● independent file position, file status flags and descriptor flags
  ● independent buffering on streams can disturb operations
  ● always call fflush on a stream before using a different channel on the same physical file
  ● use append mode instead of setting the file position to the end; the latter does not work if another process appends data to the same file at the same time
  ● use locking (see low-level I/O)

5 mgr inż. Marcin Borkowski Streams
– FILE * data type
– Standard streams
  ● open on process start-up
  ● stdin, stdout, stderr
– Useful functions (study man pages):
  ● fopen, fclose
  ● fseek, ftell, fflush
  ● (f)printf, (f)scanf, fgets
  ● ungetc, fread, fwrite
  ● feof, ferror, clearerr

6 mgr inż. Marcin Borkowski Streams
– POSIX defines operations on streams as atomic, i.e. thread safe (please note that atomic does not mean that the operation cannot be interrupted by a signal handling routine)
– When several stream operations have to be performed as one uninterrupted sequence, the implicit per-call locking is not enough. The following functions are provided to control locking explicitly:
  ● flockfile, ftrylockfile, funlockfile
– One thread can acquire the stream lock more than once. The same number of unlock calls must be performed by the same thread to release the stream
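A minimal sketch of explicit stream locking follows. The function name put_pair_locked and the key=value record format are assumptions made for illustration. The four output calls are each atomic on their own, and the lock makes them one unit with respect to other threads using the same stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Writes "key=value\n" as one unit: no other thread that uses the same
 * stream can interleave its output between the four calls.
 * Returns 0 on success, -1 if any output call failed. */
int put_pair_locked(FILE *out, const char *key, const char *value)
{
    int ok;

    flockfile(out);                     /* take the stream lock */
    ok = fputs(key, out) != EOF
      && fputc('=', out) != EOF
      && fputs(value, out) != EOF
      && fputc('\n', out) != EOF;
    funlockfile(out);                   /* one unlock per lock */

    return ok ? 0 : -1;
}
```

A call such as put_pair_locked(stdout, "user", "root") could be issued from several threads without the records being mixed.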

7 mgr inż. Marcin Borkowski Streams
– Buffering
  ● unbuffered: _IONBF
  ● line buffered: _IOLBF (default for streams connected to an interactive terminal)
  ● fully buffered: _IOFBF (default for other streams)
  ● stdin and stdout are line buffered when they refer to a terminal; stderr is never fully buffered (typically unbuffered)
  ● can be changed with the setvbuf function before the first operation on the stream
  ● stdin (and STDIN_FILENO) also goes through the terminal driver buffer, which can be changed with terminal-level functions
– Non-POSIX functions (do not use)
  ● fcloseall
  ● __fpurge

8 mgr inż. Marcin Borkowski Streams
● No difference between binary and text I/O in POSIX-compliant systems
● To write a program compatible with non-POSIX systems that implement binary and text files in different ways, use:
  – fgetpos and fsetpos functions
  – the binary flag "b" when opening binary streams

9 mgr inż. Marcin Borkowski Low Level I/O
– int data type (file descriptor)
– Standard descriptors
  ● open on process start-up
  ● defined in unistd.h
  ● STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO
– Useful functions (study man pages):
  ● open, close, fdopen, fileno
  ● lseek, read, write (see examples)
– A process can write at any position past the end of the file; the gap reads back as zeros
– Never use gets, it is very unsafe
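The behaviour of writes past the end of file can be checked with a few calls. In this sketch the function name read_byte_from_gap is hypothetical and the scratch file comes from tmpfile. One byte is written at a positive offset of an empty file, and the first byte, which was never written, is then read back.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes one byte at offset `gap` (gap > 0) of an empty scratch file and
 * reads the byte at offset 0, which was never written.
 * Returns that byte (expected to be 0), or -1 on failure. */
int read_byte_from_gap(off_t gap)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;

    int fd = fileno(tmp);
    int result = -1;
    unsigned char byte = 0xFF;          /* anything but zero */

    if (lseek(fd, gap, SEEK_SET) == gap     /* move past the end of file */
        && write(fd, "x", 1) == 1           /* file now has gap + 1 bytes */
        && lseek(fd, 0, SEEK_SET) == 0
        && read(fd, &byte, 1) == 1)
        result = byte;

    fclose(tmp);
    return result;
}
```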

10 mgr inż. Marcin Borkowski Low Level I/O
– mmap/munmap functions
  ● easy access
  ● may reduce available address space
  ● if mixed with normal read/write functions, require proper synchronization (function msync)
  ● protection:
    – PROT_READ: data can be read
    – PROT_WRITE: data can be written
    – PROT_EXEC: data can be executed (shared libraries)
    – PROT_NONE: data cannot be accessed
  ● some combinations of the above flags may be unimplemented; in such a case the function call fails

11 mgr inż. Marcin Borkowski Low Level I/O
– mmap function
  ● flags:
    – MAP_SHARED: changes are shared
    – MAP_PRIVATE: changes are private
    – MAP_FIXED: interpret addr exactly
  ● use of MAP_FIXED is discouraged by the POSIX manual, it may be unsupported
  ● the function can be used on random access files, not on sockets or pipes
  ● mappings are made in whole pages: the file offset (and addr with MAP_FIXED) must be a multiple of the page size; to check the memory page size call:
    size_t page_size = (size_t) sysconf(_SC_PAGESIZE);
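The pieces from the two mmap slides fit together roughly as follows. This is a sketch under assumptions: the function name mmap_roundtrip_demo is hypothetical, the scratch file comes from tmpfile, and one page is enough for the demonstration. A byte is changed through a MAP_SHARED mapping, synchronized with msync, and read back with read.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Extends a scratch file to one page, maps it MAP_SHARED, changes the
 * first byte through the mapping and reads it back with read().
 * Returns the byte read, or -1 on failure. */
int mmap_roundtrip_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return -1;

    int fd = fileno(tmp);
    int result = -1;

    if (page > 0 && ftruncate(fd, (off_t)page) == 0) {
        char *map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        if (map != MAP_FAILED) {
            map[0] = 'z';                               /* write via memory */
            int synced = msync(map, (size_t)page, MS_SYNC);
            munmap(map, (size_t)page);

            char byte;
            if (synced == 0 && lseek(fd, 0, SEEK_SET) == 0
                && read(fd, &byte, 1) == 1)             /* read via descriptor */
                result = byte;
        }
    }
    fclose(tmp);
    return result;
}
```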

12 mgr inż. Marcin Borkowski Low Level I/O
– The system can buffer all disk operations in memory. To make data persistent, the system buffers must be synchronized with the disks:
  ● function sync schedules the synchronization but does not wait for it to complete
  ● function fsync schedules the synchronization of one file and waits until it completes
    – can report I/O errors !!!
    – can be interrupted (EINTR)
    – if a piece of data is modified and can be accessed by another process, it must be synchronized before the file lock is released
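One possible way to combine write and fsync is sketched below. The function name append_durably is hypothetical, and the choice to retry both calls on EINTR is an assumption of this sketch. The point is that the fsync result must be checked, because this is where I/O errors may surface.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Appends `count` bytes to `path` and returns only after fsync has
 * confirmed the transfer. Returns 0 on success, -1 on failure. */
int append_durably(const char *path, const char *buf, size_t count)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd == -1)
        return -1;

    while (count > 0) {
        ssize_t n = write(fd, buf, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted before any transfer */
            close(fd);
            return -1;
        }
        buf += n;                       /* partial write: continue with the rest */
        count -= (size_t)n;
    }

    while (fsync(fd) == -1) {
        if (errno != EINTR) {           /* e.g. EIO: data may not be on disk */
            close(fd);
            return -1;
        }
    }
    return close(fd);
}
```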

13 mgr inż. Marcin Borkowski Low Level I/O
– Descriptor duplication
  ● creates a linked channel
  ● is mostly used for redirection of descriptors
– dup function makes a copy using the lowest available descriptor
  ● the same as fcntl(old, F_DUPFD, 0);
– dup2 function closes the target descriptor (new), if it is open, before making the copy onto it
  ● similar to close(new); fcntl(old, F_DUPFD, new);
  ● but performed as one atomic operation; it can be interrupted by a signal

14 mgr inż. Marcin Borkowski Low Level I/O
– File status flags
  ● access mode
  ● open-time flags
  ● operating modes
– Set at the open call or changed later (except for open-time flags) by the fcntl function
– Always read the flags first, then modify and update them; new flags can appear over time:
    new_flags = fcntl(filedes, F_GETFL) | O_APPEND;
    fcntl(filedes, F_SETFL, new_flags);
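The read-modify-update rule could be wrapped in a small helper with error checking. The name add_status_flag is hypothetical; only fcntl with F_GETFL and F_SETFL is taken from the slide.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>

/* Adds `flag` (for example O_APPEND or O_NONBLOCK) to the file status
 * flags of `fd` without disturbing the flags that are already set.
 * Returns 0 on success, -1 on failure. */
int add_status_flag(int fd, int flag)
{
    int flags = fcntl(fd, F_GETFL);     /* read the current flags first */
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFL, flags | flag) == -1)
        return -1;
    return 0;
}
```

For instance, add_status_flag(fd, O_NONBLOCK) would switch an already opened descriptor to non-blocking mode.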

15 mgr inż. Marcin Borkowski Low Level I/O
– access mode
  ● O_RDONLY, O_WRONLY, O_RDWR
– open-time flags
  ● O_CREAT
  ● O_EXCL: only with O_CREAT
  ● O_TRUNC: kept for backward compatibility, prefer the ftruncate function
– operating mode flags
  ● O_APPEND: the only reliable way to add data at the end of a file that other processes may also extend
  ● O_DSYNC, O_SYNC, O_RSYNC: synchronized I/O, an operation completes only when the data has been transferred to the storage device
  ● O_NONBLOCK: do not wait for data to become available

16 mgr inż. Marcin Borkowski Low Level I/O
– File locks
  ● work on files only
  ● advisory (voluntary) and mandatory locking
  ● mandatory locking
    – is not defined in POSIX !
    – requires a mount option and file permissions g+s, g-x
  ● exclusive write lock
  ● shared read lock, many processes can acquire it at the same time
  ● read/write do not lock by default
  ● bytes after the end of file can be locked

17 mgr inż. Marcin Borkowski Low Level I/O
● locks are associated with processes, each byte of a file can be locked separately
● a child process does not inherit locks
● all locks on a file are released on close, even if the process has other descriptors open on the same file !!!
● exit releases all locks
● a process can unlock only a part of a locked range, which splits the lock in two, or unlock a range that covers two locks at the same time
● function fcntl sets locks (among other useful things)
  – F_GETLK, F_SETLK
  – F_SETLKW (waits for the lock and can be interrupted)
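A non-waiting variant of record locking might look like the sketch below. The function name lock_range_nowait is hypothetical. It uses F_SETLK, so the call fails at once instead of blocking when another process holds a conflicting lock, and the same helper releases a range when called with F_UNLCK.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Applies `type` (F_RDLCK, F_WRLCK or F_UNLCK) to `len` bytes starting
 * at offset `start`, without waiting.
 * Returns 0 on success, -1 on failure; errno is EACCES or EAGAIN when
 * another process holds a conflicting lock. */
int lock_range_nowait(int fd, short type, off_t start, off_t len)
{
    struct flock l = {0};

    l.l_type = type;
    l.l_whence = SEEK_SET;      /* l_start is counted from the file start */
    l.l_start = start;
    l.l_len = len;              /* 0 would mean "up to the end of file" */

    return fcntl(fd, F_SETLK, &l) == -1 ? -1 : 0;
}
```

Unlocking the middle of a locked range with this helper leaves two separate locks, as described above.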

18 mgr inż. Marcin Borkowski Directories
– File names, directory tree (FHS)
– Directory name separator (/)
– Multiple successive separators (//) are treated as one (/) on GNU systems
– File name resolution
  ● absolute file names
  ● relative file names
  ● PATH and executable files
  ● local execution (./binary)
– Special directories (.) and (..)

19 mgr inż. Marcin Borkowski Directories
– Current Working Directory: getcwd
– Never use getwd, it is dangerous
– CWD can be changed
  ● for the process, not for the parent shell
  ● chdir function
– Directories are special files on most FS types
– Reading of directories is organized in a stream-like fashion:
  ● struct dirent: the only field guaranteed everywhere is d_name (current POSIX also specifies d_ino) !!!
  ● the rest of the attributes can be read with stat or lstat

20 mgr inż. Marcin Borkowski Directories
– opendir opens a directory stream
  ● the internal implementation can use a regular file descriptor
  ● the process can run into the limit of open files
  ● an opened DIR* must not be shared by the parent and child process
– readdir reads the next entry from the directory
  ● the returned dirent structure may be statically allocated and may be overwritten by the next readdir call on the same stream !!!
  ● error and end of directory are both reported by a NULL return !!!
  ● the calling code must reset errno to zero before the call to tell an error from the end of the directory (see examples)
  ● is not guaranteed to be thread safe; readdir_r was the traditional replacement for multi-threaded programs, but it is now deprecated in glibc
  ● files added or removed during the scan may or may not be reported

21 mgr inż. Marcin Borkowski Directories
– closedir closes the directory stream
– rewinddir resets the stream to the beginning of the directory and picks up the new directory state (added and removed files)
– telldir returns the current position in the directory
  ● the position is invalidated by rewinding or reopening the directory
– seekdir changes the position in the directory stream; the position argument must be the result of a telldir call

22 mgr inż. Marcin Borkowski Directories
– Function scandir seems useful, but together with alphasort it was not part of the POSIX standard before POSIX.1-2008 !!!
– For walking a whole sub-tree use nftw (ftw is the older interface, marked obsolescent in POSIX.1-2008)
  ● callback function is called for every entry in the sub-tree
  ● limit on the number of descriptors used
  ● callback function cannot change the CWD
  ● nftw can be controlled by flags
    – FTW_CHDIR: change the CWD to the object's directory before the callback function is executed; the CWD is restored automatically afterwards
    – FTW_DEPTH: post-order walk, the contents of a directory are reported before the directory itself (the default is pre-order; neither is breadth-first)
    – FTW_PHYS: do not follow symlinks
  ● both ftw and nftw are inherently not thread-safe

23 mgr inż. Marcin Borkowski Directories
– Canonical file names
  ● no symbolic links
  ● no (.) and (..) in the path
  ● no repeated path separators (//)
  ● function canonicalize_file_name
    – is a GNU extension (non-POSIX)
    – allocates memory that needs to be released with free
  ● function realpath
    – POSIX compliant
    – with a caller-supplied buffer there is no size control, which is dangerous (the buffer must hold at least PATH_MAX bytes)
    – called with a NULL buffer it allocates memory like canonicalize_file_name; this is standard since POSIX.1-2008, earlier editions said "implementation-defined"
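Canonical names make it possible to compare two paths that are spelled differently. The sketch below assumes a POSIX.1-2008 system, where realpath called with a NULL buffer allocates the result; the function name same_canonical_name is hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Returns 1 if both paths resolve to the same canonical name, 0 if the
 * names differ, and -1 if either path cannot be resolved. */
int same_canonical_name(const char *a, const char *b)
{
    char *ca = realpath(a, NULL);       /* NULL buffer: realpath allocates */
    char *cb = realpath(b, NULL);
    int result = -1;

    if (ca != NULL && cb != NULL)
        result = strcmp(ca, cb) == 0;

    free(ca);                           /* free(NULL) is harmless */
    free(cb);
    return result;
}
```

Note that this compares names only; hard links to the same file would still be reported as different.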

24 mgr inż. Marcin Borkowski Directories
– Temporary files
  ● function tmpfile creates a temporary file (stream)
    – no control over the location or the name of this file
    – the file is deleted on close or exit
    – if the program is terminated abnormally (e.g. by SIGKILL) the file may be left behind
    – thread safe
  ● functions tmpnam, tmpnam_r and tempnam generate a name for the temporary file
    – the process must open and close the temporary file by itself
    – the name may already be in use (use O_EXCL | O_CREAT to open)
    – tmpnam is not thread safe
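The O_EXCL | O_CREAT advice can be sketched without any name generator at all. In this illustration the function name open_unique, the numeric suffix scheme and the limit of 100 attempts are all assumptions. The essential part is that the open call itself decides whether a name is free, so there is no window between checking and creating.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Tries "<prefix>.0", "<prefix>.1", ... until one name can be created
 * exclusively. Stores the winning name in `name` (of `size` bytes) and
 * returns the open descriptor, or -1 on failure. */
int open_unique(const char *prefix, char *name, size_t size)
{
    for (int i = 0; i < 100; i++) {
        int n = snprintf(name, size, "%s.%d", prefix, i);
        if (n < 0 || (size_t)n >= size) {
            errno = ENAMETOOLONG;
            return -1;
        }
        int fd = open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd != -1)
            return fd;                  /* the file did not exist before */
        if (errno != EEXIST)
            return -1;                  /* real error, not a name collision */
    }
    return -1;                          /* all candidate names were taken */
}
```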

25 mgr inż. Marcin Borkowski Directories
– File attributes
  ● all but the file name are stored in the file's inode
  ● members of the stat structure that POSIX guarantees include:
    – st_mode: permissions and type
    – st_ino
    – st_dev
    – st_uid, st_gid
    – st_size
    – st_atime, st_ctime and st_mtime
    – st_nlink
  ● test st_mode for the object type:
    – S_ISDIR(mode_t m), S_ISREG(mode_t m)
    – S_ISLNK(mode_t m)

26 mgr inż. Marcin Borkowski Directories
– File attributes
  ● functions stat and fstat read the attributes of the given file
    – when a symbolic link is tested, stat resolves the file name using the link information, i.e. the linked object is tested
    – therefore the macro S_ISLNK(mode_t m) can never return a non-zero value for a status obtained from stat
  ● function lstat does not follow symbolic links
    – it makes it possible to read the attributes of the link itself
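In code, the difference comes down to which function fills the stat structure. A minimal sketch, with the hypothetical helper name is_symlink:

```c
#define _XOPEN_SOURCE 700
#include <sys/stat.h>

/* Returns 1 if `path` itself is a symbolic link, 0 if it is any other
 * kind of object, and -1 if it cannot be examined.
 * lstat is required here: stat would follow the link and describe its
 * target, so S_ISLNK would never be true. */
int is_symlink(const char *path)
{
    struct stat st;

    if (lstat(path, &st) == -1)
        return -1;
    return S_ISLNK(st.st_mode) ? 1 : 0;
}
```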

27 mgr inż. Marcin Borkowski Directories
– Process umask
  ● every time a file or directory is created, the supplied permission mode is modified by the process umask
  ● bits set in the umask are removed from the permissions
  ● initialized by the system for the user session
  ● controlled from the command line (umask command)
  ● do not change it without a good reason; it is supposed to limit the creation permissions, let it do the job
  ● if necessary, use the umask function for reading and setting the mask
  ● the getumask function is a GNU extension (not portable, do not use)
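Since umask can only report the old mask while installing a new one, reading it portably takes two calls. The helper names current_umask and effective_mode below are hypothetical, and the sketch assumes that no other thread creates files between the two umask calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <sys/types.h>

/* Reads the process umask by setting it to 0 and restoring it at once.
 * Not safe if another thread creates files at the same moment. */
mode_t current_umask(void)
{
    mode_t old = umask(0);
    umask(old);
    return old;
}

/* Permissions a new file would actually get for a requested mode:
 * the bits set in the umask are removed. */
mode_t effective_mode(mode_t requested)
{
    return requested & ~current_umask();
}
```

With the common mask 022, a file created with mode 0666 ends up with 0644.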

28 mgr inż. Marcin Borkowski Directories
– Other useful functions that need no additional comments, refer to the man pages
  ● link, unlink
  ● rmdir, remove
  ● symlink, readlink
  ● rename
  ● mkdir
  ● chmod, fchmod

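29 mgr inż. Marcin Borkowski Files and Directories - Examples
● How to read:

    /* TEMP_FAILURE_RETRY is a GNU macro: define _GNU_SOURCE before
     * including <unistd.h>. It repeats the call while it fails with EINTR. */
    ssize_t bulk_read(int fd, char *buf, size_t count)
    {
        ssize_t c;                  /* read returns ssize_t, not int */
        size_t len = 0;
        do {
            c = TEMP_FAILURE_RETRY(read(fd, buf, count));
            if (c < 0)
                return c;           /* error */
            if (0 == c)
                return len;         /* end of file */
            buf += c;
            len += c;
            count -= c;
        } while (count > 0);
        return len;
    }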

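30 mgr inż. Marcin Borkowski Files and Directories - Examples
● How to write:

    /* Repeats write until the whole buffer is transferred, because a
     * single call may write fewer bytes than requested. */
    ssize_t bulk_write(int fd, char *buf, size_t count)
    {
        ssize_t c;                  /* write returns ssize_t, not int */
        size_t len = 0;
        do {
            c = TEMP_FAILURE_RETRY(write(fd, buf, count));
            if (c < 0)
                return c;           /* error */
            buf += c;
            len += c;
            count -= c;
        } while (count > 0);
        return len;
    }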

31 mgr inż. Marcin Borkowski Files and Directories - Examples
● How to redirect standard output to a file:

    int pfd;    /* descriptor of the already opened target file */
    ...
    if (-1 == TEMP_FAILURE_RETRY(close(STDOUT_FILENO)))
        ERR();
    /* dup picks the lowest free descriptor, here STDOUT_FILENO
     * (assuming descriptor 0 is still open) */
    if (-1 == dup(pfd))
        ERR();
    if (-1 == TEMP_FAILURE_RETRY(close(pfd)))
        ERR();
    ...

● How to redirect stderr to stdout:

    if (-1 == TEMP_FAILURE_RETRY(dup2(STDOUT_FILENO, STDERR_FILENO)))
        ERR();

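32 mgr inż. Marcin Borkowski Files and Directories - Examples
● How to set a write lock:

    /* ERR is assumed to be an error-reporting macro defined elsewhere */
    void SetWrLock(int fd, off_t start, off_t len)
    {
        struct flock l;
        l.l_whence = SEEK_SET;      /* start is counted from the file start */
        l.l_start = start;
        l.l_len = len;
        l.l_type = F_WRLCK;
        /* F_SETLKW waits for the lock; retry if a signal interrupts it */
        if (-1 == TEMP_FAILURE_RETRY(fcntl(fd, F_SETLKW, &l)))
            ERR("fcntl");
    }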

33 mgr inż. Marcin Borkowski Files and Directories - Examples
● How to read the CWD entry by entry:

    DIR *dirp;
    struct dirent *dp;

    if (NULL == (dirp = opendir(".")))
        ERR();
    do {
        errno = 0;      /* lets us tell an error from the end of the directory */
        if ((dp = readdir(dirp)) != NULL)
            printf("found %s\n", dp->d_name);
    } while (dp != NULL);
    if (errno != 0)
        ERR();
    TEMP_FAILURE_RETRY(closedir(dirp));

34 mgr inż. Marcin Borkowski Files and Directories - Examples
● How to analyse the object named by a dirent entry:

    /* lstat, not stat, so that symbolic links are counted as links */
    if (lstat(filenamepath, &filestat) == -1)
        error("Cannot read file");
    if (S_ISDIR(filestat.st_mode))
        dirs++;
    else if (S_ISREG(filestat.st_mode))
        files++;
    else if (S_ISLNK(filestat.st_mode))
        links++;
    else
        other++;

