7.1 Advanced Operating Systems Versioning File Systems
Someone has typed: rm -r *
However, they were in the wrong directory. What can be done?
Typical UNIX and Windows systems have some tools for restoring deleted files, provided the files' blocks have not yet been reclaimed.
Is this immediate release of storage by UNIX and Windows essential?
7.2 The File System's Problem
The key problem with the current approach is that user actions have an immediate and irrevocable effect on disk storage.
–Users are not protected against their own mistakes.
This goes against the file system's objective of protecting data against failure.
We can do better today.
7.3 Disk Capacity
In 1995:
–For $200 you could get a 0.54 GB disk.
–Slackware Linux 2.2 (basic applications + X Window) took 0.15 GB, which is 28% of the disk.
In 2000:
–For $200 you could get a 40 GB disk.
–Red Hat Linux 7 (basic applications + X Window) took 1 GB, which is 2.5% of the disk.
7.4 Disk Capacity (Cont.)
In 2004:
–For $200 you could get a 300 GB disk.
–Red Hat Linux Advanced Workstation 2.1 (basic applications + X Window) for the Itanium processor took 4.2 GB, which is 1.4% of the disk.
In 2011:
–For $200 you could get a 2 TB disk.
–Red Hat Enterprise Linux 5 (basic applications + X Window) took 8.8 GB, which is 0.4% of the disk.
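The percentages on the two Disk Capacity slides can be checked with a few lines of Python. This is only a sketch: the name `share_percent` is made up for illustration, and 2 TB is taken as 2000 GB (decimal units), which is an assumption.

```python
# (year, disk size in GB for $200, installation size in GB), from the slides.
# Assumption: 2 TB is counted as 2000 GB.
DATA = [
    (1995, 0.54, 0.15),
    (2000, 40.0, 1.0),
    (2004, 300.0, 4.2),
    (2011, 2000.0, 8.8),
]

def share_percent(install_gb, disk_gb):
    """Percentage of the disk occupied by the installation."""
    return 100.0 * install_gb / disk_gb

for year, disk_gb, install_gb in DATA:
    print(f"{year}: {share_percent(install_gb, disk_gb):.1f}% of the disk")
```

The share taken by the operating system falls by roughly two orders of magnitude over the period, which is the argument for spending the spare capacity on old versions.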
7.5 Old Solutions
UNIX has RCS and CVS for maintaining versions of files.
–The main disadvantage of these tools is that they must be operated manually.
In 1985 the Cedar file system was proposed.
–Cedar automatically retains the last few versions of a file in a copy-on-write fashion.
–The number of versions is limited; hence, when a new write is done, the oldest version is deleted. The user can explicitly delete a version, in which case the oldest version is not the victim.
VMS retains file versions in a similar way.
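The "last few versions" behaviour described for Cedar can be sketched as follows. This is an illustrative model, not Cedar's actual implementation: the class `BoundedVersions`, its methods, and the default limit of three versions are all assumptions.

```python
class BoundedVersions:
    """Keeps at most `keep` versions of one file (hypothetical model)."""

    def __init__(self, keep=3):
        self.keep = keep
        self.versions = []   # (version number, data), oldest first
        self.next_no = 1

    def write(self, data):
        """Every write creates a new version; evict the oldest if over the limit."""
        self.versions.append((self.next_no, data))
        self.next_no += 1
        while len(self.versions) > self.keep:
            self.versions.pop(0)

    def delete(self, version_no):
        """Explicit deletion frees a slot, so the next write evicts nothing."""
        self.versions = [v for v in self.versions if v[0] != version_no]

    def numbers(self):
        return [no for no, _ in self.versions]


f = BoundedVersions(keep=3)
for text in ("a", "b", "c"):
    f.write(text)
f.delete(2)        # the user picks the victim
f.write("d")       # version 1 survives because a slot was free
print(f.numbers())
```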
7.6 Snapshots
Many systems are regularly backed up within the disk itself.
The backup is usually incremental.
Changes made between snapshots cannot be undone.
–Many users therefore keep multiple versions of their critical data themselves.
All files are treated equally.
7.7 Not All Files Are Created Equal
Read-only files (like application executables) have no version history.
Derived files (like object files) can be easily reconstituted.
Cached files require no version history.
Temporary files might benefit from a short-term history but not from a long-term history.
User-modified files would benefit most from both a long-term and a short-term history.
7.8 The Elephant File System
Elephant (1999) maintains multiple versions of user files, but not all versions of all files.
–It needs a retention policy.
Elephant involves the user in the retention/reclamation decisions. This means:
–Less protection from user mistakes.
–A retention policy that might be better suited to the users' needs.
Elephant keeps a complete history of a file over a short period of time (one hour to one week), but keeps landmark versions of each file forever.
7.9 Elephant's Main Concepts
Storage reclamation is separated from file write and delete.
Files have a variety of retention policies.
Policies are specified by the user, but implemented by the system.
Undo requires a complete history for a limited period of time, but long-term histories should not retain all versions.
The file system assists the user in deciding which versions to retain in the long-term history.
7.10 Landmark Versions
Elephant detects landmark versions by looking at the timeline of updates to the file.
–It can identify groups of updates separated by long periods of stability.
–The last version of each group of updates is assumed to be a landmark version.
The user's ability to recognize landmark versions of a file degrades with time.
–Thus, landmark versions are identified automatically by Elephant.
–Even so, the user can manually mark any version as a landmark version.
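The grouping heuristic on this slide can be sketched in a few lines. This is a simplified reading of the slide, not Elephant's actual algorithm: the function name, the use of plain numeric timestamps, and the fixed `stability_gap` threshold are assumptions.

```python
def landmark_versions(timestamps, stability_gap):
    """Return the version timestamps assumed to be landmarks.

    A version is a landmark if it is followed by at least `stability_gap`
    time units without an update, i.e. it is the last version of its group.
    """
    ordered = sorted(timestamps)
    landmarks = []
    for current, following in zip(ordered, ordered[1:]):
        if following - current >= stability_gap:
            landmarks.append(current)
    if ordered:
        # The newest version closes the final group of updates.
        landmarks.append(ordered[-1])
    return landmarks


# Three bursts of edits separated by long quiet periods.
print(landmark_versions([0, 5, 9, 100, 103, 500], stability_gap=60))
```

Here the versions written at times 9, 103, and 500 end their bursts of edits, so they are the ones the heuristic would retain in the long-term history.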
7.11 Elephant's Versioning
The user can set the boundary between the recent history (every version is saved) and the old history (only landmark versions are saved).
File versions are named by combining the file pathname with a creation date and time.
Directories can be versioned as well.
–This allows recovery of deleted files.
Previous versions of a file or a directory are read-only.
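Naming a version by pathname plus time suggests a lookup like the one below. It is a sketch under assumptions: the class `VersionedFile`, the method `read_at`, and the rule that a time resolves to the newest version created at or before it are illustrative choices, not Elephant's documented interface.

```python
import bisect

class VersionedFile:
    """Versions of one pathname, addressed by creation time (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.times = []      # creation times, ascending
        self.contents = []   # contents[i] was created at times[i]

    def write(self, when, data):
        """Append a new version; older versions are never modified."""
        if self.times and when <= self.times[-1]:
            raise ValueError("creation times must increase")
        self.times.append(when)
        self.contents.append(data)

    def read_at(self, when):
        """Return the version that was current at time `when`."""
        i = bisect.bisect_right(self.times, when)
        if i == 0:
            raise FileNotFoundError(f"{self.path} did not exist at {when}")
        return self.contents[i - 1]


f = VersionedFile("/home/user/report.txt")
f.write(10, "draft")
f.write(20, "final")
print(f.read_at(15))   # the version created at time 10
```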
7.12 Retention Policies
Keep One: keeps only the latest version of the file.
–Identical to UFS and FAT.
Keep All: keeps all versions of the file.
–Useful for very important files.
Keep Safe: keeps all versions of the file during a specific period.
–Can be used for log files.
Keep Landmarks: keeps all versions of the file during a specific period and only landmark versions after that.
–Useful for ordinary user files.
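The four policies can be expressed as one filter over a file's version timestamps. This is a minimal sketch: the function `retained`, the policy strings, and the parameters `window` and `landmarks` are hypothetical, and always keeping the current version is an assumption not stated on the slide.

```python
def retained(versions, policy, now, window=0, landmarks=frozenset()):
    """Return the version timestamps a cleaner would keep.

    versions  -- creation times, oldest first
    window    -- length of the protected recent period (assumed parameter)
    landmarks -- timestamps already marked as landmark versions
    """
    if not versions:
        return []
    latest = versions[-1]          # assumption: the current version always stays
    if policy == "keep_one":
        return [latest]
    if policy == "keep_all":
        return list(versions)
    cutoff = now - window
    if policy == "keep_safe":
        return [t for t in versions if t >= cutoff or t == latest]
    if policy == "keep_landmarks":
        return [t for t in versions
                if t >= cutoff or t in landmarks or t == latest]
    raise ValueError(f"unknown policy: {policy}")


history = [10, 20, 30, 95, 99]
print(retained(history, "keep_landmarks", now=100, window=10, landmarks={20}))
```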
7.13 I-map
The I-map is a new structure that points to the I-node of the current version and to the vector of the old versions (the I-node log).
In addition, the I-map contains the file's policy. By default the policy is "keep one".
Blocks common to several versions can be pointed to by several I-nodes.
–Changes are detected at the block level.
New system calls have been added to handle the new file system's features.
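The relationship between the I-map, the I-node log, and shared blocks can be modelled roughly as below. It is an in-memory illustration, not Elephant's on-disk layout: the classes `Inode` and `Imap`, the method `write_block`, and the representation of blocks as integer IDs are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Inode:
    created: int     # creation time of this version
    blocks: list     # block IDs; unchanged blocks are shared between versions

@dataclass
class Imap:
    current: Inode
    policy: str = "keep_one"                   # default policy from the slide
    log: list = field(default_factory=list)    # I-node log of old versions

    def write_block(self, index, new_block_id, when):
        """Copy-on-write at block level: only the changed pointer differs."""
        blocks = list(self.current.blocks)     # copies pointers, not data
        blocks[index] = new_block_id
        if self.policy != "keep_one":
            self.log.append(self.current)      # retain the old version
        self.current = Inode(when, blocks)


imap = Imap(Inode(created=0, blocks=[1, 2, 3]), policy="keep_all")
imap.write_block(1, new_block_id=7, when=5)
shared = set(imap.log[0].blocks) & set(imap.current.blocks)
print(sorted(shared))   # blocks referenced by both versions
```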
7.14 Elephant's Performance
open() of an existing file and close() without flushing take almost the same run time as in a traditional UFS.
–close() with flushing is slower.
creat() in Elephant is slower.
–It must allocate an I-map on the disk in addition to the I-node.
unlink() in Elephant is faster.
–Old blocks are not released.
Elephant consumes much more disk space.
7.15 The Moraine File System
In 2000 Yamamoto suggested compressing the versioning data.
In addition, his versioning file system has software engineering tools:
–Moraine has a version viewer tool that runs in a separate window.
–Moraine can also tell how many lines and how many functions any version has.
7.16 The Version Viewer of Moraine
Rev is the version ID.
+n means that n lines were added, while -n means that n lines were deleted.
The line bar indicates the size of the file.
The user can put a remark in TAG.
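The +n and -n columns could be computed from two consecutive versions with a line diff. The sketch below uses Python's standard `difflib`; it is not Moraine's code, and the function name `line_delta` is an assumption.

```python
import difflib

def line_delta(old_text, new_text):
    """Return (added, deleted) line counts between two versions."""
    added = deleted = 0
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    for line in diff:
        if line.startswith("+ "):
            added += 1
        elif line.startswith("- "):
            deleted += 1
        # Lines starting with "  " (unchanged) or "? " (hints) are ignored.
    return added, deleted


added, deleted = line_delta("a\nb\nc", "a\nc\nd")
print(f"+{added} -{deleted}")
```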
7.17 CVFS
In 2003 Soules introduced the Comprehensive Versioning File System (CVFS).
CVFS keeps the versions of all files in a journal-based style.
–CVFS saves the changes, not the new data.
–To recreate an old version, each change is undone backward through the journal until the desired version is reached.
–Rather than saving the blocks that have been changed, CVFS keeps the bytes that have been changed.
CVFS is very efficient in disk space, but inefficient in recovery time.
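The journal-based idea on this slide can be illustrated with byte-level undo records. This is a toy model under stated assumptions, not CVFS's real data structures: the class `JournaledFile`, its record layout, and the restriction that writes may not start past the end of the file are all made up for the example.

```python
class JournaledFile:
    """Stores only the current data plus a journal of undo records."""

    def __init__(self, data=b""):
        self.current = bytearray(data)
        self.journal = []   # (offset, overwritten bytes, previous file length)

    def write(self, offset, new_bytes):
        """Overwrite bytes at `offset`, possibly extending the file."""
        old_len = len(self.current)
        if offset > old_len:
            raise ValueError("sketch does not support writes past end of file")
        end = offset + len(new_bytes)
        old_bytes = bytes(self.current[offset:end])   # only the changed bytes
        self.journal.append((offset, old_bytes, old_len))
        self.current[offset:end] = new_bytes

    def version(self, n):
        """Contents after the first n writes, rebuilt by undoing backward."""
        data = bytearray(self.current)
        for offset, old_bytes, old_len in reversed(self.journal[n:]):
            del data[old_len:]                                 # undo any growth
            data[offset:offset + len(old_bytes)] = old_bytes   # restore bytes
        return bytes(data)


f = JournaledFile(b"hello")
f.write(0, b"J")
f.write(5, b"!!")
print(f.version(0), f.version(1), f.version(2))
```

Each journal entry holds only the bytes that were overwritten, which is where the space saving comes from. Rebuilding version n walks every later entry, so recovery time grows with the number of changes since that version.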