1 Accumulative Versioning File System Moraine and Its Application to Metrics Environment Mame Tetsuo Yamamoto * Makoto Matsushita * Katsuro Inoue *,** * Osaka University, Japan ** Nara Institute of Science and Technology, Japan
2 Disc Capacity of 1995 $ Gbytes Linux (Slackware2.2 [Basic Applications + X window]): 0.15 Gbytes. Free space: 0.39 Gbytes.
3 Disc Capacity of 2000 $ Gbytes Linux (RedHat7 [Basic Applications + X window]): 1 Gbytes (6 times the 0.15 Gbytes of 1995). Free space: 39 Gbytes (100 times the 0.39 Gbytes of 1995).
4 Benefits of Large Disc Capacity Ordinary PC users: JPEG, MPEG, MP3, HTML, ... Our software engineers: ? We still use software product management schemes that were devised and developed under the limited disc capacity of the old days.
5 Archiving Software Product Evolution Five years ago, archiving was very expensive and practically impossible. Now it may be possible: current disks have enough space to archive the product evolution produced by each engineer. FreeBSD FreeBSD Current Archive Size 1Gbytes 100Mbytes 270Mbytes Develop using CVS 23 Release Versions
6 Development of Moraine and Mame Based on this motivation, we have devised two things. (1) An automatic SE product archive scheme -> versioning file system Moraine. Open structured, non-proprietary. (2) A quantitative development management framework -> metrics environment Mame (Moraine As a Metrics Environment).
7 Design Policies of Moraine Automatic Archiving: Moraine automatically records fine-grained versions of all files. Easy Operation: users are not required to learn how to use Moraine. To achieve this, ordinary file read/write operations are hooked, and the versioning work is performed by Moraine. Open Structure: Moraine does not require a proprietary data repository or proprietary versioning tools.
8 Features of Moraine Moraine always shows the user only the latest version in the repository. [Figure: the user accesses a surface view that contains the latest version of each file; the actual file structure in the repository holds the older and newer versions.]
9 Architecture of Moraine Layers: User, Kernel, HDD. Components: User Process, VFS (Vertical File System), VCD (Version Control Daemon), RCS, UFS (Unix File System), Version Management Sub-System (retrieve versions, show a delta), Control Commands. VFS is designed as a stackable file system: a logical file system layered above a physical file system such as UFS.
10 Stackable File System A stackable file system can be constructed without changing the existing file system. The output of a stackable file system is a file that is passed to the lower file system. A stackable file system is therefore portable.
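The stacking idea can be illustrated with a minimal user-level sketch. It is not Moraine's kernel code: the class names `LowerFS` and `VersioningLayer` are hypothetical, and an in-memory dictionary stands in for a physical file system such as UFS. The point is only that the upper layer exposes the same read/write interface, records a version on every write, and hands the file unchanged to the layer below.

```python
class LowerFS:
    """Stand-in for a physical file system such as UFS (in-memory, illustrative only)."""

    def __init__(self):
        self._files = {}

    def write(self, path, data):
        self._files[path] = data

    def read(self, path):
        return self._files[path]


class VersioningLayer:
    """Hypothetical stacked layer with the same interface as the layer below it."""

    def __init__(self, lower):
        self._lower = lower      # the lower file system is used as-is, not modified
        self._history = {}       # path -> list of every content ever written

    def write(self, path, data):
        # Record the new version, then pass the file down unchanged.
        self._history.setdefault(path, []).append(data)
        self._lower.write(path, data)

    def read(self, path):
        # Ordinary reads go straight to the lower layer: only the latest version is visible.
        return self._lower.read(path)

    def versions(self, path):
        # Extra operation for tools that want the recorded history.
        return list(self._history.get(path, []))


fs = VersioningLayer(LowerFS())
fs.write("main.c", "int main() {}\n")
fs.write("main.c", "int main() { return 0; }\n")
```

Because the lower layer is reached only through its own `read` and `write` operations, it could be replaced by another file system without touching the upper layer, which is the portability argument made on this slide.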
11 Architecture of Moraine VCD (Version Control Daemon): VCD is a daemon process that acts as a bridge between the kernel and the version management sub-system. The version management sub-system performs the version creation work, such as computing deltas between versions. Control commands help to manage system behavior.
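As a rough illustration of what "computing deltas between versions" means, the sketch below uses Python's standard `difflib` module. This is an assumption made for illustration only: the architecture slides list RCS as the underlying tool, and the helper names `make_delta` and `recover` are hypothetical.

```python
import difflib


def make_delta(old_text, new_text):
    """Return a line-based delta from which either version can be rebuilt."""
    return list(difflib.ndiff(old_text.splitlines(keepends=True),
                              new_text.splitlines(keepends=True)))


def recover(delta, which):
    """Rebuild version 1 (old) or version 2 (new) from a delta."""
    return "".join(difflib.restore(delta, which))


old = "int main() {\n}\n"
new = "int main() {\n    return 0;\n}\n"
delta = make_delta(old, new)
```

Storing one full version plus a chain of such deltas is what keeps an archive of many fine-grained versions small compared with storing every version in full.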
12 Evaluation of Moraine To check whether Moraine is acceptable in practice, we measured system performance and stored data size. Measurement environment: we compared raw UFS (UNIX file system) with Moraine. Machine: Pentium 166 MHz / 48 MB RAM. We deliberately used a low-performance computer to make the comparison more sensitive.
13 Evaluation -Read- We measured the response time for a UNIX process to read files.
14 Evaluation -Write- We measured the response time for a UNIX process to write files.
15 Evaluation -Write- We measured the response time for a UNIX process to write files. Moraine+Sync: the time until all write operations are completed.
16 Evaluation -Data Size- 1/2 We applied Moraine to a student project at our university. Table columns: Student1, Student2, Student3. Table rows: LOC, # of files, final file size (KB), # of all versions, # of versions of source code.
17 Evaluation -Data Size- 2/2
18 Overview of Mame Mame (Moraine As a Metrics Environment) is an infrastructure for quantitative product measurement of ordinary software development activities. Mame uses Moraine to collect metrics data.
19 Features of Mame Engineers do not need to change their working environments. Data on the various activities in the environment is collected automatically. Mame provides fine-grained data, which can be abstracted as needed. The data format is non-proprietary.
20 Architecture of Mame The development environment for software engineers is placed above Moraine. Development Environment: Metrics Tools, Version Viewer. Flows: Read/Write files, Development Activities, Analysis data. Moraine: VFS, VCD, UFS, RCS, Commands, HDD.
21 Experiment of Using Mame We applied Mame to the same student project that was used in the evaluation of Moraine. The metrics data was obtained after the project had ended. Metrics data: the number of files; the total lines of code in the files; the number of C functions in the source files.
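The three metrics listed above could be computed from an archived snapshot roughly as follows. This is a sketch under stated assumptions, not Mame's metrics tool: the function name `snapshot_metrics` is hypothetical, a snapshot is assumed to be a mapping from file path to file text, and the regular expression is only a rough heuristic for C function definitions (a definition that starts at the beginning of a line and is followed by an opening brace).

```python
import re

# Rough heuristic (assumption): "<type...> name(params) {" starting at column 0.
# It will miss some styles and can miscount unusual code; a real tool would parse C.
FUNC_DEF = re.compile(
    r"^[A-Za-z_][\w\s\*]*\b([A-Za-z_]\w*)\s*\([^;{}]*\)\s*\{",
    re.MULTILINE,
)


def snapshot_metrics(files):
    """Compute file count, total lines, and C function count for one snapshot."""
    n_functions = sum(
        len(FUNC_DEF.findall(text))
        for path, text in files.items()
        if path.endswith(".c")
    )
    return {
        "files": len(files),
        "lines": sum(len(text.splitlines()) for text in files.values()),
        "c_functions": n_functions,
    }


snapshot = {
    "util.h": "int add(int a, int b);\n",
    "util.c": (
        "int add(int a, int b) {\n"
        "    return a + b;\n"
        "}\n"
        "\n"
        "static void skeleton(void) {\n"
        "}\n"
    ),
}
metrics = snapshot_metrics(snapshot)
```

Applying the same function to every archived version, in order, would give one data point per version, which is the kind of series that a plot against the cumulative number of versions needs.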
22 Product Evolution 1/2 The horizontal axis is the cumulative number of versions for C source files 147
23 Product Evolution 2/2 The developer first created skeletons of the functions, then filled in the function bodies. 147
24 Discussions (Moraine) Moraine automatically collects all versions of all files that are created or modified. We can recover any old file and any old version if necessary. The performance of Moraine is sufficient for it to serve as a basis for software development platforms.
25 Discussions (Mame) No specific procedures are imposed on the developers. We can set up or change the data collection policy during the project or after it. Ordinary metrics environments require a predetermined data collection policy.
26 Related Work COMPAQ (DEC) OpenVMS: the file system is proprietary. ClearCase, PVCS, Visual Source Safe: users have to be aware of the systems and issue check-in/check-out commands for version management. 3D Filesystem: registering a new version requires supporting tools associated with the file system. Moraine does not require users to know about archiving or to issue any archiving commands.
27 Conclusion We presented Moraine, which automatically records the history of files. It is very practical. We also presented Mame, which collects data on various activities without burdening the developers. This approach is one way to take advantage of the rapid increase in disk space.
28 The End
29 Version Viewer