Paper: “Impact of Software Engineering Research on the Practice of Software Configuration Management” Authors: Estublier, Leblang, van der Hoek, Conradi, Clemm, Tichy, Wiborg-Weber Citation: ACM TOSEM, Oct 2005
The Impact Project Provide scientific scholarly answers to: – What impact has academic and industry research really had on the practice of software engineering? – What future impacts should be expected? – What future directions will software research take? How? – ACM Sigsoft project (international) – NSF and Sigsoft funding – EU, Japanese, private funding - Deliverables: journal articles, conference panels
Initial Subject Areas Reviews/Walkthroughs – Dieter Rombach/Dewayne Perry Configuration Management – Jacky Estublier Testing and Analysis – Lori Clarke/David Rosenblum Middleware – Wolfgang Emmerich Process/workflow/lifecycle models – Volker Gruhn Modern Programming Languages – Mary Lou Soffa/Barbara Ryder Requirements Engineering – Anthony Finkelstein/Axel van Lamsweerde Reverse Engineering – Hausi Muller Cost/Economic Models
How do they define Impact? The research must have been: 1. Published – publicly available, AND 2. Incorporated in actual SCM products that are (or were) on the market, commercially or free. Other impacts not considered: - people (graduates) - workshops and conferences
Software Configuration Management is… The discipline of managing change in large, complex software systems. Goals: manage and control corrections, extensions, and adaptations throughout lifetime of software system - Systematic and traceable software development process - Managing files and directories
In the beginning… 1950s Aerospace industry Colored punch cards 1960s Integrated within OS 1970s Separate discipline
Evolution of context of SCM Systems
SCM Spectrum of Functionality (Susan Dart, SCM-3, 1991)
Construction (create exec): Building, Snapshots, Regeneration, Optimization
Auditing (archive/rollback): History, Traceability, Logging
Components (keep track): Versions, Configurations, Baselines, Project contexts
Accounting (gather stats): Statistics, Status, Reports
Process (choose tasks): Lifecycle support, Task mgmt., Communication, Documentation
Controlling (track change): Access control, Change requests, Bug tracking, Partitioning
Team (conflicts): Workspaces, Merging, Families
Structure (how related): System model, Interfaces, Consistency, Selection
And these remain universally applicable – PL and App independent
Partition of SCM Approaches Product - Versioning - System Models and selection: support aggregate artifacts – configuration concept Tool - Workspace control: distributed users?, integration of change - Building: executable Process - Support for general development processes to manipulate artifacts
Approaches to Versioning Capture artifacts as configuration items Track relations among items in a version graph Edges: - revision-of – sequential development - variant-of – parallel development - merge – combines variants
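The version graph on this slide can be sketched in a few lines of Python. This is only an illustration, not how any particular SCM tool stores its history: the class name `VersionGraph`, the version labels, and the edge-kind strings are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class VersionGraph:
    """Hypothetical version graph: each version maps to its (kind, parent) edges."""
    edges: dict = field(default_factory=dict)

    def add(self, version, kind=None, *parents):
        # kind is one of "revision-of", "variant-of", "merge" (labels from the slide)
        self.edges[version] = [(kind, p) for p in parents]

    def ancestors(self, version):
        """Return every version reachable by following edges back to the root."""
        seen = []
        stack = [version]
        while stack:
            current = stack.pop()
            for _, parent in self.edges.get(current, []):
                if parent not in seen:
                    seen.append(parent)
                    stack.append(parent)
        return seen


g = VersionGraph()
g.add("1.0")                               # root version
g.add("1.1", "revision-of", "1.0")         # sequential development
g.add("1.1-win", "variant-of", "1.1")      # parallel development
g.add("1.2", "revision-of", "1.1")
g.add("1.3", "merge", "1.2", "1.1-win")    # merge joins the two lines

print(sorted(g.ancestors("1.3")))
```

A merge node simply has two parents, which is why the history is a graph rather than a tree.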
When disk space was scarce: Delta storage (e.g., SCCS, RCS): baseline + deltas Data compression Combination of delta and compression More accuracy in deltas: –Context-oriented –Operation-oriented –Semantics-oriented –Syntax-oriented But for generality -> classic line-based merging
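The "baseline + deltas" idea can be illustrated with a small line-based sketch. It uses simple forward deltas built with Python's standard `difflib`; the actual storage formats of SCCS and RCS differ in detail, and the function names `make_delta`, `apply_delta`, and `checkout` are hypothetical.

```python
from difflib import SequenceMatcher


def make_delta(old, new):
    """Line-based delta: the operations needed to turn `old` into `new`."""
    delta = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))      # reuse lines old[i1:i2]
        elif tag in ("replace", "insert"):
            delta.append(("add", new[j1:j2]))   # store only the new lines
        # "delete": nothing to store, the old lines are simply skipped
    return delta


def apply_delta(old, delta):
    """Rebuild the newer version from the older one plus a delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(old[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


def checkout(baseline, deltas, revision):
    """Rebuild revision N by applying the first N deltas to the baseline."""
    lines = list(baseline)
    for delta in deltas[:revision]:
        lines = apply_delta(lines, delta)
    return lines


v0 = ["begin", "return 0", "end"]
v1 = ["header", "begin", "return 0", "end"]
v2 = ["header", "begin", "print hi", "return 0", "end"]
deltas = [make_delta(v0, v1), make_delta(v1, v2)]

print(checkout(v0, deltas, 2) == v2)
```

Only the baseline and the added lines are stored; unchanged lines are referenced by position, which is where the space saving comes from.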
Advanced Versioning – Change sets How it works: –Each change stored as a delta independently from other changes –Allows more flexibility –Can combine changes as desired Not used in practice: –Deltas overlap/conflict – some combos do not work –For binary objects – cannot combine some deltas Too unwieldy for large projects with large changes
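A toy model makes the overlap problem concrete. Here a change set is assumed to be just a mapping from line index to replacement text, which is far simpler than a real delta; the function `combine` and the change-set names are illustrative only.

```python
def combine(baseline, *change_sets):
    """Apply independently stored change sets to a baseline, in the order given.

    Each change set is a (name, {line_index: new_text}) pair. Two change sets
    that rewrite the same line differently cannot be combined.
    """
    result = list(baseline)
    touched = {}  # line index -> name of the change set that last wrote it
    for name, changes in change_sets:
        for line_no, text in changes.items():
            if line_no in touched and result[line_no] != text:
                raise ValueError(
                    f"{name} conflicts with {touched[line_no]} on line {line_no}"
                )
            touched[line_no] = name
            result[line_no] = text
    return result


baseline = ["a", "b", "c"]
bugfix = ("bugfix.2", {0: "A"})
feature = ("feature-12", {2: "C"})
other = ("feature-13", {0: "X"})   # touches the same line as bugfix.2

print(combine(baseline, bugfix, feature))
```

Combining `bugfix` with `other` raises an error, which mirrors the slide's point that some combinations of deltas do not work.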
Alternative: Change Packages Task, Activity, package, Subproject,… track changes at logical level
Aggregating and Accessing Multiple Artifacts Data Models Early 1970s – SCCS and RCS = file system Since then – on top of commercial database systems Research systems: –Adele, 1985: active, OOP, versioned model –Object=any entity –Attribute=primitive, compound, predefined –Relations=model associations like derivation, dependency, composition More advanced commercial system: Aide-De-Camp 1989
Aggregating and Accessing Multiple Artifacts System Models – late 1970s onward –MIL (Module interconnection language) to describe system structure Model interfaces – provided, required functions Behaviors - Pre and post conditions Hierarchical construction of modules system architecture; UML –Integrate into SCM – users can manage real organization of software –Major Problem – keeping evolution of model and implementation versions in synch
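As a rough illustration of what a system model with provided and required interfaces buys you, the sketch below checks that every required function is provided by some module. The dictionary layout, the module names, and the function `unresolved` are assumptions for this example and do not reflect the syntax of any real MIL.

```python
# Hypothetical system model: each module lists the functions it provides
# and the functions it requires from other modules.
modules = {
    "lexer": {"provides": {"tokenize"}, "requires": set()},
    "parser": {"provides": {"parse"}, "requires": {"tokenize"}},
    "cli": {"provides": {"main"}, "requires": {"parse", "render"}},
}


def unresolved(modules):
    """Return, per module, the required functions that no module provides."""
    provided = set().union(*(m["provides"] for m in modules.values()))
    missing = {}
    for name, module in modules.items():
        gap = module["requires"] - provided
        if gap:
            missing[name] = sorted(gap)
    return missing


print(unresolved(modules))
```

A check like this only stays useful if the model is versioned together with the implementation, which is the synchronization problem the slide names.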
Aggregating and Accessing Multiple Artifacts Selection –How do I get a set of artifacts in my workspace without requesting them individually? –Default: All latest versions in workspace. Fetch the rest individually. –Other approaches: Hierarchical workspaces local, parent,… General queries –((status = approved) AND (owner = Jacky)) OR (date > ) Leverage change-sets –Baseline bug-fix bugfix.2 + feature-12 Rule-based 88, 94. –First, my checked-out versions –Otherwise, the latest versions on my branch –Otherwise, the latest versions on the main branch
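The three rule-based selection rules can be read as an ordered list of filters, where the first rule that matches anything wins. The sketch below assumes a made-up record layout (`artifact`, `number`, `branch`, `checked_out_by`) and a hypothetical `select` function; real tools express such rules in their own configuration languages.

```python
def select(artifact, versions, user, branch):
    """Pick one version of `artifact` by trying the rules in priority order."""
    candidates = [v for v in versions if v["artifact"] == artifact]
    rules = [
        lambda v: v["checked_out_by"] == user,   # 1. my checked-out versions
        lambda v: v["branch"] == branch,         # 2. latest on my branch
        lambda v: v["branch"] == "main",         # 3. latest on the main branch
    ]
    for rule in rules:
        matching = [v for v in candidates if rule(v)]
        if matching:
            return max(matching, key=lambda v: v["number"])
    return None


versions = [
    {"artifact": "foo.c", "number": 1, "branch": "main", "checked_out_by": None},
    {"artifact": "foo.c", "number": 2, "branch": "main", "checked_out_by": None},
    {"artifact": "foo.c", "number": 3, "branch": "bugfix", "checked_out_by": None},
    {"artifact": "bar.c", "number": 1, "branch": "main", "checked_out_by": None},
    {"artifact": "bar.c", "number": 2, "branch": "bugfix", "checked_out_by": "pete"},
]

print(select("foo.c", versions, user="ellen", branch="bugfix")["number"])
```

Calling `select` once per artifact populates a whole workspace without the user requesting versions individually.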
A Typical Development Scenario [Figure: a CM repository shared by Pete’s workspace (labeled CBA) and Ellen’s workspace (labeled ECD)]
Workspace Control 3 functions of the workspace: 1.Sandbox – freely edit. May be locks. 2.Building – expand compressed files, keep compiled/derived objects 3.Isolation – allow developer to make changes, compile, test, debug without interference
Workspace Control Classic SCCS and RCS Systems – no workspace management CVS – at first, scripts on top of RCS Need: avoid source file copies in 100s of workspaces Sun/Forte Teamware – manage projects of subprojects Virtual workspaces – copies only of the files being edited ClearCase – avoid recompiling sources on builds
Building Make – dependencies, date-based rebuild, fast. Improvement - Rebuild only if any source versions now in workspace are not exactly the same as in last build. –BOMs – bill of materials for each target object built Language-based smart rebuild: semantic changes and dependencies Winking-in (ClearCase) – language independent, reuse binaries across workspaces
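The bill-of-materials improvement over date-based rebuilds can be sketched as follows. Content hashes stand in for version identifiers here, which is an assumption of this example; the function names `fingerprint`, `bill_of_materials`, and `needs_rebuild` are likewise hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Stand-in for a version identifier: a hash of the source content."""
    return hashlib.sha256(content).hexdigest()


def bill_of_materials(sources):
    """Record exactly which source versions went into a build."""
    return {name: fingerprint(data) for name, data in sources.items()}


def needs_rebuild(sources, last_bom):
    """Rebuild only if the sources differ from those recorded in the last BOM."""
    return last_bom is None or bill_of_materials(sources) != last_bom


sources = {"main.c": b"int main(void) { return 0; }", "util.h": b"#define N 1"}
bom = bill_of_materials(sources)

edited = dict(sources)
edited["util.h"] = b"#define N 2"

print(needs_rebuild(sources, bom), needs_rebuild(edited, bom))
```

Unlike a timestamp comparison, this check is unaffected by a file being touched without its content changing, and it also notices when a source is swapped for an older version.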
Process Support Software Process = sequence of activities during creation and evolution -Change control: -Change request (requirement change) -Trouble report (malfunction issue)
Landmark Contributions with Great Impact (Academic / Industry)
1972 SCCS
1976 Diff
1977 Make
1980 Variants, RCS
1980 Change sets (Aide-de-Camp)
1982 Merging, and/or graph
1984 Selection
1985 System model
1st SCM workshop, process support, workspaces
1990 Virtual file system, multisite
1996 Activity-oriented SCM
Successful Transitions SCCS, Make, RCS – immediate, long-lasting impact Change sets – slow, impractical, but now a standard feature as change packages. Process support – advanced support for modeling and enforcing process Differencing/merging – binary deltas, not semantic-based Distributed/remote development – client-server protocol, web-based interfaces
Failed Transitions Semantic-based recompilation – language dependent Advanced systems models – more power than needed Generic platform - research has focused on managing source code only. Too much needed for extra artifacts.
Summary -High impact Research – useful, ease of use by developer, generality -Low impact Research – level of complexity too high, not easy to master idea as a feature
What is Next for SCM? How to fit SCM with rest of development process/tools? –Manage other artifacts beyond source code –May not be language independent
Recognizing a Valuable Resource: Mining Software Repositories Configuration management repositories are traditionally a “depot” –occasional roll-back –occasional search for relevant information But what if we used the information captured by configuration management repositories to our advantage –understanding software developers –helping software developers