Program Comprehension1 Program Comprehension During Software Maintenance and Evolution IEEE Computer, August 1995 Von Mayrhauser, A. and Vans, A. M.
Program Comprehension2 Outline l Overview l Mental models l Cognitive models l Tools l Research topics
Program Comprehension3 Program comprehension l Program comprehension is the study of how software engineers understand programs. l Program comprehension is needed for: Debugging Code inspection Test case design Re-documentation Design recovery Code revisions
Program Comprehension4 Program comprehension process l Involves the use of existing knowledge to acquire new knowledge about a program. l Existing knowledge: Programming languages Computing environment Programming principles Architectural models Possible algorithms and solution approaches Domain-specific information Any previous knowledge about the code l New knowledge: Code functionality Architecture Algorithm implementation details Control flow Data flow
Program Comprehension5 Comprehension techniques l Reading by step-wise abstraction Determine the function of critical subroutines, work through the program hierarchy until the function of the program is determined. l Checklist-based reading Readers are given a checklist to focus their attention on particular issues within the document. Different readers were given different checklists, therefore each reader would concentrate on different aspects of the document. l Defect-based reading Defects are categorized and characterized (e.g., data type inconsistency, incorrect functionality, missing functionality, etc.) A set of steps (a scenario) is then developed for each defect class to guide the reader to find those defects. l Perspective-based reading Similar to defect-based reading, but instead of different defect classes, readers have different roles (tester, designer and user) to guide them in reading.
Program Comprehension6 Reading by step-wise abstraction l Determine the function of critical subroutines, works through the program hierarchy until the function of the program is determined. l A bottom-up strategy – map the code to suggested problem domain activity. l Empirical validation (Basili & Selby ’87) The technique detects more software faults, and has a higher fault detection rate than functional or structural testing.
Program Comprehension7 Defect-based reading l Defects are categorized and characterized, a set of questions developed for each defect class to guide the reader. l Experiments conducted at the University of Maryland (Porter, Basili and Votta, ’95) suggest that defect-based reading is more effective to ad hoc reading.
Program Comprehension8 Perspective-based reading l Similar to defect-based reading, but instead of defects readers have different roles (tester, designer and user) to guide them in reading. l Experiments conducted at the University of Maryland (Laitenberger ’95) suggest that defect- based reading is more effective to ad hoc reading. l Perspective-based reading has been applied to the inspection of requirements documents.
Program Comprehension9 Sources of variation l Aside from the issue of how comprehension occurs, comprehension performance and effectiveness are affected by many factors: Maintainer characteristics Program characteristics Task characteristics
Program Comprehension10 Maintainer characteristics l Familiarity with code base l Application domain knowledge l Programming language knowledge l Programming expertise l Tool expertise l Individual differences
Program Comprehension11 Program characteristics l Application domain l Programming domain l Quality of problem to be understood l Program size and complexity l Availability and accuracy of documentation
Program Comprehension12 Task characteristics l Task type Experimental: recall, modification Perfective, corrective, adaptive, reuse, extension. l Task size and complexity l Time constraints l Environmental factors
Program Comprehension13 Outline l Overview l Mental models l Cognitive models l Tools l Research topics
Program Comprehension14 Models l Mental models Internal working representation of the software under consideration. l Cognitive models Theories of the processes by which software engineers arrive at a mental model. Program Mental Model Cognitive Model
Program Comprehension15 Mental models l Static elements Text structure knowledge Microstructure Chunks (macrostructure) Plans (objects) Hypotheses l Dynamic elements Strategies (chunking and cross-referencing) l Supporting elements Beacons Rules of discourse
Program Comprehension16 Text structure l The program text and its structure Control structure: iterations, sequences, conditional constructs Variable definitions Calling hierarchies Parameter definitions l Microstructure – actual program statements and their relationships.
Program Comprehension17 Chunks l Contain various levels of text structure abstractions. l Also called macrostructure. l Can be identified by a descriptive label. l Can be composed into higher level chunks.
Program Comprehension18 Plans (objects) l Knowledge elements for developing and validating expectations, interpretations, and inferences. l Include causal knowledge about information flow and relationships between parts of a program. l Programming plans Based on programming concepts. Low level: iteration and conditional code segments. Intermediate level: searching, sorting, summing algorithms; linked lists and trees. High level l Domain plans All knowledge about the problem area. Examples: problem domain objects, system environment, domain-specific solutions and architectures.
Program Comprehension19 Hypotheses l Conjectures that are results of comprehension activities that can take seconds or minutes to occur. l Three types: Why – hypothesize the purpose/rationale of a function of design choice. How – hypothesize the method for accomplishing a certain goal. What – hypothesize classification. l Hypotheses are drivers of cognition. They help to define the direction of further investigation. l Code cognition formulates hypotheses, checks them whether they are true or false, and revises them when necessary. l Hypotheses fail for several reasons: Can’t find code to support a hypothesis. Confusion due to one piece of code satisfying different hypothesis. Code cannot be explained.
Program Comprehension20 Supporting elements l Beacons Cues that index into existing knowledge. A swap routine can be a beacon for a sorting function. Experienced programmers recognize beacons much faster than novice programmers. Used commonly in top-down comprehension. l Rules of discourse Rules that specify programming conventions. Examples: coding standards, algorithm implementations, expected use of data structures.
Program Comprehension21 Mental models – dynamic elements l Strategies Sequences of actions that lead to a particular goal. l Actions Classify programmer activities implcitly and explicitly during a maintenance task. l Episodes Sequences of actions. l Processes Aggregations of episodes.
Program Comprehension22 Strategies l Guide the sequence of actions while following a plan to reach a goal. l Match programming plans to code. Shallow reasoning – do not perform in-depth analysis; stop upon recognition of familiar idioms and programming plans. Deep reasoning – perform detailed analysis. l Mechanisms for understanding Chunking Cross-referencing
Program Comprehension23 Chunking l Creates new, higher-level abstraction structures l Labels replace the detail of the lower level chunks.
Program Comprehension24 Cross-referencing l Map program parts to functional descriptions temp = a; a = b; b = temp; for (i=0; i<size; i++) if (array[i]==target) return true; swap sequential search
Program Comprehension25 Outline l Overview l Mental models l Cognitive models l Tools l Research topics
Program Comprehension26 Cognitive models l Letovsky l Shneiderman and Mayer l Brooks l Soloway, Adelson and Ehrlich l Pennington l Mayrhauser and Vans (Integrated)
Program Comprehension27 Letovsky model Prior knowledge Mental model: Specification layer Implementation layer Mapping Unresolved mappings Opportunistic assimilation process (top-down or bottom-up, as needed)
Program Comprehension28 Shneiderman model e.g., Programming languages e.g., Programming experience Programs are encoded into chunks Chunks are collected into a layered internal representation Developed for both design and comprehension activities.
Program Comprehension29 Brooks model Bridges the gap between problem domain & program domain Relate objects between different domains & representations Top-down process: hypotheses generated and successively refined from knowledge (triangles).
Program Comprehension30 Soloway model Also known as domain model. Developers familiar with domain can understand the code in top-down process. Strategic plans: global strategy used Tactical plans: local strategies Implementation plans: code fragments that implement tactical plans Help decompose goals into plans. Top-down decomposition of goals into plans and into lower level plans.
Program Comprehension31 Pennington model Use beacons Chunking Use beacons
Program Comprehension32 Integrated model
Program Comprehension33 Distributed cognition l Traditional cognitive models deal the cognitive processes inside one person’s brain. l On real projects, software developers: Work in teams Can ask people questions Can surf the web for answers l How do these affect the cognitive process?
Program Comprehension34 Outline l Overview l Mental models l Cognitive models l Tools l Research topics
Program Comprehension35 Generic tool pipeline
Program Comprehension36 Code browser example: SeeSoft
Program Comprehension37 Outline l Overview l Mental models l Cognitive models l Tools l Research topics
Program Comprehension38 Research topics l Empirical studies of cognitive models Larger systems (multiple programmers) Modern architectures Borrow techniques from psychology Use of recognition and recall as dependent variables l Tool support for comprehension tasks Information foraging Automated suggestions for program investigation Text mining application to code exploration Source code analysis Support for newer languages
Program Comprehension39 Additional references l Susan Elliott Sim. Research in Program Comprehension (UC Irvine lecture notes). l Jonathan Maletic. Program Comprehension & The Psychology of Programming (Kent State University lecture notes). l Richard Upchurch. Program Comprehension. From Software Process Resource Collection. UMass – Dartmouth, sion.html. sion.html