Towards aiding within-patch information foraging by end-user programmers Balaji Athreya, Chris Scaffidi Oregon State University.

Finding information in code is difficult
- Information Foraging Theory (IFT) describes two main costs that people incur when finding information in an information environment:
  - Navigation between “patches,” which for programmers can be thought of as methods connected by navigable links
  - Processing information “features” within patches, such as the words in the source code of methods

Strategies to help programmers find information in code
- Help programmers navigate between patches
  - By automatically generating useful links (e.g., Gilligan), by letting the user ask questions (e.g., WhyLine), or by supporting new kinds of search (e.g., CodeFinder)
  - Strategy has been extensively tested with professional and end-user programmers
- Help programmers understand patches
  - By highlighting code similarities (e.g., Jigsaw), or by summarizing important statements and eliding unimportant statements (e.g., Sridhara’s tool)
  - Strategy has been extensively tested with professional programmers
  - A similar strategy, narrowly, has been used to help end-user programmers (EUPs) find bugs

Tools based on the latter strategy apply heuristics to identify important statements
- For example, Sridhara’s algorithm: given a particular method…
  - Classify statements based on whether they…
    - Only invoke void-return methods
    - Invoke methods with names similar to the current method
    - Return from the current method
    - Facilitate data flow in particular ways among the statements above
    - Perform control flow (e.g., if, loop) in particular ways over the statements above
  - Identify statements as “important”
    - If they meet the criteria above
    - And they are not in an exception handler or variable initializer
    - And (for control flow statements) they do not have an empty “else” block
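The filtering step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Sridhara’s implementation: the `Statement` record and all of its boolean fields are hypothetical, and the sketch assumes some earlier analysis has already decided each flag. In particular, the slide only says data flow and control flow matter “in particular ways,” so those two criteria are reduced here to opaque flags.

```python
from dataclasses import dataclass


@dataclass
class Statement:
    """Hypothetical pre-analyzed statement; every field is an assumption."""
    text: str
    invokes_only_void_methods: bool = False
    invokes_similarly_named_method: bool = False
    is_return: bool = False
    feeds_data_to_candidate: bool = False  # stands in for the data-flow criterion
    controls_candidate: bool = False       # stands in for the control-flow criterion
    is_control_flow: bool = False
    has_empty_else: bool = False
    in_exception_handler: bool = False
    is_variable_initializer: bool = False


def meets_criteria(s: Statement) -> bool:
    """True if the statement matches at least one classification criterion."""
    return (
        s.invokes_only_void_methods
        or s.invokes_similarly_named_method
        or s.is_return
        or s.feeds_data_to_candidate
        or s.controls_candidate
    )


def is_important(s: Statement) -> bool:
    """Apply the three conditions listed under 'Identify statements as important'."""
    if not meets_criteria(s):
        return False
    if s.in_exception_handler or s.is_variable_initializer:
        return False
    if s.is_control_flow and s.has_empty_else:
        return False
    return True


def important_statements(method_body: list[Statement]) -> list[str]:
    """Return the text of every statement judged important, in source order."""
    return [s.text for s in method_body if is_important(s)]
```

Keeping the classification (`meets_criteria`) separate from the exclusions (`is_important`) mirrors the two-stage wording of the slide, which makes it easy to swap in a different heuristic for either stage.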

But does increasing the relative visual weight of important information features help EUPs?
- Perhaps no
  - Write-only code: EUPs’ code is often written to be used once
  - Nardi: “It is not clear whether users who modify existing example programs could ever really come to understand the programs they modify”
- Perhaps yes
  - EUPs often have less training than professional programmers
  - Maybe people who have low levels of training would benefit even more than professional programmers

Two prototypes: Each manipulates relative visual weight in a different way
- Both prototypes identify “important” statements
  - Sridhara’s algorithm, applied to TouchDevelop, with no interesting changes
- Prototype #1: Highlight important statements
  - Different colors depending on statement type
- Prototype #2: Hide unimportant statements
  - Display just the line number, for toggling hide/unhide
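To make the difference between the two views concrete, here is a minimal text-only sketch. It is not the TouchDevelop prototype code: the function names are hypothetical, a plain `>> ` marker stands in for the per-statement-type colors of prototype #1, and the set of important line numbers is assumed to come from an earlier analysis step.

```python
def render_highlight(lines, important, marker=">> "):
    """Prototype-1-style view: every line stays visible; important ones get a marker."""
    pad = " " * len(marker)
    return [
        f"{n:>3} {marker if n in important else pad}{text}"
        for n, text in enumerate(lines, start=1)
    ]


def render_hide(lines, important, unhidden=frozenset()):
    """Prototype-2-style view: unimportant lines collapse to their line number.

    `unhidden` holds line numbers the reader has toggled back to visible.
    """
    out = []
    for n, text in enumerate(lines, start=1):
        if n in important or n in unhidden:
            out.append(f"{n:>3} {text}")
        else:
            out.append(f"{n:>3}")  # only the line number remains as a toggle target
    return out
```

For example, with `lines = ["var x := 1", "x->post to wall", "return x"]` and `important = {2, 3}`, the highlight view keeps all three lines and marks the last two, while the hide view shows line 1 as a bare line number until it is added to `unhidden`.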

Study to evaluate the impact of these prototypes
- Between-subject design, with three treatments
  - #1: Using prototype 1 (highlighting important)
  - #2: Using prototype 2 (hiding unimportant)
  - #3: Using baseline (standard TouchDevelop tool)
- Key comparisons are #1 vs #3 and #2 vs #3
  - To understand the impact of each method of manipulating relative visual weight, independently, versus the baseline
  - The goal is not to pit the prototypes against one another

Task: Find information in 3 different programs
- Programs varied in size
  - To help clarify if any impacts were due to between-patch navigation
- Ecological validity
  - Size: a bit on the long side (earlier study: 75% of programs <= 100 LOC)
  - Purpose: quite typical (33% were utilities, 11% games, 7% images)

Program  Lines of code  Number of methods  Program purpose
Small    38             1                  Guessing game
Medium   122            5                  Setting a timer
Large    326            24                 Editing an image

Analyzing 3 measures of impact
- Created program comprehension questions based on our earlier study of EUPs’ historical TouchDevelop maintenance tasks
  - Approximately 1 question for every 50 lines of code, 6 total
- Measures of impact (per program)
  - Questions correct, time taken (sec), efficiency (ratio of these, scaled by 1000)
- Kruskal-Wallis nonparametric statistical test with a two-tailed ANOVA of each measure’s rank versus two factors: treatment and program
  - Robust to outliers, small samples, and non-normal distributions
  - So we can see the impact of the treatment (tool) while controlling for program size
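The efficiency measure and the rank transformation that precedes a rank-based test can be sketched as follows. Two assumptions are worth flagging: the slide does not state the direction of the ratio, so this sketch takes it to be questions correct per second (which would be consistent with the reported directions of the effects), and ties are given the average of their ranks, which is the usual convention but is not confirmed by the slides. The function names are illustrative, and the ANOVA itself is not reproduced here.

```python
def efficiency(questions_correct, seconds):
    """Assumed definition: questions correct per second, scaled by 1000."""
    return 1000.0 * questions_correct / seconds


def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks
```

The ranks, rather than the raw measures, would then be the response variable in the two-factor analysis (treatment and program), which is what makes the procedure less sensitive to outliers and non-normal data.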

Data from 30 pre-CS students
- Very early in their training (taking freshman or sophomore CS courses)
- Inclusion criteria
  - Understand English, adults, and never professional programmers
  - Had to drop data from 1 of the 31 participants (who didn’t know English)

Results: Highlighting helped, but hiding harmed; all effects were consistent across program size

Highlight
Measure     P value  Direction  F score
Score       0.01     +          7.35
Time        0.70     -          0.12
Efficiency  0.03     +          5.06

Hide
Measure     P value  Direction  F score
Score       0.08     -          3.15
Time        0.28     +          1.14
Efficiency  0.04     -          4.33

Further study could address threats to validity
- What if the algorithm doesn’t find truly important statements?
  - In this experiment, we manually verified that it did find all the statements needed for answering the questions.
  - But supposing some situation where it missed truly important statements…
    - Highlighting might not help as much… further study needed
    - Hiding probably would hurt even more (due to hiding truly important statements)
- Would these results generalize to EUPs “in the wild”?
  - In our experiment, participants were novices and EUPs
    - No study has really looked at novices vs EUPs… further study needed
  - Tasks were based on typical TouchDevelop tasks
    - But we didn’t study effects on tasks that use large data structures… further study needed

Conclusions and implications for theory
- Highlighting important information helped, but hiding unimportant information hurt
  - The “unimportant” statements apparently contributed to obtaining value from the important statements
  - Value of important statements was contingent on unimportant statements
- IFT currently lacks any notion of contingent information feature value
  - The optimal foraging path might not be the one that goes straight to the needed information. Sometimes, optimal != shortest.

Conclusions and implications for tool-building
- For coding: Highlighting important statements might help EUPs to understand their own code better and avoid creating bugs
- For version control: Highlighting important statements in “diffs” might help EUPs to better understand key differences between versions
- For debugging: Highlighting upcoming or past important statements during step-wise execution might aid understanding and finding bugs
- For reuse: Highlighting important statements in existing code might help EUPs understand its purpose and aid in deciding whether to reuse

Thank you…
- To you for your attention, interest, and ideas
- To the VL/HCC reviewers for your compliments and suggestions
- To Microsoft for lending TouchDevelop-equipped phones
- To the National Science Foundation for funding
