Code You Can Use: Searching for web automation scripts based on reusability James Admire, Abbas Al Zawwad, Abdulwahab Almorebah, Sanchit Karve, Christopher Scaffidi.

Presentation transcript:

Code You Can Use: Searching for web automation scripts based on reusability James Admire, Abbas Al Zawwad, Abdulwahab Almorebah, Sanchit Karve, Christopher Scaffidi Oregon State University

Online repositories of reusable end-user programmer (EUP) code offer many ways to find relevant code
- Keyword-based search: type keywords, receive a search result list of existing code available to reuse
- Browsing by category, e.g., based on thematic categories or tags
- “Related” code, e.g., by listing other code derived from a given piece of code

Finding high-quality EUP code to reuse is hard
- Download counters and similar auto-generated popularity counts
  - But hardly any code is ever downloaded more than a trivial number of times
- Explicit user-generated ratings of quality
  - But most code is never rated, certainly not by more than a few people
- Curated collections of “featured” code
  - But scalability and sustainability are perennial challenges for curators

CoScripter web macro repository as a microcosm
- Was one of the biggest repositories of web macros
  - Web macro = EUP script for automating browser interactions with web sites
  - More than 6000 web macros when I last saw this repository
- Prior studies showed hardly any macros were reused much
  - 9% run by 3 or more people
  - 7% run at least 6 times per user
  - 5% customized by any other user
  - 4% copied by any other user
- Ultimately: Discontinued by IBM
  - Sustainability is a challenge!!!

Prior work has shown it possible to predict which web macros would be reused
- But suppose a repository could predict from the moment of a macro’s creation whether it would be reused, so the search engine could emphasize or downplay the macro accordingly
- Prior work
  - Collected 35 features of macros that seemed plausibly related to the understandability and modifiability of the macros, plus measures of reuse
  - Trained machine learning models to predict which macros would be reused (one model per measure of reuse)
  - Result: True positive rates of up to 90% at false positive rates in the 10%-40% range
  - Similar results when replicated with two other repositories of EUPs’ code

Key limitations of that prior work
- Predicted reuse, not reusability: Users might reuse enticing but low-quality code and then regret it. Sometimes, reuse != reusability.
- Predicted binary measures: We would need to estimate level of reusability for sorting, not merely whether it will or will not be reused.
- Relied only on data available at macro creation: Data such as user-generated ratings might help inform reusability estimates.
- Provided no search engine: A proof of concept implementation would help to clarify any remaining technical hurdles.

Goal: An approach for modeling reusability of EUPs’ code, for use in sorting search results
1. Start with an existing repository that EUPs have used for a while
2. Define and compute features for EUPs’ macros in the repository
3. Reduce the feature set with factor analysis
4. Construct a model of reusability by linear regression of an expert user’s estimate of macro reusability versus the computed features
5. Sort macros by estimated reusability (at least in part) in search engine
6. Evaluate reusability estimates with another panel of experts as they use the search engine, and iterate the model in the search engine

Step 1: Getting a repository of EUPs’ code
- CoScripter
  - Already had been in operation for approximately 5 years (since early 2008)
  - Already well-familiar with the repository due to our prior work
  - Already had a well-developed list of candidate features due to prior work
  - Already had permission to scrape macros and other data from the repository

Step 2: Defining features for macros
- Selected 8 features from the 35 investigated in prior work
  - Statistically associated with reuse in both prior studies
  - Could be computed directly and automatically from available data
  - E.g., # comments, # parameters, 1 or 0 indicating if macro has a title
- Created 21 features as refinements of the 35 from prior work
  - Macro age, and 20 different counts of code length
- Created 8 new features based on new data suggesting user interest
  - Not previously considered, as these data accrue after macro creation
  - E.g., # times run, # users who ran it, # revisions, # comments about it
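To make Step 2 concrete, here is a minimal sketch of computing a handful of the features named above from one macro record. The `Macro` dataclass, its field names, and the rule that a line starting with `//` counts as a comment are all assumptions made for illustration; the slides do not show how CoScripter stored macros or how the authors extracted features.

```python
from dataclasses import dataclass, field


@dataclass
class Macro:
    """Hypothetical macro record; field names are assumptions, not CoScripter's schema."""
    title: str
    code: str
    run_user_ids: list = field(default_factory=list)  # one user id per recorded run
    revisions: int = 0


def compute_features(macro: Macro) -> dict:
    """Compute a few creation-time and usage-based features for one macro."""
    lines = [ln.strip() for ln in macro.code.splitlines() if ln.strip()]
    # Assumption: a line that starts with "//" is a comment.
    comment_lines = [ln for ln in lines if ln.startswith("//")]
    return {
        "has_title": 1 if macro.title.strip() else 0,
        "num_comments": len(comment_lines),
        "num_code_lines": len(lines) - len(comment_lines),
        "num_runs": len(macro.run_user_ids),
        "num_users_who_ran": len(set(macro.run_user_ids)),
        "num_revisions": macro.revisions,
    }


# Made-up example macro, for illustration only.
example = Macro(
    title="Search for houses",
    code="// open the listing site\ngo to \"example.com\"\nclick the \"Search\" button\n",
    run_user_ids=["u1", "u1", "u2"],
    revisions=2,
)
features = compute_features(example)
```

The first three features are available when the macro is created, while the run and revision counts accrue afterwards, which mirrors the distinction the slide draws between the older and the newly added features.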

Step 3: Reducing features data with factor analysis
- Factor = linear combination of features that are mutually correlated
- Procedure
  - Randomly selected 100 macros
  - Computed our 37 features for each macro
  - Performed factor analysis
  - Discarded all but the most salient factors (optimal coordinates method)
- Result: 8 factors containing 17 features
  - Most of these retained features were related to code size, comments, and numbers of runs (e.g., total count or normalized by number of users)
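The definition of a factor as a linear combination of correlated features can be illustrated with a small sketch. It does not perform factor analysis itself (estimating the loadings and selecting factors would normally be done with a statistics package); it only shows how a factor score is computed once loadings are known. The feature names, the values, and the loadings `0.7` and `0.3` are made up for illustration.

```python
from statistics import mean, pstdev


def standardize(column):
    """Convert raw feature values to z-scores (zero mean, unit variance)."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def factor_scores(features, loadings):
    """Score each macro on one factor.

    features: dict mapping feature name -> list of raw values (one per macro)
    loadings: dict mapping feature name -> weight (hypothetical values here)
    """
    z = {name: standardize(features[name]) for name in loadings}
    n = len(next(iter(z.values())))
    return [sum(loadings[name] * z[name][i] for name in loadings) for i in range(n)]


# Made-up data for three macros; both features grow together, so they
# plausibly belong to the same "code size" factor.
features = {
    "num_code_lines": [4, 10, 16],
    "num_comments": [0, 1, 2],
}
size_factor = {"num_code_lines": 0.7, "num_comments": 0.3}
scores = factor_scores(features, size_factor)
```

Standardizing first keeps a feature with a large numeric range (such as a line count) from dominating the combined score.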

Step 4: Constructing a model of reusability
- Linear regression of reusability estimates versus factors
  - From Step 3, we could compute 8 factor scores as linear combinations of features
  - But just because factors exist doesn’t mean they are actually related to reusability!
  - So: Linear regression with dependent variable = reusability estimate, 8 independent variables = factor scores
- Procedure
  - One team member (who did not help with defining or computing features) gave a reusability estimate (range 1-4) to each of the 100 web macros
- Result: Linear model that estimates reusability based on the features
  - Linear regression was highly significant (P=0.003)
  - 7 out of 8 factors had non-zero coefficients
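The regression in Step 4 can be sketched as ordinary least squares solved through the normal equations. This is an illustrative stand-in rather than the authors' analysis code, which the slides do not show; it uses two made-up factor scores per macro instead of eight, made-up ratings, and omits the significance test. The helper names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no check is made here).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_model(factor_rows, ratings):
    """Ordinary least squares; returns [intercept, coef_1, ..., coef_k]."""
    X = [[1.0] + list(row) for row in factor_rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, ratings)) for i in range(k)]
    return solve(xtx, xty)


def estimate_reusability(coefs, factor_row):
    """Apply the fitted model to one macro's factor scores."""
    return coefs[0] + sum(c * f for c, f in zip(coefs[1:], factor_row))


# Made-up training data: factor scores per macro and an expert's 1-4 rating.
# The ratings were generated as 2.5 + 0.5*f1 - 0.25*f2, so the fit is exact.
factor_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ratings = [2.5, 3.0, 2.25, 2.75, 3.25]
coefs = fit_linear_model(factor_rows, ratings)
new_estimate = estimate_reusability(coefs, (2, 0))
```

Once the coefficients are fitted on the rated sample, the same model can score every other macro in the repository from its factor scores alone, which is what makes it usable for sorting.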

Step 5: Searching for code based on reusability
- Code You Can Use (CYCU) (pronounced “cuckoo”)
  - Compute reusability estimates offline
  - When user enters query, forward query to CoScripter repository, get back a list of macros, look up reusability estimates, and sort by estimated reusability
[Diagram: keywords in, search results out; reusability estimates computed offline]
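The lookup-and-sort part of Step 5 might look like the following sketch. The macro ids, the `estimates` table, and the choice to place unscored macros last are assumptions for illustration; the call that forwards the query to the repository is not modeled, so the example starts from an already returned result list.

```python
def rank_by_reusability(result_ids, estimates, default=0.0):
    """Reorder search results so macros with higher estimated reusability come first.

    result_ids: macro ids in the order the repository returned them
    estimates:  dict of macro id -> reusability estimate computed offline
    default:    score used for macros without an estimate (an assumption)

    sorted() is stable, so ties keep the repository's original order.
    """
    return sorted(result_ids, key=lambda mid: estimates.get(mid, default), reverse=True)


# Hypothetical offline estimates and a hypothetical result list for one query.
estimates = {"m1": 2.1, "m2": 3.4, "m3": 2.1}
results = ["m1", "m2", "m3", "m4"]
ranked = rank_by_reusability(results, estimates)
```

Because the estimates are precomputed, the per-query cost is one dictionary lookup per result plus the sort.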

Step 6: Evaluating reusability estimates with another panel of expert users
- Using a different set of users than the one who gave initial estimates
  - Needed users who were pretty good at programming but who could approach CoScripter as an EUP tool rather than as a professional programming tool
  - 4 CS students, only one of whom had any experience as a professional programmer (<2 years), but all of whom were seniors or master’s students
- Using a different set of macros than those used to create the model
  - Manually reviewed CoScripter repository to see what was popular lately
  - Identified two themes: searching for houses and checking for flight information
- Each of the 4 participants rated 20 of the 40 test macros
  - i.e., 2 participants rated each test macro

We collected 2 user-assessed reusability measures and 1 user-assessed relevance measure
- Randomly ordered the macros and asked participants to rate (on a 4-point Likert scale)…
  - How helpful is this code in learning CoScripter?
  - How easy is it to understand the code?
  - How relevant is the code to the search term ‘search for houses’ [or ‘check airlines’]?
- We expected that our reusability estimates…
  - Would significantly correlate with learnability and understandability ratings
  - Would not significantly correlate with relevance ratings

Result: Significant correlations appeared on all three measures

Measure           | P         | F score   | Adj. R²
Learnability      | < [lost]  | [lost]    | [lost]
Understandability | < [lost]  | [lost]    | [lost]
Relevance         | [lost]    | [lost]    | [lost]

Regression of each measure for each macro (averaged over participants) against reusability estimate.
Note: Analysis utilized data for only 39 macros because one participant chose to skip a macro.
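The analysis described here (average each macro's ratings over participants, then regress that average against the reusability estimate) can be sketched as follows. The data are made up, the helper names are hypothetical, and the sketch reports only the fit and adjusted R², not the P value or F score from the authors' table.

```python
from statistics import mean


def average_ratings(ratings_by_macro):
    """Average the participants' ratings for each macro."""
    return {mid: mean(vals) for mid, vals in ratings_by_macro.items()}


def simple_regression(x, y):
    """Least-squares fit y = a + b*x; returns (a, b, r2, adjusted r2)."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)  # adjustment for one predictor
    return a, b, r2, adj_r2


# Made-up data: two participants' 1-4 ratings per macro, plus model estimates.
ratings_by_macro = {"m1": [1, 2], "m2": [2, 3], "m3": [3, 3], "m4": [4, 4]}
estimates = {"m1": 1.0, "m2": 2.0, "m3": 3.0, "m4": 4.0}

averaged = average_ratings(ratings_by_macro)
macro_ids = sorted(averaged)
x = [estimates[mid] for mid in macro_ids]
y = [averaged[mid] for mid in macro_ids]
a, b, r2, adj_r2 = simple_regression(x, y)
```

A macro that a participant skipped would simply have fewer ratings in its list, or be left out of `ratings_by_macro` entirely, as happened with one of the 40 test macros.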

Further work could address threats to validity and limitations of this study
- Different kinds of macros require different models of reusability
  - Indeed, our prior work showed different kinds of scripts require somewhat different features.
  - But the overall approach (compute features, combine features, validate) should methodologically generalize at least across textual scripting languages.
- More sophisticated methods might be better for sorting search results based on integrating relevance with reusability estimates
- Tool-builders might find this approach more onerous than we did
  - We built on hundreds of hours of our own prior work
  - Crucial work remains on overcoming barriers to tech transfer of EUP research

Exciting opportunities now exist for moving quality-based code search toward practice
- Key contributions
  - New approach for modeling reusability of EUPs’ scripts
  - Demonstration of how such a model can be used in a search engine
- Next steps
  - Elucidating and countering risks of users “gaming” the system by artificially boosting the apparent reusability of their code
  - Integrating reusability models into other, more sophisticated browsing and search methods (e.g., collaborative filtering or other search tools)
  - Investigating the impacts of applying this approach on day-to-day practice with a larger repository (e.g., impacts on learning by Scratch users)
  - Working with industry partners to apply this approach in their own repositories

Thank you
- To you for your attention, interest, and ideas
- To the VL/HCC reviewers for your compliments and suggestions
- To IBM for permission to scrape the CoScripter repository
- To the National Science Foundation for funding

CYCU Screenshots