Toped: Enabling End-User Programmers to Validate Data Chris Scaffidi, Brad Myers, Mary Shaw, Carnegie Mellon University, School of Computer Science,


Toped: Enabling End-User Programmers to Validate Data
Chris Scaffidi, Brad Myers, Mary Shaw
Carnegie Mellon University, School of Computer Science
http://www.cs.cmu.edu/~cscaffid

Problem
How can we enable EUPs to implement input-validation code?

End-user Programmers (EUPs)
Millions of end users are also programmers who create spreadsheets or web forms containing…

Company names, such as Microsoft
Room numbers, such as Wean Hall 4104
Campus phone numbers, such as 8-3564
Project numbers, such as 004.000.270999.99
Grant numbers, such as CCF-0613823
Email addresses, such as cscaffid@cs.cmu.edu

… and other kinds of inputs that are…

Short (usually in 1 spreadsheet cell or web form textfield)
Often ambiguously defined (a “valid” company name)
Often organization-specific (your validation rules may differ from mine!)
Sometimes application-specific

Pilot Study
In their own words, 4 administrative assistants described how to recognize American mailing addresses and university project numbers. They almost always described data as a hierarchy of named parts, such as describing a mailing address as a street address, city, state, and zip. This structurally resembled a context-free grammar (CFG), down to the point where sub-parts were so small that participants lacked names for them. At that point, participants used soft constraints to define sub-parts, such as saying that the street type usually is “Ave” or “St”, indicating that valid data occasionally violate these constraints. This stands in stark contrast to regexps and CFGs, which classify inputs as valid or invalid, with no shades of gray.

Prototype
Based on the pilot results, we designed a tool called Toped for implementing validation “formats”. Each format consists of named parts with constraints that can be often or always true. Toped accepts a set of examples, then infers a boilerplate format for EUPs to review and customize (Fig A). To support iterative refinement, a window allows EUPs to enter test strings.
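The idea of a format built from named parts, each carrying constraints that are "always" or "often" true, can be illustrated with a minimal Python sketch. This is not Toped's implementation: the class names `Constraint` and `Part`, the `validate` function, the whitespace-based splitting, and the three verdict labels ("valid", "questionable", "invalid") are all assumptions made for illustration. The street-type constraint follows the pilot study's example that the street type usually is "Ave" or "St".

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Constraint:
    description: str
    check: Callable[[str], bool]
    strength: str  # "always" (hard) or "often" (soft); labels are assumptions


@dataclass
class Part:
    name: str
    constraints: List[Constraint]


def validate(parts: List[Part], text: str) -> Tuple[str, List[str]]:
    """Check text against a format; return a verdict and readable messages.

    Hypothetical simplification: parts are separated by single spaces,
    whereas a real format would be parsed with a grammar.
    """
    tokens = text.split(" ")
    if len(tokens) != len(parts):
        return "invalid", [f"expected {len(parts)} parts, got {len(tokens)}"]

    verdict = "valid"
    messages = []
    for part, token in zip(parts, tokens):
        for c in part.constraints:
            if c.check(token):
                continue
            messages.append(f'{part.name} "{token}": {c.description}')
            if c.strength == "always":
                verdict = "invalid"  # a hard violation cannot be overridden
            elif verdict == "valid":
                verdict = "questionable"  # a soft violation is only a warning
    return verdict, messages


# A toy street-address format with one soft constraint on the street type.
STREET = [
    Part("street number", [Constraint("always is digits", str.isdigit, "always")]),
    Part("street name", [Constraint("always is letters", str.isalpha, "always")]),
    Part("street type", [
        Constraint('usually is "Ave" or "St"', lambda s: s in ("Ave", "St"), "often"),
    ]),
]

if __name__ == "__main__":
    for sample in ["4104 Forbes Ave", "4104 Forbes Blvd", "41o4 Forbes Ave"]:
        print(sample, "->", validate(STREET, sample))
```

The middle verdict is the point of the sketch: an input that breaks only an "often" constraint is flagged for review instead of being rejected, which a plain regexp or CFG cannot express.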
Toped converts the format to a CFG with constraints attached to the productions, then checks the strings against this constrained CFG. Toped’s integration with Microsoft Excel and Visual Studio (web form design tool) enables reuse of formats for validating spreadsheet and web form data. Our system identifies inputs that violate the CFG or constraints, then displays a human-readable message summarizing errors (Fig B). Users can override warnings in spreadsheets, as well as soft constraint violations in web forms.

Fig A: Editing a format in Toped

Evaluation: Usability Study
16 EUPs implemented validation to find typos in 3 kinds of data: phone numbers, street addresses, and company names. We randomly assigned them to use Toped or a comparison tool (Lapis). Toped EUPs completed more tasks (2.79 of 3, vs 1.75), found more typos (92% of typos, vs 32%), were more accurate overall (F1 of .74 vs .51), and were more satisfied with the tool (satisfaction questionnaire scale score 3.78 ≈ “somewhat satisfied” vs 3.00 = “Neutral”). These differences were significant at P<0.01, except for accuracy (F1). Also, Toped EUPs were faster and more accurate at our tasks than EUPs doing similar tasks in an earlier study that evaluated a regexp editor.

Future Work
Our evaluation only involved 3 formats, and EUPs might struggle to implement formats for other data. We will develop a repository where EUPs can publish and share formats, enabling us to collect formats and feedback from EUPs using formats in real applications.

Fig B: Human-readable descriptions of input errors

Funded by EUSES under ITR-0325273, and by NSF under CCF-0438929 and CCF-0613823.

