1
Tool Support for Data Validation by End-User Programmers
Christopher Scaffidi, Brad Myers, Mary Shaw
Carnegie Mellon University
2
Target audience: End-user programmers
In 2012, in American workplaces
–90 million computer end users
–55 million of whom will create spreadsheets, databases, or web applications
Introduction Topes Demonstration Conclusion
3
An input validation problem observed during contextual inquiry
Valid? “EDSH 225”
Questionable? “EDXH 225”
Valid but wrong format? “Smith 225”
Or obviously invalid? “Robotics Institute”
4
Underlying problem: abstraction mismatch
Tools support strings, ints, floats, sometimes dates.
Problem domain involves higher-level categories:
–University names
–Person names
–CMU phone numbers
–CMU room numbers
These data categories are:
–Short human-readable strings
–Multi-format
–Sometimes ambiguous (non-binary scale of validity)
–Often particular to certain groups of people
5
Limitations of existing approaches
Types do not support questionable values
Grammars do not, either, nor can they reformat
Information extraction algorithms rely on grammatical cues that are absent during validation
Cues, Forms/3, -calculus, Slate, pollution markers, etc., infer numerical constraints but not constraints on strings, nor are they platform-independent
6
New Approach: Topes
A tope = a platform-independent abstraction describing how to recognize and transform strings in one category of data
Greek word for “place,” because each corresponds to a data category with a natural place in the problem domain
7
A tope is a graph. Node = format, edge = transformation
Notional representation for a CMU room number tope…
Formal building name & room number: Elliot Dunlap Smith Hall 225
Colloquial building name & room number: Smith 225
Building abbreviation & room number: EDSH 225
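The graph view above can be sketched in a few lines of Python. This is only an illustration, not the Topes SDK: the format names ("formal", "colloquial", "abbreviation"), the choice of which edges exist, and the helpers `find_path` and `convert` are all assumptions made for the example. Only the three sample room strings come from the slide.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

# Building names taken from the slide; a real tope would cover many buildings.
FORMAL = "Elliot Dunlap Smith Hall"
COLLOQUIAL = "Smith"
ABBREVIATION = "EDSH"


def _swap_building(old: str, new: str) -> Callable[[str], str]:
    """Build a transformation that swaps the building name and keeps the room."""
    def transform(value: str) -> str:
        building, room = value.rsplit(" ", 1)
        if building != old:
            raise ValueError(f"expected building {old!r}, got {building!r}")
        return f"{new} {room}"
    return transform


# Edges of the graph: (source format, target format) -> transformation.
# Which edges exist is an assumption; the slide does not list them.
EDGES: Dict[Tuple[str, str], Callable[[str], str]] = {
    ("formal", "colloquial"): _swap_building(FORMAL, COLLOQUIAL),
    ("colloquial", "abbreviation"): _swap_building(COLLOQUIAL, ABBREVIATION),
    ("abbreviation", "formal"): _swap_building(ABBREVIATION, FORMAL),
}


def find_path(source: str, target: str) -> Optional[List[str]]:
    """Breadth-first search for a chain of formats from source to target."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for (src, dst) in EDGES:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None


def convert(value: str, source: str, target: str) -> str:
    """Apply the transformations along the path from source to target."""
    path = find_path(source, target)
    if path is None:
        raise ValueError(f"no transformation path from {source} to {target}")
    for src, dst in zip(path, path[1:]):
        value = EDGES[(src, dst)](value)
    return value
```

With these assumed edges, `convert("Elliot Dunlap Smith Hall 225", "formal", "abbreviation")` chains two transformations, going through the colloquial format on the way to the abbreviation.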
8
A tope is a conceptual abstraction. A tope implementation is code.
Each tope implementation has executable functions:
–1 isa: string → [0,1] function per format, for recognizing instances of the format (a fuzzy set)
–0 or more trf: string → string functions linking formats, for transforming values from one format to another
Validation function: φ(str) = max_f isa_f(str), where f ranges over the tope’s formats
–Valid when φ(str) = 1
–Invalid when φ(str) = 0
–Questionable when 0 < φ(str) < 1
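A minimal sketch of the validation function, assuming two hand-written isa functions for the room number example. The regular expressions, the one-entry whitelists, and the scores 0.5 and 0.3 are illustrative assumptions; the slides do not say how the tool computes intermediate scores. The names `phi` and `classify` are likewise hypothetical.

```python
import re
from typing import Callable, Dict

# One-entry whitelists built from the slide's examples (assumption: a real
# tope would know many more buildings).
KNOWN_ABBREVIATIONS = {"EDSH"}
KNOWN_COLLOQUIAL = {"Smith"}


def isa_abbreviation(value: str) -> float:
    """Score in [0, 1] for the 'building abbreviation & room number' format."""
    match = re.fullmatch(r"([A-Z]{2,4}) (\d{3,4})", value)
    if match is None:
        return 0.0
    # Right shape but unknown building: questionable rather than invalid.
    return 1.0 if match.group(1) in KNOWN_ABBREVIATIONS else 0.5


def isa_colloquial(value: str) -> float:
    """Score in [0, 1] for the 'colloquial building name & room number' format."""
    match = re.fullmatch(r"([A-Z][a-z]+) (\d{3,4})", value)
    if match is None:
        return 0.0
    return 1.0 if match.group(1) in KNOWN_COLLOQUIAL else 0.3


ISA: Dict[str, Callable[[str], float]] = {
    "abbreviation": isa_abbreviation,
    "colloquial": isa_colloquial,
}


def phi(value: str) -> float:
    """Validation function: the maximum isa score over all formats."""
    return max(isa(value) for isa in ISA.values())


def classify(value: str) -> str:
    """Map the score to the three outcomes named on the slide."""
    score = phi(value)
    if score == 1.0:
        return "valid"
    if score == 0.0:
        return "invalid"
    return "questionable"
```

Under these assumptions, "EDSH 225" and "Smith 225" classify as valid, "EDXH 225" as questionable, and "Robotics Institute" as invalid, matching the four cases from the contextual inquiry example.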
9
Today’s demonstration (using our latest version)
Create phone number tope
–Infer boilerplate from examples
–What are formats, parts, and constraints?
–Label parts, add/fix constraints, test in tool
–Validate spreadsheet data
–Transform spreadsheet data
Reuse phone number tope
–Create web application
–Attach tope-based validator, configure, execute
–Valid / invalid / questionable / valid-but-misformatted
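The four outcomes in the demonstration can be illustrated with a small phone number checker. This is a sketch under stated assumptions, not the tool's behavior: the two formats, the soft constraint (an exchange of 555 is treated as questionable), and the function names `parse`, `score`, and `check` are all made up for the example.

```python
import re
from typing import Optional, Tuple

Parts = Tuple[str, str, str]  # (area code, exchange, line number)

# Two hypothetical formats for the same category of data.
FORMATS = (
    ("dashed", re.compile(r"(\d{3})-(\d{3})-(\d{4})")),
    ("paren", re.compile(r"\((\d{3})\) (\d{3})-(\d{4})")),
)


def parse(value: str) -> Optional[Tuple[str, Parts]]:
    """Return (format name, parts) for the first format that matches."""
    for name, pattern in FORMATS:
        match = pattern.fullmatch(value)
        if match:
            return name, match.groups()
    return None


def score(parts: Parts) -> float:
    """Soft constraint (assumption): exchange 555 lowers the score to 0.5."""
    _area, exchange, _line = parts
    return 0.5 if exchange == "555" else 1.0


def render(parts: Parts, fmt: str) -> str:
    """Write the parts out in the requested format."""
    if fmt == "dashed":
        return "{}-{}-{}".format(*parts)
    return "({}) {}-{}".format(*parts)


def check(value: str, preferred: str = "dashed") -> Tuple[str, Optional[str]]:
    """Classify a value and, when possible, return it in the preferred format."""
    parsed = parse(value)
    if parsed is None:
        return "invalid", None
    name, parts = parsed
    normalized = render(parts, preferred)
    if score(parts) < 1.0:
        return "questionable", normalized
    if name != preferred:
        return "valid-but-misformatted", normalized
    return "valid", normalized
```

A spreadsheet or web form could call `check` on each cell or field: accept "valid" values, silently rewrite "valid-but-misformatted" ones to the normalized string, ask the user to confirm "questionable" ones, and reject "invalid" ones.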
10
Contributions highlighted today
A model for data...
–Short, human-readable strings
–Ambiguous categories
–Multiple formats
Implementation features:
–Inference of customizable formats from examples
–Soft constraints
–Human-readable error messages
–Validation code is reusable across platforms
11
Other contributions not highlighted today
Validating with topes (quantitatively) improves…
–Accuracy of validation
–Reusability of validation code
–Subsequent duplicate identification
Additional tool features:
–Inter-tope reference (i.e., “topes in topes”)
–Whitelists
–Various additional auto-transformation features
–Overriding auto-transformation with JavaScript
12
Validation and Tool Maturity
Expressiveness
–Have implemented dozens of topes
Usability
–Fast creation of accurate formats by users in study
Usefulness
–Integrated with Excel, Visual Studio, and an XML library
–Integrated by IBM & Univ. Nebraska into other tools
13
Thank You…
To Jeff Magee, Betty Cheng, Barbara Ryder, Margaret Burnett, and others at ICSE 2007 for early feedback
To NSF for funding
To ICSE 2008 for this opportunity to present
14
Available for download
http://www.cs.cmu.edu/~cscaffid/software.shtml
Or Google for "Topes SDK"