No application is an island: Using topes to transform strings during data transfer Atipol Asavametha, Prashanth Ayyavu, Christopher Scaffidi School of Electrical Engineering and Computer Science Oregon State University

2 Problem: Data heterogeneity among software components
Software components
– Created by autonomous stakeholders
– Differing data formats
– May switch to new formats without prior notice
Programmers
– Need to move data between elements automatically
End users
– Need to move data between elements manually
problem → approach → evaluation

3 Example: Exchanging person names
Today: John Smith
Tomorrow: Smith, John – unexpected format! Unanticipated need for “glue code” to reformat
Tomorrow: Lincolnshire MCC – questionable! Need to validate data, maybe trigger fail-over
Similar issues arise for data from users, external datasets, or the web.

4 Other examples of data format heterogeneity
Room numbers
– NSH 3103 vs Newell Simon Hall 3103
Stocks
– GOOG vs Google vs Google Corporation
Address lines
– 101 Main St. vs 101 MAIN STREET vs 101 Main Str.
Phone numbers
– … vs … vs (888) …
State names
– California vs CA vs Calif.

5 Insight: Exchange kinds of data (rather than particular formats)
[Figure: sample records holding the same kinds of data in differing formats, e.g. “John Smith … Main St. Pittsburgh, PA”, “Doe, Jane … Brooke Lane PITTSBURGH Pennsylvania”, “RAY TILL (404) … PITT ST PGH, Penna.”, “MR. ART COR … RED RUN RD. pittsburgh PA”, “JOHN SMITH (303) … MAIN ST Pittsburgh, PA”]

6 Insight: Exchange kinds of data (rather than particular formats)
Three loci for reformatting…
– Before transmitting (at the source component)
– After receiving (at the receiving component)
– Or along the way (in the connector itself)
Each component could be a database, web site, XML web service, desktop application, …

7 Use topes to reformat!
A tope = a platform-independent abstraction describing how to recognize and transform strings in one category of data
The name comes from the Greek word for “place,” because each tope corresponds to a data category with a natural place in the problem domain
Examples:
– Tope for person names
– Tope for university names (and abbreviations)
– Tope for North American phone numbers
– Tope for Oregon State University phone numbers

8 A tope is a graph. Node = format, edge = transformation
Notional representation for an OSU room number tope…
– Formal building name & room number: Kelley Engineering Center 1148
– Colloquial building name & room number: Kelley 1148
– Building abbreviation & room number: KEC 1148
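The graph view above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the format names (`formal`, `colloquial`, `abbrev`), the choice of which edges exist, and the helpers `find_path` and `transform` are all assumptions made for the example. It shows how a value can be moved between two formats that have no direct edge by chaining transformations along a path.

```python
from collections import deque

# Hypothetical names for one building; a real tope would cover many buildings.
FORMAL = "Kelley Engineering Center"
COLLOQUIAL = "Kelley"
ABBREV = "KEC"


def _swap_prefix(old, new):
    """Build a transformation that replaces one building-name prefix with another."""
    def trf(value):
        if not value.startswith(old + " "):
            raise ValueError(f"{value!r} does not start with {old!r}")
        return new + value[len(old):]
    return trf


# Edges of the graph: (source format, target format) -> transformation.
# Which edges exist is an assumption; the slide only names the three formats.
EDGES = {
    ("formal", "colloquial"): _swap_prefix(FORMAL, COLLOQUIAL),
    ("colloquial", "abbrev"): _swap_prefix(COLLOQUIAL, ABBREV),
    ("abbrev", "formal"): _swap_prefix(ABBREV, FORMAL),
}


def find_path(source, target):
    """Breadth-first search for the shortest chain of formats from source to target."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for (a, b) in EDGES:
            if a == path[-1] and b not in seen:
                seen.add(b)
                queue.append(path + [b])
    return None


def transform(value, source, target):
    """Apply each transformation along the path, one edge at a time."""
    path = find_path(source, target)
    if path is None:
        raise ValueError(f"no route from {source} to {target}")
    for a, b in zip(path, path[1:]):
        value = EDGES[(a, b)](value)
    return value
```

With these edges, `transform("Kelley Engineering Center 1148", "formal", "abbrev")` passes through the colloquial format and returns `"KEC 1148"`.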

9 A tope is a conceptual abstraction. A tope implementation is code.
Each tope implementation has executable functions:
– 1 isa: string → [0,1] function per format, for recognizing instances of the format (a fuzzy set)
– 0 or more trf: string → string functions linking formats, for transforming values from one format to another
Validation function: φ(str) = max_f isa_f(str), where f ranges over the tope’s formats
– Valid when φ(str) = 1
– Invalid when φ(str) = 0
– Questionable when 0 < φ(str) < 1
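A minimal Python sketch of such an implementation for a person-name tope follows. The regular expressions, the two format names, and the middle score of 0.5 are assumptions chosen for illustration; the slides do not say how the real library scores questionable strings. The sketch shows one `isa` function per format, `trf` functions linking the formats, and a validation function that takes the maximum `isa` score.

```python
import re

# Strict shapes: capitalized words only (an assumed notion of a "clean" name).
_WORD = r"[A-Z][a-z]+"
_FIRST_LAST = re.compile(rf"^({_WORD}) ({_WORD})$")
_LAST_FIRST = re.compile(rf"^({_WORD}), ({_WORD})$")
# Loose shapes: right punctuation, but any capitalization.
_LOOSE_FIRST_LAST = re.compile(r"^[A-Za-z]+ [A-Za-z]+$")
_LOOSE_LAST_FIRST = re.compile(r"^[A-Za-z]+, [A-Za-z]+$")


def isa_first_last(s):
    """Score how well s fits the 'First Last' format, in [0, 1]."""
    if _FIRST_LAST.match(s):
        return 1.0
    if _LOOSE_FIRST_LAST.match(s):
        return 0.5  # arbitrary "questionable" score, chosen for this sketch
    return 0.0


def isa_last_first(s):
    """Score how well s fits the 'Last, First' format, in [0, 1]."""
    if _LAST_FIRST.match(s):
        return 1.0
    if _LOOSE_LAST_FIRST.match(s):
        return 0.5  # arbitrary "questionable" score, chosen for this sketch
    return 0.0


def trf_first_last_to_last_first(s):
    """Transform 'John Smith' into 'Smith, John'."""
    first, last = s.split(" ")
    return f"{last}, {first}"


def trf_last_first_to_first_last(s):
    """Transform 'Smith, John' into 'John Smith'."""
    last, first = s.split(", ")
    return f"{first} {last}"


ISA_FUNCTIONS = {"first_last": isa_first_last, "last_first": isa_last_first}


def validate(s):
    """Validation function: the maximum isa score over all of the tope's formats."""
    return max(isa(s) for isa in ISA_FUNCTIONS.values())
```

Under these assumed rules, `validate("John Smith")` and `validate("Smith, John")` both give 1.0 (valid), `validate("Lincolnshire MCC")` gives 0.5 (questionable), and `validate("12345")` gives 0.0 (invalid).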

10 But will it really work?
For a range of different kinds of components, e.g.…
– Web service → application
– Application → web service
– Web site → web site
– Desktop application → web site
– … and other combinations?
How to specify which tope functions to invoke?
How much work will it be, in practice?

11 Case study propositions
– Most of the difficulties encountered will result from technologies other than topes.
– Topes will be able to perform the string transformations needed in a variety of situations.
– Topes will be useful at all three loci (before/during/after data transfer), though not necessarily in every combination of locus and architectural style.
– Using topes will simplify the code required to perform string transformations.

12 Case #1: Enhanced Windows clipboard

13 Case #2: Enhanced web macro tool
go to “…”
enter “Prashanth Ayyavu” into the “Full name” textbox
copy the “Full name” textbox
go to “…”
paste in “DAVID JAMES” format from “person name” into the “your name” textbox
(The CoScripter web macro tool already had copy/paste functionality; we just added the clauses for reformatting.)

14 Case #3: Web service library
XML
Jan-96 (203) /30/2007
TopeSheet
xpath:/mydoc/whatever/date{tope:url(
xpath:/mydoc/whatever/tel{tope:url(
Client Code
ItemLoader loader = ItemLoader.FromXml(xml);
ItemSet items = loader.Load("xpath:/*/tel");
List values = items.FormatAs(" ");
// overloaded methods let you override the topes and/or validate the data
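The idea of a tope sheet, a mapping from paths in an XML document to the tope that governs the values found there, can be approximated in Python with the standard library. This sketch does not reimplement `ItemLoader` or the authors' library. The sample XML, the made-up phone numbers, the dashed target format, and the names `TOPE_SHEET`, `format_phone_dashed`, and `load_formatted` are all assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical document; the phone numbers are invented for this example.
XML = """<mydoc><whatever>
  <tel>(203) 555-0100</tel>
  <tel>203.555.0101</tel>
</whatever></mydoc>"""


def format_phone_dashed(value):
    """Stand-in for a phone-number tope: reformat to NNN-NNN-NNNN."""
    digits = re.sub(r"\D", "", value)
    if len(digits) != 10:
        raise ValueError(f"not a 10-digit phone number: {value!r}")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


# Assumed stand-in for a tope sheet: element path -> reformatting function.
# Paths use ElementTree's limited XPath syntax, relative to the root element.
TOPE_SHEET = {"./whatever/tel": format_phone_dashed}


def load_formatted(xml_text, path):
    """Find every node at the path and reformat its text with the mapped tope."""
    root = ET.fromstring(xml_text)
    reformat = TOPE_SHEET[path]
    return [reformat(node.text) for node in root.findall(path)]
```

Calling `load_formatted(XML, "./whatever/tel")` returns `["203-555-0100", "203-555-0101"]`. The client code never names the incoming format, only the kind of data and the format it wants back.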

15 Summary of findings
Cases: 1. Clipboard, 2. Web macros, 3. Web services
Main sources of difficulty
– 1. Clipboard: Windows API
– 2. Web macros: Reading the CoScripter code; interfacing to our topes library
– 3. Web services: Web services becoming unavailable
Topes can handle the kinds of strings
– Yes
Topes useful at all three loci
– 1. Clipboard: Connector
– 2. Web macros: CoScripter component (acts as connector between websites)
– 3. Web services: Sender or receiver of data
Topes simplify reformatting code
– 1. Clipboard: Yes
– 2. Web macros: No… needed interface code
– 3. Web services: Yes

16 Conclusion
Software elements can use varying formats
– No explicit references to format identifiers
– No need for ontology consensus
Topes are reusable for data in…
– XML nodes
– Database tuples
– HTML tags
– Webform fields
– Spreadsheet cells
– …and more
Main challenge is interfacing to the library across languages

17 Thank You… To ICISA for this opportunity to participate