Accommodating Data Heterogeneity in ULS Systems Christopher Scaffidi Mary Shaw Carnegie Mellon University.

Presentation transcript:

Accommodating Data Heterogeneity in ULS Systems Christopher Scaffidi Mary Shaw Carnegie Mellon University

2 Problem: Data heterogeneity among software elements in ULS systems
Software elements:
– Created by autonomous stakeholders
– Differing data formats
– May switch to new formats without prior notice
End-user programmers:
– Create particularly unreliable software elements
– “Mash up” (integrate) software elements
problem → approach → proof-of-concept

3 Example: Exchanging person names
Today: John Smith
Tomorrow: Smith, John – unexpected format! Unanticipated need for “glue code” to reformat
Tomorrow: Lincolnshire MCC – questionable! Need to validate data, maybe trigger fail-over
Similar issues for data from users, external datasets, or the web.

4 Other examples of data format heterogeneity
Room Numbers – NSH 3103 vs Newell Simon Hall 3103
Stocks – GOOG vs Google vs Google Corporation
Address Lines – 101 Main St. vs 101 MAIN STREET vs 101 Main Str.
Phone Numbers – vs vs (888)
State Names – California vs CA vs Calif.
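The state-name case above can be sketched as a small lookup that treats every spelling as the same kind of data. This is only an illustration, not the authors' implementation: the table `STATE_FORMATS`, the format names (`full`, `postal`, `abbrev`), and the function `reformat_state` are assumptions, and the table holds only the one state shown on the slide.

```python
# Hypothetical table: one entry per state, one spelling per named format.
# Only the example from the slide is included.
STATE_FORMATS = {
    "california": {"full": "California", "postal": "CA", "abbrev": "Calif."},
}


def _index(table):
    """Map every known spelling (lower-cased) to the key of its state."""
    index = {}
    for key, formats in table.items():
        for spelling in formats.values():
            index[spelling.lower()] = key
    return index


_SPELLINGS = _index(STATE_FORMATS)


def reformat_state(value, target):
    """Return `value` rewritten in the `target` format, or None if unrecognized."""
    key = _SPELLINGS.get(value.strip().lower())
    if key is None:
        return None
    return STATE_FORMATS[key][target]
```

With this table, `reformat_state("Calif.", "postal")` returns `"CA"`, and an unknown spelling returns `None` so the caller can decide how to handle it.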

5 Insight: Exchange kinds of data (rather than particular formats)
John Smith Main St. Pittsburgh, PA
Doe, Jane Brooke Lane PITTSBURGH Pennsylvania
RAY TILL (404) PITT ST PGH, Penna.
MR. ART COR RED RUN RD. pittsburgh PA
JOHN SMITH (303) MAIN ST Pittsburgh, PA

6 Insight: Exchange kinds of data (rather than particular formats)
Needed: Metadata indicating a reusable abstraction for validating and reformatting each kind of string-like data.
– “I am sending you a string that I call a ‘phone number’, and here’s the code to validate it and reformat it”
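One way to picture such an abstraction is an object that bundles the formats of one kind of data with the code to validate and reformat it. The sketch below uses the person-name formats from the earlier example slide. The class names `Format` and `Tope`, the format names, and the regular expressions are all assumptions made for illustration; the check is purely syntactic and is not the authors' implementation.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Format:
    pattern: re.Pattern                  # recognizes strings written in this format
    render: Callable[[str, str], str]    # writes (first, last) in this format


# Two person-name formats; the names "first_last" and "last_first" are illustrative.
PERSON_NAME_FORMATS: Dict[str, Format] = {
    "first_last": Format(
        re.compile(r"(?P<first>[A-Za-z]+) (?P<last>[A-Za-z]+)"),
        lambda first, last: f"{first} {last}",
    ),
    "last_first": Format(
        re.compile(r"(?P<last>[A-Za-z]+), (?P<first>[A-Za-z]+)"),
        lambda first, last: f"{last}, {first}",
    ),
}


class Tope:
    """One kind of data: knows its formats and can validate and reformat strings."""

    def __init__(self, formats: Dict[str, Format]) -> None:
        self.formats = formats

    def parse(self, value: str) -> Optional[Tuple[str, str]]:
        """Return (first, last) if `value` matches any known format, else None."""
        for fmt in self.formats.values():
            match = fmt.pattern.fullmatch(value.strip())
            if match:
                return match.group("first"), match.group("last")
        return None

    def is_valid(self, value: str) -> bool:
        return self.parse(value) is not None

    def reformat(self, value: str, target: str) -> str:
        parts = self.parse(value)
        if parts is None:
            raise ValueError(f"not a recognized person name: {value!r}")
        return self.formats[target].render(*parts)


person_name = Tope(PERSON_NAME_FORMATS)
```

A receiver that expects `first_last` can call `person_name.reformat("Smith, John", "first_last")` without knowing which format the sender used, and can call `is_valid` first to reject strings that match no known format.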

7 Proof of concept: Exchanging XML and HTML
Data providers label XML/HTML nodes with a “tope”
– “This node is what I call a ‘phone number’, and here’s where you can find code to validate and reformat it.”
Each tope’s implementation is stored at a published URL
On receiving data, a system
– Downloads the tope implementation
– Executes it to validate and put data into desired format
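The receiving side of this exchange can be sketched roughly as follows, using the room-number example from the earlier slide. Everything named here is an assumption: the URL `https://example.org/topes/room` is a placeholder, the in-memory `TOPE_REGISTRY` stands in for downloading an implementation from its published URL, `LABELS` stands in for whatever mechanism the provider uses to label nodes, and the XML layout is invented for the example.

```python
import re
import xml.etree.ElementTree as ET

_ROOM = re.compile(r"(NSH|Newell Simon Hall) (\d+)")


def room_tope(value, target):
    """Validate a room number and return it in the 'short' or 'long' format."""
    match = _ROOM.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a recognized room number: {value!r}")
    building = "NSH" if target == "short" else "Newell Simon Hall"
    return f"{building} {match.group(2)}"


# Stand-in for implementations published at URLs (placeholder URL);
# a real receiver would download the implementation instead.
TOPE_REGISTRY = {"https://example.org/topes/room": room_tope}

# Provider-supplied labels: which nodes hold which kind of data.
LABELS = {"./office/room": "https://example.org/topes/room"}


def load_formatted(xml_text, path, target):
    """Find the labelled nodes, look up their tope, and validate and reformat each."""
    root = ET.fromstring(xml_text)
    tope = TOPE_REGISTRY[LABELS[path]]
    return [tope(node.text or "", target) for node in root.findall(path)]


SAMPLE = (
    "<staff>"
    "<office><room>NSH 3103</room></office>"
    "<office><room>Newell Simon Hall 3103</room></office>"
    "</staff>"
)
```

Calling `load_formatted(SAMPLE, "./office/room", "short")` yields both rooms in the short format even though the provider mixed formats; a value matching neither format raises `ValueError`, which is where a receiver could trigger fail-over.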

8 Sample code
XML
Jan-96 (203) /30/2007
TopeSheet
xpath:/mydoc/whatever/date{tope:url(
xpath:/mydoc/whatever/tel{tope:url(
Client Code
ItemLoader loader = ItemLoader.FromXml(xml);
ItemSet items = loader.Load("xpath:/*/tel");
List values = items.FormatAs(" ");
// overloaded methods let you override the topes and/or validate the data

9 Benefits of labeling strings with topes
Systems can detect invalid inputs
Software elements can use varying formats
– No explicit references to format identifiers
– No need for ontology consensus
Topes are reusable for data in…
– XML nodes
– Database tuples
– HTML tags
– Webform fields
– Spreadsheet cells
– …and more

10 Thank You…
To Jeff Magee, Betty Cheng, Barbara Ryder, Margaret Burnett, and others at ICSE 2007 for early feedback
To NSF for funding
To ULSSIS for this opportunity to participate