Intersection Schemas as a Dataspace Integration Technique. Richard Brownlow, Alex Poulovassilis. 8/21/2014



Contribution
- A new methodology for lightweight data integration in an incremental pay-as-you-go environment, based on the concept of “Intersection Schemas” and utilising bidirectional transformations at the schema level.
- Improvements on existing data integration workflows that increase the productivity of the incremental data integration process.
- A demonstrator and user interface to aid the data integrator.
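To make "bidirectional transformations at the schema level" concrete, here is a minimal sketch of a reversible transformation pathway. It is a toy model under stated assumptions: a schema is just a set of object names, the `Step` class and both helper functions are hypothetical and are not AutoMed's API, and the queries that real transformations carry are left out.

```python
from dataclasses import dataclass


# Toy model only: hypothetical names, not AutoMed's API.
@dataclass(frozen=True)
class Step:
    kind: str           # "add", "delete" or "rename"
    obj: str            # schema object the step acts on
    new_name: str = ""  # only used by "rename"

    def reverse(self) -> "Step":
        """Return the step that undoes this one."""
        if self.kind == "add":
            return Step("delete", self.obj)
        if self.kind == "delete":
            return Step("add", self.obj)
        if self.kind == "rename":
            return Step("rename", self.new_name, self.obj)
        raise ValueError(f"unknown step kind: {self.kind}")


def apply_pathway(schema: frozenset, steps: list) -> frozenset:
    """Apply each step in order to a schema modelled as a set of names."""
    objs = set(schema)
    for step in steps:
        if step.kind == "add":
            objs.add(step.obj)
        elif step.kind == "delete":
            objs.discard(step.obj)
        elif step.kind == "rename":
            objs.discard(step.obj)
            objs.add(step.new_name)
        else:
            raise ValueError(f"unknown step kind: {step.kind}")
    return frozenset(objs)


def reverse_pathway(steps: list) -> list:
    """Reverse a pathway: reverse each step and run them in opposite order."""
    return [step.reverse() for step in reversed(steps)]


# Illustrative object names (assumed, not taken from the slides).
source = frozenset({"protein", "peptide_hit"})
pathway = [Step("rename", "peptide_hit", "PeptideHit"), Step("add", "Experiment")]

target = apply_pathway(source, pathway)
restored = apply_pathway(target, reverse_pathway(pathway))
```

In this sketch, applying the pathway and then its reverse returns the original schema, which is the property that lets a pathway defined in one direction be used in the other.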

Intersection Schemas
- Implements a framework for incremental data integration.
- A component within the existing AutoMed data integration framework.
- Introduces a new “pay-as-you-go” technique, Intersection Schemas, which allows the integrator to incrementally identify intersections between schemas and integrate them into the Global Schema.

AutoMed Architecture

Data Integration via Union-compatible Schemas

Intersection Schema

Integrated Intersection and Extensional Schemas

Global schema derived from Intersection and Extensional Schemas

Case Study: iSpider
- Proteomics data from three different data sources.
- Mappings defined by domain experts.
- Mappings constitute the domain knowledge.

Illustrative Use Case
- Based on iSpider datasets.
- Three data sources: gpmDB, Pedro, Pepseeker.

Illustrative Use Case GUI

Workflow
1. Identify the extensional schemas representing the set of data sources to be integrated.
2. Create an initial federated schema from the schemas identified in Step 1.
3. Inspect the schemas identified in Step 1 and select two of them from which to derive an intersection schema.
4. Identify mappings between these two schemas and create an intersection schema.
5. The tool automatically creates a new Global Schema from the Intersection Schema and the extensional schemas. The user may optionally elect to drop any redundant objects from the new Global Schema.
6. The user may test the Intersection Schema or the Global Schema at this stage by running queries on it.
7. Repeat Steps 3 to 6 for each integration iteration.
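The workflow steps above can be sketched as a small loop. This is an illustrative toy model, not the tool's implementation: schemas are sets of object names, the functions `federated_schema` and `integrate` are hypothetical, and the object names inside each source are invented for the example (only the three source names come from the slides).

```python
# Hypothetical toy model of the workflow; not the AutoMed tool's API.

def federated_schema(sources: dict) -> set:
    """Step 2: prefix every object with its source name and take the union."""
    return {f"{src}.{obj}" for src, objs in sources.items() for obj in objs}


def integrate(global_schema: set, mappings: dict, drop_redundant: bool = True) -> set:
    """Steps 4-5: add one intersection object per mapping and, optionally,
    drop the source objects that the intersection object now covers."""
    result = set(global_schema)
    for name, covered in mappings.items():
        result.add(f"I.{name}")
        if drop_redundant:
            result -= set(covered)
    return result


# Step 1: extensional schemas (object names are assumed for illustration).
sources = {
    "gpmDB": {"protein", "peptide"},
    "Pedro": {"Protein", "PeptideHit", "Experiment"},
    "Pepseeker": {"proteinhit", "peptidehit"},
}

# Steps 3-4, one entry per iteration: the mappings the integrator identifies
# between the two schemas selected in that iteration.
iterations = [
    {"Protein": ["gpmDB.protein", "Pedro.Protein"]},
    {"PeptideHit": ["Pedro.PeptideHit", "Pepseeker.peptidehit"]},
]

schema = federated_schema(sources)   # Steps 1-2
history = [set(schema)]
for mappings in iterations:          # Step 7: repeat Steps 3-6
    schema = integrate(schema, mappings)
    history.append(set(schema))      # Step 6 would run test queries here
```

Each iteration integrates only the objects the integrator has chosen to map, so the Global Schema stays usable after every step and unmapped source objects remain available in their federated form.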

Evaluation
Comparison of the Intersection Schema methodology with a “classical” ladder-based integration methodology:
- Ladder-based integration: 95 manually defined transformations.
- Intersection Schema based integration: 26 manually defined transformations.
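The saving implied by these two counts can be worked out directly. The snippet below is only arithmetic on the figures reported above; the variable names are illustrative.

```python
ladder = 95        # manually defined transformations, ladder-based integration
intersection = 26  # manually defined transformations, Intersection Schemas

saved = ladder - intersection   # transformations no longer written by hand
reduction = saved / ladder      # fraction of the manual effort removed

print(f"{saved} fewer transformations ({reduction:.1%} reduction)")
# prints "69 fewer transformations (72.6% reduction)"
```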

Conclusions
- We have demonstrated the technique on a real-world data integration scenario and have seen that the number of user-defined steps required to perform the integration is significantly reduced compared to the original data integration methodology used by the domain experts on that project.
- We have shown how the AutoMed toolkit and bidirectional schema transformations can be used to underpin a new lightweight data integration technique within an incremental pay-as-you-go data integration process.

Future Work
- Extending the methodology so that intersections can be created between any number of source schemas at each iteration of the process, rather than just two as at present.
- Detailed user evaluations.

Any Questions?

Appendix
Example iSpider transformations from the original project.
