The Information Integration Wizard (Iwiz) Project: Report on Work in Progress. Joachim Hammer. Presented by Muhammed Al-Muhammed.


Introduction
- People use the Internet to find information of interest.
- This is easy if all the information is available in one place.
- But that is not the case today! The information of interest may be spread across multiple sources.
- So what can we do? If all sources used the same tools and data model to create and manage their data, finding the information of interest would no longer be a problem.

- But what if these sources use different tools, hardware, and software platforms to manage their data (heterogeneity at its peak)? What problems can arise? Some of them:
  - schematic problems
  - semantic problems
- So what can we do? The obvious solution is a tool that can overcome both the heterogeneity and the decentralization of the information sources. This is why data integration is important.

- So what benefits do users get from data integration tools? The greatest benefit is that the user does not have to worry about:
  1. what sources are available;
  2. where they are located;
  3. how the data is represented in each source;
  4. how each data source is queried.

Goals of the project
- Help users get information from heterogeneous sources.
- How do they achieve this goal? By building an integration system that uses a hybrid data warehousing / mediator approach.
- The warehouse stores frequently accessed data.
- The mediator supports on-demand queries when the data is not available in the warehouse.
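The hybrid approach can be pictured with a small sketch. This is not the Iwiz implementation: the class name `HybridIntegrator`, the `fake_mediator` function, and the policy of storing every mediator result in the warehouse are assumptions made for illustration (the slides only say that the warehouse holds frequently accessed data).

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


class HybridIntegrator:
    """Answer a query from the warehouse if possible, else ask the mediator."""

    def __init__(self, mediator: Callable[[str], List[Record]]) -> None:
        self.mediator = mediator                      # hypothetical on-demand query function
        self.warehouse: Dict[str, List[Record]] = {}  # stored query results
        self.mediator_calls = 0                       # counts on-demand queries

    def query(self, q: str) -> List[Record]:
        if q in self.warehouse:
            # Data already in the warehouse: no source access needed.
            return self.warehouse[q]
        # Not in the warehouse: fall back to an on-demand mediator query.
        self.mediator_calls += 1
        result = self.mediator(q)
        # Assumption: every result is stored; a real system would likely
        # store only frequently requested data.
        self.warehouse[q] = result
        return result


def fake_mediator(q: str) -> List[Record]:
    # Stand-in for querying the remote sources (made-up data).
    return [{"query": q, "title": "Example Title"}]


integrator = HybridIntegrator(fake_mediator)
first = integrator.query("titles about integration")
second = integrator.query("titles about integration")  # served from the warehouse
```

The second call returns the stored result without contacting the sources, which is the benefit the warehouse is meant to provide.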

What issues must be investigated in order to achieve these goals?
1. Common data model and representation, i.e., what data model can be used to represent the information in the integrated system? They chose XML for their system because it has some nice features, such as a clear separation of data and schema.

2. Defining a global schema that provides a representation of the relevant data tailored to the user's needs.
3. Semantic heterogeneities (a huge problem). What hurdles does heterogeneity cause?
  - understanding the meaning of the source data
  - relating it to the global schema
  - translating values from the source context to the target context
  - merging related data
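As an illustration of translating values from a source context to a target context, here is a minimal sketch. The record layout (names stored as "Last, First", prices stored in cents) and the function `to_target_context` are hypothetical and do not come from the Iwiz project.

```python
from typing import Dict

# Hypothetical source record: name as "Last, First", price in cents.
source_record = {"author": "Author, Ann", "price_cents": "1999"}


def to_target_context(rec: Dict[str, str]) -> Dict[str, str]:
    """Translate source values into the (assumed) conventions of the global schema."""
    # Source context: "Last, First"; target context: "First Last".
    last, first = [part.strip() for part in rec["author"].split(",", 1)]
    # Source context: integer cents; target context: decimal string.
    cents = int(rec["price_cents"])
    return {
        "author": f"{first} {last}",
        "price": f"{cents // 100}.{cents % 100:02d}",
    }


target_record = to_target_context(source_record)
# target_record == {"author": "Ann Author", "price": "19.99"}
```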

Heterogeneity is faced at three levels:
- System level: hardware, operating system.
- Data management level: differences in data models, access commands, etc.
- Semantic level: differences in the way related or similar data is represented in different sources.

How can the three levels of heterogeneity be overcome?
- The first two are overcome by translators and adapters.
- The third is the serious one! The following diagram gives some idea of the kinds of heterogeneity. (Diagram not reproduced in this transcript.)

So what can we do to deal with the heterogeneity problem?
- To overcome heterogeneity, a mapping is needed.
- The mapping can be done in two steps:
  1. Schema restructuring: eliminate syntactic and semantic inconsistencies between the source schema and the global schema.
  2. Schema merging: remove duplicates, remove inconsistent data, etc.
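The two mapping steps can be sketched on flat records. This is a simplified illustration under stated assumptions, not the Iwiz algorithm: the source names, the field mappings in `MAPPINGS`, and the sample data are all hypothetical, and "merging" is reduced here to dropping exact duplicates.

```python
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical mappings from each source's field names to the global schema.
MAPPINGS: Dict[str, Dict[str, str]] = {
    "source_a": {"writer": "author", "name": "title"},
    "source_b": {"author": "author", "book_title": "title"},
}


def restructure(source: str, rec: Record) -> Record:
    """Step 1: rename source fields to the global schema's field names."""
    mapping = MAPPINGS[source]
    return {mapping[k]: v for k, v in rec.items() if k in mapping}


def merge(records: List[Record]) -> List[Record]:
    """Step 2: drop duplicate records, keeping first-seen order."""
    seen = set()
    merged = []
    for rec in records:
        key = tuple(sorted(rec.items()))  # order-independent identity of a record
        if key not in seen:
            seen.add(key)
            merged.append(rec)
    return merged


records_a = [{"writer": "A. Author", "name": "Example Title"}]
records_b = [
    {"author": "A. Author", "book_title": "Example Title"},  # same item as in source_a
    {"author": "B. Writer", "book_title": "Another Title"},
]
unified = merge(
    [restructure("source_a", r) for r in records_a]
    + [restructure("source_b", r) for r in records_b]
)
```

After restructuring, the first record of each source becomes identical, so merging keeps only one copy and `unified` holds two records.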

4. Knowledge representation: a common metadata knowledge base for reasoning about the meaning of, and the relationships among, concepts.
To deal with these issues, they proposed a system called Iwiz.
- Iwiz architecture

Transformation from source XML → target-schema XML

Restructuring and merging
The goals are:
1. Generate rules for converting data from its native source representation → global schema.
2. Populate the global target schema with data.
How is the data restructured and merged?

Data restructuring and merging
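Since the system represents data in XML, a rule-driven conversion from a source document to the global schema might look like the sketch below. It uses only Python's standard `xml.etree.ElementTree`; the tag names and the `RULES` table are hypothetical, and real conversion rules would cover far more than tag renaming.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document.
SOURCE_XML = (
    "<library><book><writer>A. Author</writer>"
    "<name>Example Title</name></book></library>"
)

# Hypothetical conversion rules: source tag -> global-schema tag.
RULES = {
    "library": "catalog",
    "book": "publication",
    "writer": "author",
    "name": "title",
}


def convert(elem: ET.Element) -> ET.Element:
    """Apply the rename rules recursively, copying text and attributes."""
    out = ET.Element(RULES.get(elem.tag, elem.tag), dict(elem.attrib))
    out.text = elem.text
    for child in elem:
        out.append(convert(child))
    return out


# Populate the target schema with the converted source data.
target = convert(ET.fromstring(SOURCE_XML))
result = ET.tostring(target, encoding="unicode")
```

Tags without a rule are copied unchanged, which is one possible design choice; a stricter system could reject them instead.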

How is the data accessed? Queries
We distinguish two versions.
Version 1:

Version 1.5

Version 2 – not built yet!