Information Integration (contd.)


Information Integration (contd.)
Nikita Ramesh

Agenda
- Capability-based optimization
- Notations for describing source capabilities
- Examples
- Capability-based query plan selection
- Adding cost-based optimization

Capability-based optimization
- Query optimization: rewriting a query so that it can be executed efficiently.
- Cost-based optimization: the optimizer looks at all the possible ways (plans) in which a query can be executed. Each plan is assigned a cost, which indicates how efficiently it can be run.
- A cost-based optimizer picks the plan with the least cost and executes the query using that plan.

Capability-based optimization
- When a mediator is given a query, it has little knowledge of how long the sources will take to answer it.
- Often, a data source will answer only a limited subset of the queries that could be posed to it.
- Hence optimization cannot rely on cost measures alone, and capability-based optimization is used.
- The central issue is not cost, but whether a query plan can be executed at all.
- Only for the plans that are executable is the cost then estimated.

Capability-based optimization
Problem of limited source capabilities:
- Many sources have only a Web-based interface. These sources permit querying through a query form and do not accept SQL queries (e.g., Amazon).
- Legacy systems
- Security
- Indexing makes certain queries feasible, while others are too expensive.

Notations for describing source capabilities
- f (free): the attribute can be specified or not, as we choose.
- b (bound): we must specify a value for the attribute, but any value is allowed.
- u (unspecified): we are not permitted to specify a value for the attribute.
- c[S] (choice from set S): a value must be specified, and it must be one of the values in S.
- o[S] (optional, from set S): we either specify no value, or we specify one of the values in S.
- We place a prime on a code (e.g., f’) if the attribute is not part of the output of the query.
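The five codes can be read as a per-attribute rule for whether a supplied value is acceptable. The following is a minimal sketch of that reading, not part of the original slides: the names `Code`, `accepts`, and `query_allowed` are hypothetical, and `None` is assumed to mean "no value specified".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Code:
    kind: str                          # one of "f", "b", "u", "c", "o"
    allowed: frozenset = frozenset()   # the set S, used only by c[S] and o[S]
    in_output: bool = True             # False when the code carries a prime


def accepts(code: Code, value: Optional[str]) -> bool:
    """Return True if supplying `value` (None = not specified) obeys the code."""
    if code.kind == "f":               # free: anything goes
        return True
    if code.kind == "b":               # bound: some value is required
        return value is not None
    if code.kind == "u":               # unspecified: no value may be given
        return value is None
    if code.kind == "c":               # choice: a value from S is required
        return value in code.allowed
    if code.kind == "o":               # optional: nothing, or a value from S
        return value is None or value in code.allowed
    raise ValueError(f"unknown code {code.kind!r}")


def query_allowed(adornment: Sequence[Code], values: Sequence[Optional[str]]) -> bool:
    """Check a whole query, one supplied value (or None) per attribute."""
    return len(adornment) == len(values) and all(
        accepts(code, value) for code, value in zip(adornment, values)
    )
```

For instance, with the adornment `[Code("b", in_output=False), Code("u"), Code("o", frozenset({"yes", "no"}))]`, the query `["123", None, "yes"]` is allowed, while `[None, None, None]` is rejected because the first attribute is bound.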

Example 1
Cars(serialNo, model, color, autoTrans, navi)
- The user specifies a serial number. All other attributes are produced as output.
- Adornment: b’uuuu
- That is, the 1st attribute must be specified and is not part of the output. The other attributes must not be specified and are part of the output.

Example 2
Cars(serialNo, model, color, autoTrans, navi)
- The user specifies a model and a color, and perhaps whether or not automatic transmission and a navigation system are wanted. All attributes are printed for the matching cars.
- Adornment: ubbo[yes, no]o[yes, no]
- That is, the serial number must not be specified; the model and color must be specified; autoTrans and navi may each be left out or given as yes or no. Since no code carries a prime, all attributes are part of the output.
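The two Cars adornments can be decoded mechanically. The sketch below is illustrative only: the string syntax it parses (a code letter, an optional bracketed value list, an optional prime) is assumed from the notation on these slides, and the function names `parse_adornment` and `outputs` are hypothetical.

```python
import re

# One code letter, an optional [v1, v2, ...] list, and an optional prime.
TOKEN = re.compile(r"([fbuco])(?:\[([^\]]*)\])?(['’]?)")

ATTRS = ["serialNo", "model", "color", "autoTrans", "navi"]


def parse_adornment(text):
    """Split an adornment string into (kind, allowed_values, in_output) triples."""
    codes = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot parse adornment at position {pos}: {text!r}")
        kind, values, prime = match.groups()
        allowed = tuple(v.strip() for v in values.split(",")) if values is not None else ()
        codes.append((kind, allowed, prime == ""))   # a prime means "not in the output"
        pos = match.end()
    return codes


def outputs(adornment):
    """List the Cars attributes that a query with this adornment returns."""
    return [attr for attr, (_, _, shown) in zip(ATTRS, parse_adornment(adornment)) if shown]


print(outputs("b'uuuu"))                    # every attribute except serialNo
print(outputs("ubbo[yes, no]o[yes, no]"))   # all five attributes
```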

Capability-based query plan selection
- A capability-based query optimizer first considers what queries it can ask at the sources to help answer the mediator query.
- If we imagine those queries asked and answered, we have bindings for some more attributes, and these bindings may make some more queries at the sources possible.
- We repeat this process until either:
  1. We have asked enough queries at the sources to resolve all the conditions of the mediator query. Such a plan is called feasible.
  2. We can construct no more valid forms of source queries, yet we still cannot answer the mediator query. In this case the mediator must give up.
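The repeat-until loop described above amounts to a fixpoint computation over the set of attributes for which bindings are available. The following is a simplified sketch under stated assumptions: adornments are plain strings with one letter per attribute (primes and the value sets of c[S] and o[S] are ignored), a source can be queried once every attribute coded b or c is bound, and a query is assumed to yield bindings for all attributes of the relation. The function name and data layout are hypothetical.

```python
def find_feasible_order(subgoals, initially_bound):
    """Return the order in which sources can be queried, or None to give up.

    subgoals: dict mapping a relation name to (attributes, adornments), where
              each adornment is a string with one code letter per attribute.
    initially_bound: attributes for which the mediator query supplies a value.
    """
    bound = set(initially_bound)
    pending = dict(subgoals)
    order = []
    progress = True
    while pending and progress:
        progress = False
        for name, (attrs, adornments) in list(pending.items()):
            for codes in adornments:
                # Attributes that this form of source query requires a value for.
                needs = {a for a, c in zip(attrs, codes) if c in ("b", "c")}
                if needs <= bound:
                    order.append((name, codes))
                    bound |= set(attrs)      # answers bind the relation's attributes
                    del pending[name]
                    progress = True
                    break
    return order if not pending else None


# Hypothetical relations R(x, y) and S(y, z): S needs y, which only R can supply.
subgoals = {
    "S": (["y", "z"], ["bu"]),
    "R": (["x", "y"], ["bf"]),
}
print(find_feasible_order(subgoals, {"x"}))   # R first, then S
print(find_feasible_order(subgoals, set()))   # None: no source query is valid
```

With `x` bound, the first pass can only query R; that binds `y`, so the second pass can query S and the plan is feasible. With nothing bound, no valid source query can be formed and the function returns `None`, which corresponds to the mediator giving up.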

Capability-based query plan selection
Autos(serial, model, color)
Options(serial, option)
- ubf: adornment for Autos
- bu and uc[autoTrans, navi]: adornments for Options
Query: find the serial numbers and colors of Toyota models with a navigation system.
Two feasible plans:
  1. Specifying that the model is Toyota, query Autos and get the serial numbers and colors of all Toyotas. Then, using the bu adornment for Options, for each such serial number find the options for that car, and filter to make sure it has a navigation system.
  2. Specifying the navigation-system option, query Options using the uc[autoTrans, navi] adornment and get the serial numbers of all cars with a navigation system. Then query Autos, specifying the model Toyota, to get the serial numbers and colors of all Toyotas, and intersect the two sets of serial numbers.
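Both ways of answering the Toyota query can be tried on a toy data set. The sketch below is illustrative only: the rows in `AUTOS` and `OPTIONS` are made up, and each hypothetical source function exposes exactly one of the adornments listed on the slide (ubf for Autos with color left unspecified, bu and uc[autoTrans, navi] for Options).

```python
# Made-up source contents, for illustration only.
AUTOS = [("s1", "Toyota", "red"), ("s2", "Toyota", "blue"), ("s3", "Honda", "red")]
OPTIONS = [("s1", "navi"), ("s1", "autoTrans"), ("s2", "autoTrans"), ("s3", "navi")]


def autos_by_model(model):
    """Autos with adornment ubf: the model is given, the serial number is not."""
    return [(serial, color) for serial, m, color in AUTOS if m == model]


def options_by_serial(serial):
    """Options with adornment bu: the serial number is given."""
    return [option for s, option in OPTIONS if s == serial]


def serials_by_option(option):
    """Options with adornment uc[autoTrans, navi]: the option must come from the set."""
    if option not in ("autoTrans", "navi"):
        raise ValueError("option must be autoTrans or navi")
    return [serial for serial, o in OPTIONS if o == option]


def plan_autos_first():
    # Get all Toyotas, then look up the options of each one and filter.
    return {(s, c) for s, c in autos_by_model("Toyota") if "navi" in options_by_serial(s)}


def plan_options_first():
    # Get all cars with navigation, get all Toyotas, and intersect on serial number.
    with_navi = set(serials_by_option("navi"))
    return {(s, c) for s, c in autos_by_model("Toyota") if s in with_navi}


print(plan_autos_first() == plan_options_first())
```

Both functions return the same set of (serial, color) pairs, but they differ in how many source queries they issue: the first makes one Options query per Toyota found, while the second always makes exactly two source queries.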

Adding cost-based optimization
- First the capabilities of the sources are examined and the feasible plans are found.
- Cost-based optimization is then applied to the feasible plans only: a cost is determined for each feasible plan, and the plans are compared on that cost.
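The two stages can be sketched as a filter followed by a minimum. This is a sketch under stated assumptions rather than a prescribed method: the function name `choose_plan`, the plan records, and the use of "number of source queries" as the cost measure are all hypothetical.

```python
def choose_plan(plans, is_feasible, estimate_cost):
    """Keep only the feasible plans, then return the cheapest (None if there are none)."""
    feasible = [plan for plan in plans if is_feasible(plan)]
    if not feasible:
        return None            # the mediator must give up
    return min(feasible, key=estimate_cost)


# Made-up candidates; the infeasible one is never chosen, however cheap it looks.
candidates = [
    {"name": "plan 1", "feasible": True, "source_queries": 3},
    {"name": "plan 2", "feasible": True, "source_queries": 2},
    {"name": "plan X", "feasible": False, "source_queries": 1},
]

best = choose_plan(
    candidates,
    is_feasible=lambda plan: plan["feasible"],
    estimate_cost=lambda plan: plan["source_queries"],
)
print(best["name"])
```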

Thank you!