Dynamic Database Integration in a JDBC Driver Terrence Mason and Dr. Ramon Lawrence Iowa Database and Emerging Application Laboratory University of Iowa.

Presentation transcript:

Dynamic Database Integration in a JDBC Driver Terrence Mason and Dr. Ramon Lawrence Iowa Database and Emerging Application Laboratory University of Iowa 7th International Conference on Enterprise Information Systems ICEIS 2005 Miami, Florida

Discuss the contributions of the JDBC driver
Review the architecture
Step through an example integration and query (partitioned TPC-H* dataset)
Review the experimental results
Demonstrate efficient database integration
* Presentation

Contributions to Database Integration
Standard API for integration (JDBC)
Automatic generation of a global view of integrated data sources
–Annotation done locally
–Common vocabulary (National Cancer Institute EVS)
–Scalable to build a global schema
Simple conceptual query language
Automatic join determination for queries
Allows evolution of data sources
Detects inconsistent data across sources

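Unity JDBC Driver Architecture
[Figure: a Java application sends semantic queries to the Unity JDBC driver and receives results. Inside the driver, an embedded database engine issues SQL over JDBC to the data sources DB 1, DB 2, ..., DB n.]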

Extending Standard JDBC API for Integration
Standard Java interfaces for a single database
JDBC connections extended to multiple databases
–Connection
–DriverManager
–Statement
–ResultSet

Java Code for JDBC Integration

import java.sql.*;

public class JDBCApplication {
    public static void main(String[] args) {
        String url = "jdbc:unity://sources.xml";
        // Load the UnityDriver class
        try {
            Class.forName("unity.jdbc.UnityDriver");
        } catch (ClassNotFoundException e) {
            System.exit(1);
        }
        // Conceptual query: no FROM clause and no joins
        String query = "SELECT Part.Name, LineItem.Quantity, Customer.Name "
                     + "WHERE Customer.Name='Customer_25'";
        // Open the connection, run the query, and print each row
        try (Connection con = DriverManager.getConnection(url);
             Statement stmt = con.createStatement();
             ResultSet rst = stmt.executeQuery(query)) {
            System.out.println("Part, Quantity, Customer");
            while (rst.next()) {
                System.out.println(rst.getString("Part.Name")
                        + "," + rst.getString("LineItem.Quantity")
                        + "," + rst.getString("Customer.Name"));
            }
        } catch (SQLException ex) {
            System.exit(1);
        }
    }
}

XML File to Reference Data Sources
<SOURCES>
<DATABASE>
jdbc:microsoft:sqlserver://IDEALAB5.cs.uiowa.edu:1433;DatabaseName=TPC;User=terry;Password=xxxxx
<DRIVER>com.microsoft.jdbc.sqlserver.SQLServerDriver</DRIVER>
<XSPEC>xspec/Order.xml</XSPEC>
</DATABASE>
<DATABASE>
jdbc:microsoft:sqlserver://IDEALAB3.cs.uiowa.edu:1433;DatabaseName=TPC;User=terry;Password=yyyyyy
<DRIVER>com.microsoft.jdbc.sqlserver.SQLServerDriver</DRIVER>
<XSPEC>xspec/Part.xml</XSPEC>
</DATABASE>
</SOURCES>

Order.xml file (XSpec)
XML document created semi-automatically
Schema information (extracted automatically from the database)
–Tables
–Fields
–Primary keys
–Foreign keys
–Joins
Annotation and scopes (added semi-automatically)
–Semantic names
–Scope of keys (global joins)
Values shown for the Order database (Microsoft SQL Server 2000)
–Table CUSTOMER, semantic name Customer
–Field C_CUSTKEY (int), semantic name Customer.Id
–Field C_NAME (varchar), semantic name Customer.Name
–Field C_NATIONKEY (int), semantic name Customer.Nation.Id
–Keys: C_CUSTKEY (scope Organization); C_NATIONKEY to NATION (scope Organization)
–Join CUSTOMER->NATION: FK__CUSTOMER__C_NATI__7A672E12 (CUSTOMER) to PK__NATION__6E01572D (NATION)

Build Global Schema On Local Database Annotations
Order Database
–customer(c_custkey, c_name, c_nationkey)
–orders(o_orderkey, o_custkey, o_orderdate)
–lineitem(l_orderkey, l_partkey, l_suppkey, l_linenum, l_qty)
–nation(n_nationkey, n_name, n_regionkey)
–region(r_regionkey, r_name)
Part Database
–part(p_partkey, p_name, p_mfgr)
–supplier(s_suppkey, s_name, s_nationkey)
–partsupp(ps_partkey, ps_suppkey)
–nation(n_nationkey, n_name, n_regionkey)
–region(r_regionkey, r_name)
Global Schema
–Part.Id, Part.Name, Part.Manufacturer
–Supplier.Id, Supplier.Name, Supplier.Nation.Id
–Order.Id, Order.Customer.Id, Order.Date
–LineItem.Linenumber, LineItem.Order.Id, LineItem.Quantity, LineItem.Part.Id, LineItem.Supplier.Id
–Customer.Id, Customer.Name, Customer.Nation.Id
–Nation.Id, Nation.Name, Nation.Region.Id
–Region.Id, Region.Name
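A minimal sketch of how a global schema could be assembled from local annotations, assuming each source's XSpec has already been reduced to a map from semantic name to local field. The class GlobalSchemaSketch and its methods are hypothetical names for illustration, not part of the Unity driver, and the sample data covers only a few of the fields listed above.

```java
import java.util.*;

// Hypothetical sketch: these names are illustrative, not the Unity driver's API.
class GlobalSchemaSketch {

    // Merges per-source annotations (semantic name -> local field) into a
    // global schema (semantic name -> list of "source.table.field" locations).
    static Map<String, List<String>> buildGlobalSchema(Map<String, Map<String, String>> sources) {
        Map<String, List<String>> global = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> source : sources.entrySet()) {
            for (Map.Entry<String, String> annotation : source.getValue().entrySet()) {
                global.computeIfAbsent(annotation.getKey(), k -> new ArrayList<>())
                      .add(source.getKey() + "." + annotation.getValue());
            }
        }
        return global;
    }

    // A small excerpt of the Order and Part annotations, used as sample input.
    static Map<String, Map<String, String>> sampleSources() {
        Map<String, String> order = new LinkedHashMap<>();
        order.put("Customer.Name", "customer.c_name");
        order.put("Nation.Name", "nation.n_name");
        Map<String, String> part = new LinkedHashMap<>();
        part.put("Part.Name", "part.p_name");
        part.put("Nation.Name", "nation.n_name");
        Map<String, Map<String, String>> sources = new LinkedHashMap<>();
        sources.put("Order", order);
        sources.put("Part", part);
        return sources;
    }

    public static void main(String[] args) {
        // Concepts annotated in both sources (Nation.Name) end up with two locations.
        buildGlobalSchema(sampleSources())
                .forEach((concept, fields) -> System.out.println(concept + " -> " + fields));
    }
}
```

Because each source is annotated independently against the shared vocabulary, adding a source only adds entries to the map. No pairwise schema matching is needed, which is consistent with the scalability claim on the contributions slide.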

Attribute Only SQL
Query language on concepts in the global schema
No FROM clause
–Tables not specified
Selection conditions on concepts
Order by
Query: SELECT Part.Name, LineItem.Quantity, Customer.Name WHERE Customer.Name = 'Customer# '
Global Schema
–Part.Id, Part.Name, Part.Manufacturer
–Supplier.Id, Supplier.Name, Supplier.Nation.Id
–Order.Id, Order.Customer.Id, Order.Date
–LineItem.Linenumber, LineItem.Order.Id, LineItem.Quantity, LineItem.Part.Id, LineItem.Supplier.Id
–Customer.Id, Customer.Name, Customer.Nation.Id
–Nation.Id, Nation.Name, Nation.Region.Id
–Region.Id, Region.Name

Query Processing Steps
Parse semantic query
–Validate concepts
–Create parse tree
Map concepts to fields in local databases
Determine joins to relate attributes in each local database
Build execution tree (relational algebra)
–Execute a sub-query on each local database
–Find global join or union to relate sub-queries
–Combine sub-queries into a single result set

Conceptual Query and Parse Tree
Conceptual Query: SELECT Part.Name, LineItem.Quantity, Customer.Name WHERE Customer.Name = 'Customer# '
Parse Tree:
SELECT
  Identifier: Part.Name
  Identifier: LineItem.Quantity
  Identifier: Customer.Name
WHERE
  Comparison_Op: =
    Identifier: Customer.Name
    String: 'Customer# '
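The parse and validate steps can be sketched as follows, assuming a query of the form SELECT concepts [WHERE condition] and a global schema given as a set of concept names. The class ConceptualQuerySketch is hypothetical, and a regular expression stands in for whatever parser the driver actually uses; the literal 'Customer_25' is taken from the Java example earlier in the deck.

```java
import java.util.*;
import java.util.regex.*;

// Hypothetical sketch: a regex stands in for the driver's real parser.
class ConceptualQuerySketch {
    final List<String> selectConcepts = new ArrayList<>();
    String whereClause = "";   // stays empty when the query has no WHERE clause

    // SELECT <concept list> [WHERE <condition>]; there is no FROM clause.
    private static final Pattern QUERY = Pattern.compile(
            "^\\s*SELECT\\s+(.*?)(?:\\s+WHERE\\s+(.*))?$",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static ConceptualQuerySketch parse(String query) {
        Matcher m = QUERY.matcher(query.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected SELECT ... [WHERE ...]");
        }
        ConceptualQuerySketch parsed = new ConceptualQuerySketch();
        for (String concept : m.group(1).split("\\s*,\\s*")) {
            parsed.selectConcepts.add(concept);
        }
        if (m.group(2) != null) {
            parsed.whereClause = m.group(2);
        }
        return parsed;
    }

    // Validation step: every selected concept must exist in the global schema.
    void validate(Set<String> globalSchema) {
        for (String concept : selectConcepts) {
            if (!globalSchema.contains(concept)) {
                throw new IllegalArgumentException("Unknown concept: " + concept);
            }
        }
    }

    public static void main(String[] args) {
        ConceptualQuerySketch q = parse(
                "SELECT Part.Name, LineItem.Quantity, Customer.Name "
              + "WHERE Customer.Name = 'Customer_25'");
        q.validate(new HashSet<>(Arrays.asList(
                "Part.Name", "LineItem.Quantity", "Customer.Name")));
        System.out.println(q.selectConcepts + " | " + q.whereClause);
    }
}
```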

Join Graph Construction
Graph represents joins for each local database
Edges directed as N:1 joins
Joins automatically extracted into the XSpec or added to the XSpec
Used to calculate joins for each sub-query

Database Join Graphs
[Figure: join graph of the Order database (LineItem, Order, Customer, Nation, Region) and join graph of the Part database (PartSupp, Part, Supplier, Nation, Region).]

Map the Concepts in query to Relations
Part.Name, LineItem.Quantity, Customer.Name
[Figure: join graphs of the Order database (LineItem, Order, Customer, Nation, Region) and the Part database (PartSupp, Part, Supplier, Nation, Region).]

Determine Local Joins
Steiner Tree Approximation Algorithm
Semantic Query: SELECT Part.Name, LineItem.Quantity, Customer.Name WHERE Customer.Name = 'Customer# '
Order Database: LineItem, Order, Customer
Part Database: Part
Global Join: LineItem.Part.Id is a foreign key to Part.Id
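The slides name a Steiner tree approximation but do not say which one. As a stand-in, the sketch below uses the common shortest-path heuristic: start from one required relation and repeatedly attach the nearest required relation that is still missing, using a breadth-first search over the join graph. The class JoinTreeSketch is hypothetical, and the graph is treated as undirected for connectivity even though the slides direct edges as N:1 joins.

```java
import java.util.*;

// Hypothetical sketch: shortest-path heuristic as a stand-in for the
// unspecified Steiner tree approximation used by the driver.
class JoinTreeSketch {
    private final Map<String, Set<String>> adjacent = new LinkedHashMap<>();

    void addJoin(String a, String b) {
        adjacent.computeIfAbsent(a, k -> new LinkedHashSet<>()).add(b);
        adjacent.computeIfAbsent(b, k -> new LinkedHashSet<>()).add(a);
    }

    // Returns join edges ("a-b") of a tree that connects all required relations.
    List<String> connect(List<String> required) {
        List<String> joins = new ArrayList<>();
        if (required.isEmpty()) return joins;
        Set<String> tree = new LinkedHashSet<>();
        tree.add(required.get(0));
        Set<String> missing = new LinkedHashSet<>(required);
        missing.removeAll(tree);
        while (!missing.isEmpty()) {
            // Breadth-first search outward from every relation already in the tree.
            Map<String, String> parent = new HashMap<>();
            Deque<String> queue = new ArrayDeque<>(tree);
            Set<String> seen = new HashSet<>(tree);
            String found = null;
            while (!queue.isEmpty() && found == null) {
                String current = queue.poll();
                for (String next : adjacent.getOrDefault(current, Collections.emptySet())) {
                    if (seen.add(next)) {
                        parent.put(next, current);
                        if (missing.contains(next)) { found = next; break; }
                        queue.add(next);
                    }
                }
            }
            if (found == null) throw new IllegalStateException("No join path to " + missing);
            // Walk back to the tree, then add the path's relations and joins.
            List<String> path = new ArrayList<>();
            String node = found;
            while (!tree.contains(node)) { path.add(node); node = parent.get(node); }
            for (int i = path.size() - 1; i >= 0; i--) {
                joins.add(node + "-" + path.get(i));
                node = path.get(i);
                tree.add(node);
                missing.remove(node);
            }
        }
        return joins;
    }

    // Join graph of the Order database from the slides.
    static JoinTreeSketch orderDatabase() {
        JoinTreeSketch g = new JoinTreeSketch();
        g.addJoin("lineitem", "orders");
        g.addJoin("orders", "customer");
        g.addJoin("customer", "nation");
        g.addJoin("nation", "region");
        return g;
    }

    public static void main(String[] args) {
        // LineItem.Quantity and Customer.Name map to lineitem and customer.
        System.out.println(orderDatabase().connect(Arrays.asList("lineitem", "customer")));
    }
}
```

For the example query, the query's concepts touch only lineitem and customer in the Order database, so the heuristic pulls in orders as the connecting relation. That matches the LineItem, Order, Customer set shown on the slide.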

Build Execution Tree (Relational Algebra)
Projection
–Concepts in SELECT portion of conceptual query
–Sub-query projections of required fields (global joins)
Selection
–WHERE conditions of conceptual query
Joins
–Determined from join graphs
–Global joins identified by key scopes

Sub-queries Sent to Each Local Database (SQL through JDBC)
Part Database:
SELECT P.P_NAME, P.P_PARTKEY FROM PART AS P
Order Database:
SELECT L.L_QUANTITY, C.C_NAME, L.L_PARTKEY FROM LINEITEM AS L, CUSTOMER AS C, ORDERS AS O WHERE C.C_NAME = 'Customer# ' AND O.O_CUSTKEY = C.C_CUSTKEY AND L.L_ORDERKEY = O.O_ORDERKEY
Selection condition: C.C_NAME = 'Customer# '
Local joins are determined from the join graphs.
The key fields P.P_PARTKEY and L.L_PARTKEY are added to the sub-queries so that the global join can be executed in the Unity driver.
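Once both sub-query results reach the driver, the global join on the part key can be sketched as below. The slides do not state which join algorithm the embedded engine uses; a hash join is assumed here for illustration. The class GlobalJoinSketch, the row layout, and the sample values are hypothetical.

```java
import java.util.*;

// Hypothetical sketch: a hash join is assumed; the slides do not name the
// join algorithm used by the Unity embedded database engine.
class GlobalJoinSketch {

    // partRows:  {p_name, p_partkey}              from the Part sub-query
    // orderRows: {l_quantity, c_name, l_partkey}  from the Order sub-query
    // Output rows follow the conceptual SELECT: Part.Name, LineItem.Quantity, Customer.Name
    static List<String> join(List<String[]> partRows, List<String[]> orderRows) {
        // Build phase: index part names by part key.
        Map<String, List<String>> namesByKey = new HashMap<>();
        for (String[] part : partRows) {
            namesByKey.computeIfAbsent(part[1], k -> new ArrayList<>()).add(part[0]);
        }
        // Probe phase: look up each order row's part key.
        List<String> result = new ArrayList<>();
        for (String[] order : orderRows) {
            for (String name : namesByKey.getOrDefault(order[2], Collections.emptyList())) {
                result.add(name + "," + order[0] + "," + order[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample rows, not TPC-H data.
        List<String[]> parts = Arrays.asList(
                new String[] {"bolt", "1"}, new String[] {"nut", "2"});
        List<String[]> orders = Arrays.asList(
                new String[] {"5", "Customer_25", "2"}, new String[] {"7", "Customer_25", "9"});
        join(parts, orders).forEach(System.out::println);
    }
}
```

The Part sub-query carries no selection condition, so every Part row crosses the network before the join runs in the driver.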

Operator Execution Tree
[Figure: operator execution tree. Sub-queries run on the Idealab5 and Idealab3 database servers; the Unity embedded database engine on the Idealab1 client (Unity driver) combines their results.]

Experimental Results
Dynamic integration is efficient and scalable
Minimal overhead
Multi-source query processing
–Competitive with single source execution
–Possible to execute queries on a global schema

Schema Integration Results
[Table: multiple copies of TPC-H, times in seconds. Columns: Number of Schemas, Integrate Schemas, Connect, Parse, Sub-queries, Total Time.]
Integration of schemas occurs in linear time in the number of schemas integrated.
Integration and connection execute only once, at start-up, not for each query.

Query Small Result Size (76 Tuples)
* Only 76 tuples are transported over the network for the single sub-query
* Separate requires the entire Part table to be imported into Unity for the join

Query Large Result Size (6,000,215 tuples)
* For this particular query, distributed execution on multiple computers ran faster than execution on a single database server, due to parallelism.

Conclusions
Integration is possible in a JDBC driver
Local annotation permits scalable integration
Minimal overhead to process queries
Query multiple databases on a global view
–No need to specify joins
–No requirement to know underlying schemas

Future Work
AutoJoin – Scalable inference engine for join determination
Improve global query inference
Sophisticated global query optimizer
Extend to support federated database queries
–No global schema
–Fully specified queries

Queries to Test Unity Performance (Data Labels on Charts)
TPC-H - Conceptual query executed through the Unity driver against a single source TPC-H database.
JDBC TPC-H - SQL query equivalent to the conceptual query, executed directly through the SQL Server JDBC driver on a single source TPC-H database.
Partitioned on One Computer - Conceptual query executed on the TPC-H data set virtually partitioned into the Part and Order databases