WHY NOT USE FEDERATED APPROACH FOR DATABASE MANAGEMENT SYSTEM (DBMS)? Yan Cui, ITK478 Position paper.

Presentation transcript:

WHY NOT USE FEDERATED APPROACH FOR DATABASE MANAGEMENT SYSTEM (DBMS)?
Yan Cui, ITK478 Position paper

CRUCIAL ISSUES IN ENTERPRISES
- “…organizations merge or takeover since the existing systems have been designed for different corporate needs, the resulting enterprise will have to face information inconsistency, heterogeneity and incompatible overlap”. Wijegunaratne, Fernandez and Valtoudis in [1]
- “…a large modern enterprise, it is also inevitable that …use different database systems to store and search their critical data. Competition, evolving technology, mergers, acquisitions, geographic distribution, and … decentralization of growth…” Haas and Lin in [2]

DBMS APPROACHES
- Compare two major approaches
  - Federated database system approach
  - Distributed database system approach
- Comparison of their architectures/designs, transparency, integration, autonomy, and others.

DISTRIBUTED DBMS
- Definition of distributed database (DDBS) and distributed database management system (DDBMS)
- Centralized and distributed databases conversion
- Distributed DBMS design

DISTRIBUTED DBMS (CONT)
Definition of distributed database (DDBS) and distributed database management system (DDBMS)
- Distributed database: “a collection of multiple, logically interrelated databases distributed over a computer network”, M. Özsu and P. Valduriez in [3]
- Distributed DBMS: “the software system that permits the management of the DDBS and makes the distribution transparent to the users”, M. Özsu and P. Valduriez in [3]

DISTRIBUTED DBMS (CONT)
Centralized and distributed databases conversion
- Compared with a centralized database, a distributed DBMS offers “local autonomy, improved performance, improved reliability/availability, economics, expandability, and shareability” [3].
Fig. 1 - Central Database on a Network [3]
Fig. 2 - DDBS Environment [3]

DISTRIBUTED DBMS (CONT)
Distributed DBMS design
- F. A. Baião, M. Mattoso and G. Zaverucha in [4]: “Distribution design involves making decisions on the fragmentation and placement of data across the sites of a computer network”
- Two decisions:
  - Fragmentation
  - Allocation

DISTRIBUTED DBMS (CONT)
Distributed DBMS design – Fragmentation
- Defined as “clustering fragments the information accessed simultaneously by applications” [4].
- Types:
  - vertical fragmentation
  - horizontal fragmentation
  - mixed fragmentation

DISTRIBUTED DBMS (CONT)
Distributed DBMS design – Fragmentation
- Horizontal fragmentation: class instances are distributed across fragments, so each horizontal fragment of a class contains a subset of the whole class extension [4]
  - Primary (round-robin, hash-partition, and range-partition)
  - Derived fragment
Fig. 3 - Round-robin [5]
Fig. 4 - Hash-partition [5]
Fig. 5 - Range partition [5]
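The three primary horizontal schemes named above can be sketched in a few lines. This is a minimal illustration, not the algorithm from [4] or [5]: the `students` rows, the integer `id` key, the `gpa` attribute, and the range bounds are all assumptions made up for the example.

```python
import bisect
from collections import defaultdict


def round_robin(rows, n):
    """Row i goes to fragment i mod n, spreading rows evenly."""
    fragments = defaultdict(list)
    for i, row in enumerate(rows):
        fragments[i % n].append(row)
    return dict(fragments)


def hash_partition(rows, n, key):
    """A row goes to fragment hash(key value) mod n.

    Integer keys are assumed here so the result is repeatable across runs.
    """
    fragments = defaultdict(list)
    for row in rows:
        fragments[hash(row[key]) % n].append(row)
    return dict(fragments)


def range_partition(rows, key, bounds):
    """A row goes to the fragment whose value range contains its key.

    `bounds` is a sorted list of split points; fragment k holds the
    values v with bounds[k-1] <= v < bounds[k].
    """
    fragments = defaultdict(list)
    for row in rows:
        fragments[bisect.bisect_right(bounds, row[key])].append(row)
    return dict(fragments)


# Hypothetical class extension: six student instances.
students = [
    {"id": 1, "gpa": 3.9}, {"id": 2, "gpa": 2.4}, {"id": 3, "gpa": 3.1},
    {"id": 4, "gpa": 1.8}, {"id": 5, "gpa": 3.5}, {"id": 6, "gpa": 2.9},
]

rr = round_robin(students, 3)
hp = hash_partition(students, 3, "id")
rp = range_partition(students, "gpa", [2.5, 3.5])
```

In each case every instance lands in exactly one fragment, so the union of the fragments gives back the whole class extension. Only range partitioning keeps rows with similar values together, which matters when queries select on a value range.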

DISTRIBUTED DBMS (CONT)
Distributed DBMS design – Fragmentation
- Horizontal fragmentation
  - Derived fragment
Fig. 5 - Range partition [5]

DISTRIBUTED DBMS (CONT)
Distributed DBMS design – Fragmentation
- Vertical fragmentation: attributes and methods are distributed across fragments, such as fragment 1 (name, GPA) and fragment 2 (address, bDate, picture) from the student class in Fig. 7
- Mixed fragmentation: a combination of vertical and horizontal fragmentation
Fig. 7 – Vertical fragmentation [5]
Fig. 8 – Mixed fragmentation [5]
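A small sketch may make the attribute split concrete. It uses the student attributes named on the slide (name, GPA, address, bDate, picture). The `id` key repeated in every fragment, the sample values, and the GPA predicate used for the mixed case are assumptions added for illustration and are not taken from [5].

```python
def vertical_fragment(records, key, attribute_groups):
    """Build one fragment per attribute group.

    Every fragment repeats the key so the original records can be
    rebuilt by joining the fragments on it.
    """
    return [
        [{key: r[key], **{a: r[a] for a in group}} for r in records]
        for group in attribute_groups
    ]


def reconstruct(fragments, key):
    """Join vertical fragments back together on the shared key."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


def mixed_fragment(records, key, predicate, attribute_groups):
    """Split horizontally by a predicate, then vertically within each part."""
    selected = [r for r in records if predicate(r)]
    rest = [r for r in records if not predicate(r)]
    return [vertical_fragment(part, key, attribute_groups)
            for part in (selected, rest)]


# Hypothetical student instances; "id" is an assumed key attribute.
students = [
    {"id": 1, "name": "Ann", "GPA": 3.8, "address": "12 Oak St",
     "bDate": "1986-04-02", "picture": "ann.jpg"},
    {"id": 2, "name": "Bob", "GPA": 2.6, "address": "7 Elm Ave",
     "bDate": "1985-11-19", "picture": "bob.jpg"},
]
groups = [["name", "GPA"], ["address", "bDate", "picture"]]

frag1, frag2 = vertical_fragment(students, "id", groups)
rebuilt = reconstruct([frag1, frag2], "id")
mixed = mixed_fragment(students, "id", lambda r: r["GPA"] >= 3.0, groups)
```

Repeating the key in each fragment is what makes the fragmentation lossless: without it there would be no way to tell which (name, GPA) pair belongs to which (address, bDate, picture) triple.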

DISTRIBUTED DBMS (CONT)
Distributed DBMS design – Allocation
- As described by M. Özsu and P. Valduriez in [3], allocation distributes all resources/fragments across the nodes/sites of a computer network.

FEDERATED DBMS
- Definition: all data sources, drawn from heterogeneous DBMSs, different locations, and relevant/irrelevant and structured/unstructured data, are federated and linked together into a unified system by the DBMS, as described by L.M. Haas, E.T. Lin and M.A. Roth in [6].
- Characteristics of a federated DBMS: transparency, heterogeneity, a high degree of function, extensibility, openness, autonomy, and optimized performance [2,6].

FEDERATED DBMS
DB2 architecture for database federation
- User-defined function (UDF) (scalar and table UDFs)
- Wrapper
Fig. 9 – DB2 architecture of database federation [6]

FEDERATED DBMS
DB2 architecture for database federation
- UDF: takes input parameters and returns either a scalar result or a table of data.
  - Scalar UDF: takes an SQL statement as input and returns a scalar result.
  - Table UDF: the other method, which produces a table as output from any referenced SQL statements.

Example 1 - Scalar UDF [6]
    Select db2mq.mqsend(a.headline)
    From Articles a
    Where a.article_timestamp >= CURRENT TIMESTAMP

Example 2 - Table UDF [6]
    Select a.first, a.last, a.phone, a.
    From TABLE(addressbook()) AS a, Company_Profiles c
    Where c.industry = 'FINANCIAL'
      AND c.revenue > 50000000
      AND c.name = a.company_name
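To try the UDF idea without a DB2 installation, the sketch below swaps in SQLite through Python's standard `sqlite3` module. It is an analogy, not the DB2 mechanism from [6]: `send_headline` is a hypothetical stand-in for `db2mq.mqsend`, the Python list `queue` stands in for a message queue, and all table contents are invented. Python's `sqlite3` module offers no way to register a table function, so the table UDF is approximated by loading the function's rows into a temporary table before the join.

```python
import sqlite3

queue = []  # hypothetical stand-in for a message queue


def send_headline(headline):
    """Scalar UDF: takes one value per row and returns one value."""
    queue.append(headline)
    return 1


def addressbook():
    """Stand-in for a table UDF: returns rows from a non-database source."""
    return [
        ("Ann", "Lee", "555-0100", "Acme Bank"),
        ("Bob", "Ray", "555-0101", "Tiny Shop"),
    ]


conn = sqlite3.connect(":memory:")
conn.create_function("send_headline", 1, send_headline)

conn.execute("CREATE TABLE Articles (headline TEXT, article_timestamp TEXT)")
conn.executemany(
    "INSERT INTO Articles VALUES (?, ?)",
    [("Old news", "2007-01-01"), ("Fresh news", "2007-10-01")],
)

# Scalar UDF called once for every row that passes the WHERE clause.
sent = conn.execute(
    "SELECT send_headline(a.headline) FROM Articles a "
    "WHERE a.article_timestamp >= ?",
    ("2007-09-01",),
).fetchall()

conn.execute(
    "CREATE TABLE Company_Profiles (name TEXT, industry TEXT, revenue INTEGER)"
)
conn.executemany(
    "INSERT INTO Company_Profiles VALUES (?, ?, ?)",
    [("Acme Bank", "FINANCIAL", 80000000), ("Tiny Shop", "RETAIL", 100000)],
)

# Table UDF approximation: materialize the function's rows, then join.
conn.execute(
    "CREATE TEMP TABLE addressbook "
    "(first_name TEXT, last_name TEXT, phone TEXT, company_name TEXT)"
)
conn.executemany("INSERT INTO addressbook VALUES (?, ?, ?, ?)", addressbook())
contacts = conn.execute(
    "SELECT a.first_name, a.last_name, a.phone "
    "FROM addressbook a, Company_Profiles c "
    "WHERE c.industry = 'FINANCIAL' AND c.revenue > 50000000 "
    "AND c.name = a.company_name"
).fetchall()
```

The point carried over from the slide is that the query author writes ordinary SQL in both cases, while the function hides where the data goes to or comes from.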

FEDERATED DBMS
DB2 architecture for database federation
- Wrapper: described as a “powerful and flexible infrastructure for federation” in [6] because it integrates both scalar UDF function and table UDF data

Example 3 – Wrapper [6]
    Select c.name, a.URL
    From Compounds c, Experiments e, Articles a
    Where e.result 0
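The wrapper idea can be illustrated with a toy federation layer. This is a sketch under stated assumptions rather than the DB2 wrapper architecture of [6]: the `Wrapper`, `SqliteWrapper`, `CsvWrapper` and `Federation` classes are hypothetical, the two sources (a SQLite table of compounds and a CSV listing of articles) hold invented data, and the join is a simple in-memory hash join.

```python
import csv
import io
import sqlite3
from abc import ABC, abstractmethod


class Wrapper(ABC):
    """Common interface that hides how each data source is stored."""

    @abstractmethod
    def rows(self):
        """Yield rows as dicts, whatever the underlying format."""


class SqliteWrapper(Wrapper):
    def __init__(self, conn, table):
        self.conn, self.table = conn, table

    def rows(self):
        cur = self.conn.execute(f"SELECT * FROM {self.table}")
        names = [d[0] for d in cur.description]
        for record in cur:
            yield dict(zip(names, record))


class CsvWrapper(Wrapper):
    def __init__(self, text):
        self.text = text

    def rows(self):
        yield from csv.DictReader(io.StringIO(self.text))


class Federation:
    """Registers wrappers under names and joins across them."""

    def __init__(self):
        self.sources = {}

    def register(self, name, wrapper):
        self.sources[name] = wrapper

    def join(self, left, right, left_key, right_key):
        # Index the right source, then probe it with each left row.
        index = {}
        for r in self.sources[right].rows():
            index.setdefault(r[right_key], []).append(r)
        for l in self.sources[left].rows():
            for r in index.get(l[left_key], []):
                yield {**l, **r}


# Source 1: a relational table (invented data).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Compounds (name TEXT, formula TEXT)")
db.executemany("INSERT INTO Compounds VALUES (?, ?)",
               [("aspirin", "C9H8O4"), ("ibuprofen", "C13H18O2")])

# Source 2: a flat file with a different layout (invented data).
articles_csv = (
    "compound,URL\n"
    "aspirin,https://example.org/aspirin-study\n"
    "caffeine,https://example.org/caffeine-study\n"
)

fed = Federation()
fed.register("Compounds", SqliteWrapper(db, "Compounds"))
fed.register("Articles", CsvWrapper(articles_csv))

result = [(row["name"], row["URL"])
          for row in fed.join("Compounds", "Articles", "name", "compound")]
```

The caller names the sources and the join keys only. Which source is a database and which is a file is known to the wrappers alone, which is the kind of masking the federated approach relies on.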

COMPARISON TABLE
Comparison: Distributed DBMS vs. Federated DBMS

Transparency
  Distributed DBMS: Less transparent. The distributed databases must be interrelated through a communication network, and each site holds its own database, so users or applications need to know how to interact with the database system.
  Federated DBMS: Highly transparent, because it masks from the user the differences, idiosyncrasies, and implementations of the underlying data sources [2]. Users therefore do not need to be aware of location, invocation, dialect, fragmentation, etc.

Heterogeneity
  Distributed DBMS: Heterogeneity is very hard to handle if the multiple databases are not interrelated or sit on different networks.
  Federated DBMS: Can handle different hardware, network protocols, software, query languages, and data models.

Autonomy
  Distributed DBMS: Local autonomy, because each department has the authority to manage its own data.
  Federated DBMS: Does not disturb local operations; data can be moved or modified while applications and interfaces remain unchanged.

Data integration
  Distributed DBMS: Hard if the sites do not share the same network protocols, run multiple DBMSs, or are not interrelated. It also increases query cost and traffic.
  Federated DBMS: Data from different protocols and DBMSs can be integrated easily using wrappers.

Database access
  Distributed DBMS: Accessed through adapters such as ODBC, JDBC, etc. Each adapter may differ by database system: Oracle uses Oracleadapter, SQL uses SQLadapter, and Access uses OLEadapter. Each programming language has its own embedded SQL.
  Federated DBMS: Uses Xperanto as a middleware layer to access any DBMS with a simple programming model. Applications can push XML as standard SQL statements for various query executions.

Other features
  Distributed DBMS: Economic; reflects organizational structure.
  Federated DBMS: A high degree of function, extensibility and openness of the federation, optimized performance.

CONCLUSION/POSITION
- The disadvantages of a distributed DBMS are complexity, cost, and the difficulty of maintaining data integration and database access [3].
- A federated database system provides transparency, autonomy, optimized performance, accessibility, and a standard query interface across multiple DBMSs.
- It is an efficient way to integrate multiple DBMSs when enterprises merge or use different DBMSs, and it provides efficient data sharing and processing throughout the enterprise.

REFERENCES
[1] I. Wijegunaratne, G. Fernandez, J. Valtoudis, “A Federated Architecture for Enterprise Data Integration”, 2000 Australian Software Engineering Conference. Retrieved September 12.
[2] Laura Haas, Eileen Lin, “IBM Federated Database Technology”, IBM, 2002. Retrieved September 10, 2007.
[3] M. Özsu and P. Valduriez, Principles of Distributed Database Systems, 2nd edition (1st edition 1991), New Jersey, Prentice-Hall.
[4] F.A. Baião, M. Mattoso, G. Zaverucha, “Towards an Inductive Design of Distributed Object Oriented Databases”, Proceedings of the 3rd IFCIS International Conference on Cooperative Information Systems, August. Retrieved September 28, 2007.
[5] F. Baião, M. Mattoso, G. Zaverucha, “An Algorithm for the Design of Distributed Object Databases”, PowerPoint. Retrieved September 14 from db.cs.wisc.edu/dbseminar/spring00/talks/fernanda_slides.pdf
[6] L.M. Haas, E.T. Lin, M.A. Roth, “Data integration through database federation”, IBM Systems Journal, Volume 41, Issue 4. Retrieved October 1, 2007.

QUESTION?