Distributed Database Systems Dr. Mohamed Osman Hegazi.


Definitions
Distributed database (DDB): a collection of multiple logically interrelated databases distributed over a computer network.
Distributed database management system (DDBMS): the software that permits the management of the DDB and makes the distribution transparent to the users.
Distributed database system (DDBS) = DDB + DDBMS
The two important terms in these definitions are:
- Logically interrelated (the application)
- Distributed over a network

Motivation for Distributed Databases
1. The development of computer networks promotes decentralization.
2. In a company, the database organization might reflect the organizational structure, which is distributed into units. Each unit maintains its own database.
3. Sharing of data can be achieved by developing a distributed database system which:
   - Makes data accessible to all units
   - Stores data close to where it is most frequently used

DDBMS Advantages
- Data are located near the "greatest demand" site
- Faster data access
- Faster data processing
- Growth facilitation
- Improved communications
- Reduced operating costs
- User-friendly interface
- Less danger of a single-point failure
- Processor independence

DDBMS Disadvantages
- Complexity of management and control
- Security
- Lack of standards
- Increased storage requirements
- Greater difficulty in managing the data environment
- Increased training cost

The concept of DDB
A DDBS is not merely a collection of files stored individually at each node of a computer network. To form a DDBS, the files must not only be logically related; there must also be structure among them, and access must go through a common interface.

Distributed Database Management Systems

An Example
EMP(ENO, ENAME, TITLE)
ASG(ENO, PNO, DUR, RESP)
PROJ(PNO, PNAME, BUDGET)
PAY(TITLE, SAL)

Distributed Query
If these tables are stored in one place, we can use, for example, the following query to get the name and salary of every employee who has worked on an assignment for more than 12 months:
    SELECT ENAME, SAL
    FROM EMP, ASG, PAY
    WHERE ASG.DUR > 12
      AND EMP.ENO = ASG.ENO
      AND PAY.TITLE = EMP.TITLE
If the tables are distributed over different sites, executing this query requires a lot of additional processing. The DDBMS does this processing and lets each end user feel like the only user of a single database (transparency).
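To make the extra processing concrete, here is a minimal Python sketch in which three separate in-memory SQLite databases stand in for three sites. The placement (one relation per site), the sample rows and the helper names `make_site` and `employees_over_12_months` are all assumptions for illustration; only the schema and the query come from the slides. A real DDBMS would choose the execution plan itself and hide these steps from the user.

```python
import sqlite3

def make_site(ddl, insert_sql, rows):
    """Create one in-memory database that stands in for a site (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn

# Assumed placement and made-up sample rows: one relation per site.
site_emp = make_site("CREATE TABLE EMP (ENO TEXT, ENAME TEXT, TITLE TEXT)",
                     "INSERT INTO EMP VALUES (?, ?, ?)",
                     [("E1", "Ann", "Engineer"), ("E2", "Bob", "Analyst")])
site_asg = make_site("CREATE TABLE ASG (ENO TEXT, PNO TEXT, DUR INTEGER, RESP TEXT)",
                     "INSERT INTO ASG VALUES (?, ?, ?, ?)",
                     [("E1", "P1", 18, "Manager"), ("E2", "P1", 6, "Analyst")])
site_pay = make_site("CREATE TABLE PAY (TITLE TEXT, SAL INTEGER)",
                     "INSERT INTO PAY VALUES (?, ?)",
                     [("Engineer", 40000), ("Analyst", 34000)])

def employees_over_12_months():
    """Evaluate the slide's query by hand, one site at a time."""
    # Step 1 (ASG site): apply the selection DUR > 12 locally and ship only ENO values.
    enos = [row[0] for row in
            site_asg.execute("SELECT ENO FROM ASG WHERE DUR > 12")]
    result = []
    for eno in enos:
        # Step 2 (EMP site): look up name and title for each shipped ENO.
        for ename, title in site_emp.execute(
                "SELECT ENAME, TITLE FROM EMP WHERE ENO = ?", (eno,)):
            # Step 3 (PAY site): look up the salary for that title.
            for (sal,) in site_pay.execute(
                    "SELECT SAL FROM PAY WHERE TITLE = ?", (title,)):
                result.append((ename, sal))
    return result

print(employees_over_12_months())
```

Filtering at the ASG site first keeps the amount of shipped data small; this is one possible plan, not the only one.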

Distributed Database Transparency
The concept of a DDB is to fragment the data and store each fragment at its own site. Data may also be copied to several sites (replication). The DDBMS hides these details and makes the distribution transparent to the users.
Distributed database transparency features:
- Distribution transparency
- Transaction transparency
- Failure transparency
- Performance transparency
- Heterogeneity transparency

Distributed DB Design
Top-down approach: we have a database; how do we split it and allocate it to individual sites?
Two issues in top-down design:
- Fragmentation
- Allocation
Multi-databases (or bottom-up): we combine existing databases; how do we deal with heterogeneity and autonomy?

Fragmentation
- Horizontal
  - Primary: depends on local attributes
  - Derived: depends on a foreign relation
- Vertical

Example
Employee relation E(#, name, loc, sal, …)
40% of queries: Qa: select * from E where loc = Sa and …
40% of queries: Qb: select * from E where loc = Sb and …
Motivation: two sites, Sa and Sb; Qa → Sa, Qb → Sb

E
#   Name   Loc  Sal
5   Joe    Sa   10
7   Sally  Sb   25
8   Tom    Sa   15

F = {F1, F2}

At Sa: F1 = σ_{loc=Sa}(E)
#   Name   Loc  Sal
5   Joe    Sa   10
8   Tom    Sa   15

At Sb: F2 = σ_{loc=Sb}(E)
#   Name   Loc  Sal
7   Sally  Sb   25

⇒ primary horizontal fragmentation
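The selection-based split can be sketched in a few lines of Python. The `select` helper is a hypothetical name for this illustration, and the tuples are the three rows of E from the slide. The final checks (every tuple lands in some fragment, and no tuple lands in two) are the usual sanity conditions for a horizontal fragmentation rather than something stated on the slide.

```python
# Rows of E(#, Name, Loc, Sal) as they appear on the slide.
E = [
    (5, "Joe", "Sa", 10),
    (7, "Sally", "Sb", 25),
    (8, "Tom", "Sa", 15),
]

def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

# Primary horizontal fragmentation on the local attribute Loc.
F1 = select(E, lambda t: t[2] == "Sa")   # stored at site Sa
F2 = select(E, lambda t: t[2] == "Sb")   # stored at site Sb

# Sanity checks: the union of the fragments rebuilds E, and the fragments are disjoint.
assert sorted(F1 + F2) == sorted(E)
assert not set(F1) & set(F2)

print(F1)
print(F2)
```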

[Figure: candidate fragmentations F1, F2 and F3 of E; one of them splits E by both location and salary]
- loc = SA ∧ sal < 10
- loc = SA ∧ sal ≥ 10
- loc = SB ∧ sal < 10
- loc = SB ∧ sal ≥ 10
Qa: select … loc = SA …
Qb: select … loc = SB …
Prefer F2 to F1 and F3.

Horizontal fragmentation: peer-to-peer relationship (brothers)

Vertical fragmentation
[Figure: E split by columns into E1 and E2]
Example: R[T] ⇒ R1[T1], R2[T2], …, Rn[Tn], where Ti ⊆ T
Just like normalization of relations.

Vertical Fragmentation example

PROJ
PNO  PNAME              BUDGET  LOC
P1   Instrumentation    150000  Montreal
P2   Database Develop.  …       New York
P3   CAD/CAM            …       New York
P4   Maintenance        310000  Paris
P5   CAD/CAM            500000  Boston

PROJ1: information about project budgets
PNO  BUDGET
P1   150000
P2   …
P3   …
P4   310000
P5   500000

PROJ2: information about project names and locations
PNO  PNAME              LOC
P1   Instrumentation    Montreal
P2   Database Develop.  New York
P3   CAD/CAM            New York
P4   Maintenance        Paris
P5   CAD/CAM            Boston
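As a rough illustration, the PROJ1/PROJ2 split is a pair of projections that both keep the key PNO, and the original relation comes back by joining the fragments on that key. The sketch below uses only the three PROJ rows whose budgets are legible in this transcript; `project` and `join_on` are hypothetical helper names.

```python
# PROJ rows whose budgets are legible in the transcript (P1, P4, P5).
PROJ = [
    {"PNO": "P1", "PNAME": "Instrumentation", "BUDGET": 150000, "LOC": "Montreal"},
    {"PNO": "P4", "PNAME": "Maintenance", "BUDGET": 310000, "LOC": "Paris"},
    {"PNO": "P5", "PNAME": "CAD/CAM", "BUDGET": 500000, "LOC": "Boston"},
]

def project(relation, attrs):
    """Relational projection: keep only the listed attributes of each row."""
    return [{a: row[a] for a in attrs} for row in relation]

def join_on(left, right, key):
    """Equi-join of two fragments on a shared key attribute."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

# Each vertical fragment repeats the key PNO so the relation can be rebuilt.
PROJ1 = project(PROJ, ["PNO", "BUDGET"])          # project budgets
PROJ2 = project(PROJ, ["PNO", "PNAME", "LOC"])    # project names and locations

rebuilt = join_on(PROJ1, PROJ2, "PNO")
assert rebuilt == PROJ   # joining the fragments reconstructs PROJ
```

Repeating the key in every fragment is what makes the reconstruction lossless; a fragment without PNO could not be matched back to its rows.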

Grouping Attributes
Example: E(#, NM, LOC, SAL)
- E1(#, NM, LOC), E2(#, SAL)
- E1(#, NM), E2(#, LOC), E3(#, SAL)
Which is the right vertical fragmentation?

Vertical fragmentation: branch relationship (parents and son)

Hybrid Fragmentation
[Figure: R is split by horizontal fragmentation (HF) into R1 and R2; vertical fragmentation (VF) then yields R11, R12 and R21, R22, R23]

Allocation
Example: E ⇒ F1 = σ_{loc=Sa}(E); F2 = σ_{loc=Sb}(E)
[Figure: copies of the fragments F1, F1 and F2 of E placed among Site a, Site b and Site c]
Do we replicate fragments?
Where do we place each copy of each fragment?

Allocation Alternatives
Non-replicated
- partitioned: each fragment resides at only one site
Replicated
- fully replicated: each fragment at each site
- partially replicated: each fragment at some of the sites
Rule: if (read-only queries / update queries) ≥ 1, replication is advantageous; otherwise replication may cause problems.

Optimization problem
What is the best placement of fragments and/or the best number of copies to:
- minimize query response time
- maximize throughput
- minimize "some cost"
- ...
Subject to constraints:
- Available storage
- Available bandwidth, processing power, …
- Keep 90% of response times below X
- ...
Very hard problem
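For a toy instance the problem can simply be enumerated. The sketch below assumes a made-up workload, a made-up cost model (local reads are cheap, remote reads are expensive, and every update must be applied to every copy) and a storage limit of one fragment copy per site; none of these numbers come from the slides. It tries every way of placing each fragment on a non-empty set of sites and keeps the cheapest feasible allocation.

```python
from itertools import combinations, product

SITES = ["a", "b", "c"]
FRAGMENTS = ["F1", "F2"]

# Hypothetical workload: (reads, updates) issued at each site against each fragment.
WORKLOAD = {
    "F1": {"a": (40, 5), "b": (2, 1), "c": (10, 0)},
    "F2": {"a": (1, 0), "b": (40, 10), "c": (10, 0)},
}
LOCAL_READ = 1        # assumed cost of reading a local copy
REMOTE_READ = 10      # assumed cost of reading when the site holds no copy
UPDATE_PER_COPY = 5   # assumed cost of applying one update to one copy
STORAGE_LIMIT = 1     # assumed constraint: at most one fragment copy per site

def placements():
    """Yield every non-empty subset of sites a fragment could be stored on."""
    for k in range(1, len(SITES) + 1):
        yield from combinations(SITES, k)

def cost(allocation):
    """Total read cost plus update cost of an allocation {fragment: sites}."""
    total = 0
    for frag, copies in allocation.items():
        for site, (reads, updates) in WORKLOAD[frag].items():
            total += reads * (LOCAL_READ if site in copies else REMOTE_READ)
            total += updates * UPDATE_PER_COPY * len(copies)
    return total

def feasible(allocation):
    """Check the storage constraint at every site."""
    return all(
        sum(site in copies for copies in allocation.values()) <= STORAGE_LIMIT
        for site in SITES
    )

def best_allocation():
    """Exhaustively search all allocations; only workable for tiny instances."""
    options = list(placements())
    candidates = (dict(zip(FRAGMENTS, choice))
                  for choice in product(options, repeat=len(FRAGMENTS)))
    return min((a for a in candidates if feasible(a)), key=cost)

best = best_allocation()
print(best, cost(best))
```

The search space grows as (2^sites - 1)^fragments, which is one way to see why the slide calls this a very hard problem; practical allocation methods rely on heuristics instead of enumeration.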

Replication
Replication stores copies of the same data at more than one location (site). These copies must be updated consistently, despite their distance from each other.
The updating of the copies is controlled by one of two techniques:
- Lazy replication: the other copies are updated after work on one of the copies (the master copy) has completed. The update is therefore propagated outside the boundaries of the transaction.
- Eager replication: the replicated data are updated within the transaction boundaries, while the transaction is working on one of the copies.
  - Central update (primary copy): the primary copy is updated first and then the secondary copies. Because every update goes through the primary copy, updates do not have to be synchronized across sites, which makes consistency easier to control, but the primary copy may become a bottleneck.
  - Update everywhere: an update may be applied at any copy, so all copies have an equal opportunity to be updated.
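The difference between the two timings can be sketched with a small in-memory model. The class `ReplicatedItem` and its method names are assumptions for illustration, and no real transactions or networking are involved: the eager update writes every copy before it returns, while the lazy update writes only the primary copy and leaves the others to a later `propagate` step.

```python
class ReplicatedItem:
    """One logical data item with a primary copy and secondary copies (toy model)."""

    def __init__(self, sites, primary, value):
        self.copies = {site: value for site in sites}
        self.primary = primary
        self.pending = []   # updates applied at the primary but not yet propagated

    def eager_update(self, value):
        # Eager: every copy is written inside the "transaction", primary first.
        self.copies[self.primary] = value
        for site in self.copies:
            if site != self.primary:
                self.copies[site] = value

    def lazy_update(self, value):
        # Lazy: only the primary (master) copy is written inside the "transaction".
        self.copies[self.primary] = value
        self.pending.append(value)

    def propagate(self):
        # Runs after the transaction has finished: bring the secondaries up to date.
        for value in self.pending:
            for site in self.copies:
                if site != self.primary:
                    self.copies[site] = value
        self.pending.clear()

    def consistent(self):
        """True when all copies hold the same value."""
        return len(set(self.copies.values())) == 1


item = ReplicatedItem(["a", "b", "c"], primary="a", value=0)

item.eager_update(1)
print(item.consistent())   # copies agree as soon as the update returns

item.lazy_update(2)
print(item.consistent())   # secondaries still hold the old value

item.propagate()
print(item.consistent())   # copies agree again after propagation
```

The window between `lazy_update` and `propagate` is where readers of a secondary copy can see stale data, which is the price lazy replication pays for shorter transactions.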