Enterprise Systems Distributed databases and systems - DT211 4 1.

Concepts
Distributed Database: a logically interrelated collection of shared data (and a description of this data), physically distributed over a computer network.
Distributed DBMS: the software system that permits the management of the distributed database and makes the distribution transparent to users.

Concepts
– Collection of logically related shared data.
– Data split into fragments.
– Fragments may be replicated.
– Fragments/replicas allocated to sites.
– Sites linked by a communications network.
– Each DBMS participates in at least one global application.

Advantages of DDBMSs
– Reflects organizational structure
– Improved shareability and local autonomy
– Improved availability
– Improved reliability
– Improved performance

Disadvantages of DDBMSs
– Complexity
– Cost
– Security
– Integrity control more difficult
– Database design more complex

Types of DDBMS
Homogeneous DDBMS
– All sites use the same DBMS product.
Heterogeneous DDBMS
– Sites may run different DBMS products, with possibly different underlying data models.
– Occurs when sites have implemented their own databases and integration is considered later (ad hoc planning).
– Enterprise resource planning (ERP) is a newer approach that attempts to overcome this problem.

Functions of a DDBMS
A DDBMS is expected to have at least the functionality of a DBMS. It must also provide:
– Distributed query processing.
– Extended concurrency control.
– Extended recovery services.

Distributed Database Design
Three key issues:
Fragmentation: a relation may be divided into a number of sub-relations, which are then distributed.
Allocation: each fragment is stored at the site with "optimal" distribution (see principles of distribution design).
Replication: a copy of a fragment may be maintained at several sites.

Fragmentation
Quantitative information (used for replication) may include:
– frequency with which an application is run;
– site from which an application is run;
– performance criteria for transactions and applications.
Qualitative information (used for fragmentation) may include the transactions that are executed by the application and the relations, attributes and tuples they access.

Comparison of Strategies for Data Distribution

Correctness of Fragmentation
Three correctness rules:
Completeness: if relation R is decomposed into fragments R1, R2, ..., Rn, each data item that can be found in R must appear in at least one fragment.
Reconstruction: it must be possible to define a relational operation that will reconstruct R from the fragments. Reconstruction uses the Union operation for horizontal fragmentation and the Join operation for vertical fragmentation.
Disjointness: if data item di appears in fragment Ri, then it should not appear in any other fragment. Exception: vertical fragmentation, where primary key attributes must be repeated to allow reconstruction.
For horizontal fragmentation, a data item is a tuple (row). For vertical fragmentation, a data item is an attribute.
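The three rules can be checked mechanically for a horizontal fragmentation. The following is a minimal sketch, assuming a relation is modelled as a Python set of row tuples. The function names and the sample relation R(propNo, type) are hypothetical and are not taken from the slides.

```python
# Check the three correctness rules for a horizontal fragmentation.
# A relation is modelled as a set of rows; each row is a tuple.

def is_complete(relation, fragments):
    """Completeness: every row of the relation appears in at least one fragment."""
    return all(any(row in f for f in fragments) for row in relation)


def is_reconstructible(relation, fragments):
    """Reconstruction: the union of the fragments gives back exactly the relation."""
    return set().union(*fragments) == relation


def is_disjoint(fragments):
    """Disjointness: no row appears in more than one fragment."""
    seen = set()
    for fragment in fragments:
        if seen & fragment:
            return False
        seen |= fragment
    return True


# Hypothetical relation R(propNo, type), split by property type.
R = {("PA14", "House"), ("PL94", "Flat"), ("PG4", "Flat")}
R1 = {row for row in R if row[1] == "House"}
R2 = {row for row in R if row[1] == "Flat"}

print(is_complete(R, [R1, R2]), is_reconstructible(R, [R1, R2]), is_disjoint([R1, R2]))
```

For vertical fragmentation the disjointness check would need to ignore the primary key attributes, as the exception above states.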

Horizontal Fragmentation
Consists of a subset of the tuples of a relation.
Defined using the Selection operation of relational algebra: σ_p(R)
For example:
P1 = σ_type='House'(PropertyForRent)
P2 = σ_type='Flat'(PropertyForRent)
Result (PNo., St, City, postcode, type, room, rent, ownerno., staffno., branchno.)
This strategy is determined by looking at the predicates used by transactions.
Reconstruction uses a union, e.g. R = r1 ∪ r2
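As an illustration of the selection and union steps, here is a small sketch in Python. The rows, the reduced attribute list and the helper name select are made up for the example; only the relation name and the two predicates on type come from the slide.

```python
def select(relation, predicate):
    """Relational selection: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Hypothetical sample rows for PropertyForRent (attribute list shortened).
property_for_rent = [
    {"propertyNo": "PA14", "city": "Aberdeen", "type": "House", "rent": 650},
    {"propertyNo": "PL94", "city": "London", "type": "Flat", "rent": 400},
    {"propertyNo": "PG4", "city": "Glasgow", "type": "Flat", "rent": 350},
]

# P1 and P2 are the two horizontal fragments from the slide.
p1 = select(property_for_rent, lambda row: row["type"] == "House")
p2 = select(property_for_rent, lambda row: row["type"] == "Flat")

# Reconstruction by union. Plain concatenation is enough here because
# the two predicates are mutually exclusive, so the fragments are disjoint.
rebuilt = p1 + p2


def by_key(row):
    return row["propertyNo"]


print(sorted(rebuilt, key=by_key) == sorted(property_for_rent, key=by_key))
```

Each fragment keeps the full set of attributes; only the rows differ between sites.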

Vertical Fragmentation
Consists of a subset of the attributes of a relation.
Defined using the Projection operation of relational algebra: π_a1,...,an(R)
For example:
S1 = π_staffNo, position, sex, DOB, salary(Staff)
S2 = π_staffNo, fName, lName, branchNo(Staff)
Determined by establishing the affinity of one attribute to another.
For vertical fragments, reconstruction involves the Join operation. The fragments are disjoint except for the primary key.
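The projection and join steps might look like the sketch below. The attribute lists for S1 and S2 follow the slide; the sample staff rows and the helper names project and join_on_key are assumptions made for illustration.

```python
def project(relation, attributes):
    """Relational projection: keep only the named attributes of each row."""
    return [{a: row[a] for a in attributes} for row in relation]


def join_on_key(left, right, key):
    """Join two fragments on a shared key attribute (here the primary key)."""
    right_by_key = {row[key]: row for row in right}
    return [
        {**l_row, **right_by_key[l_row[key]]}
        for l_row in left
        if l_row[key] in right_by_key
    ]


# Hypothetical sample rows for Staff.
staff = [
    {"staffNo": "SL21", "fName": "John", "lName": "White", "position": "Manager",
     "sex": "M", "DOB": "1945-10-01", "salary": 30000, "branchNo": "B005"},
    {"staffNo": "SG37", "fName": "Ann", "lName": "Beech", "position": "Assistant",
     "sex": "F", "DOB": "1960-11-10", "salary": 12000, "branchNo": "B003"},
]

# Both fragments repeat staffNo so that the relation can be rebuilt.
s1 = project(staff, ["staffNo", "position", "sex", "DOB", "salary"])
s2 = project(staff, ["staffNo", "fName", "lName", "branchNo"])

rebuilt = join_on_key(s1, s2, "staffNo")
print(rebuilt == staff)
```

Without the repeated primary key there would be no way to match a row of S1 with its counterpart in S2.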

Mixed Fragmentation
Consists of a horizontal fragment that is vertically fragmented, or a vertical fragment that is horizontally fragmented.
Defined using the Selection and Projection operations of relational algebra:
σ_p(π_a1,...,an(R)) or π_a1,...,an(σ_p(R))
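A short sketch of the two orderings, using hypothetical Staff rows and helper names. It assumes that the attribute tested by the predicate (branchNo) is kept by the projection; otherwise selecting after projecting would not be possible.

```python
def select(relation, predicate):
    """Relational selection: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Relational projection: keep only the named attributes of each row."""
    return [{a: row[a] for a in attributes} for row in relation]


# Hypothetical sample rows for Staff.
staff = [
    {"staffNo": "SL21", "position": "Manager", "salary": 30000, "branchNo": "B005"},
    {"staffNo": "SG37", "position": "Assistant", "salary": 12000, "branchNo": "B003"},
    {"staffNo": "SG14", "position": "Supervisor", "salary": 18000, "branchNo": "B003"},
]

attrs = ["staffNo", "position", "branchNo"]


def at_b003(row):
    return row["branchNo"] == "B003"


# Horizontal fragment that is then vertically fragmented.
select_then_project = project(select(staff, at_b003), attrs)
# Vertical fragment that is then horizontally fragmented.
project_then_select = select(project(staff, attrs), at_b003)

print(select_then_project == project_then_select)
```

The projection here keeps the primary key staffNo, so no duplicate rows arise and the list-based helpers behave like their set-based relational counterparts.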

Transparencies in a DDBMS
Distribution Transparency
– Fragmentation Transparency
– Location Transparency
– Replication Transparency
Transaction Transparency
– Concurrency Transparency
– Failure Transparency

Concurrency Transparency
All transactions must execute independently and be logically consistent with the results obtained if the transactions were executed one at a time, in some arbitrary serial order.
Same fundamental principles as for a centralized DBMS.
Replication makes concurrency more complex.
– If a copy of a replicated data item is updated, the update must be propagated to all copies.
– However, if one site holding a copy is not reachable, then the transaction is delayed until the site is reachable.

Failure Transparency
The DDBMS must ensure atomicity and durability of the global transaction.
This means ensuring that the sub-transactions of the global transaction either all commit or all abort.
Thus, the DDBMS must synchronize the global transaction to ensure that all sub-transactions have completed successfully before recording a final COMMIT for the global transaction.
It must do this in the presence of site and network failures.
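The all-commit-or-all-abort requirement can be sketched as a simple vote followed by a decision. The slides do not name a protocol, so the two-step structure below (in the style of two-phase commit) is an assumption, and the Site class and its methods are hypothetical. Logging and recovery, which a real DDBMS needs for durability, are left out.

```python
class Site:
    """Hypothetical participant holding one sub-transaction."""

    def __init__(self, name, can_commit=True, reachable=True):
        self.name = name
        self.can_commit = can_commit
        self.reachable = reachable
        self.state = "active"

    def prepare(self):
        # Vote yes only if the local sub-transaction completed successfully.
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def run_global_transaction(sites):
    """Record COMMIT only if every site votes yes; otherwise abort everywhere."""
    try:
        all_ready = all(site.prepare() for site in sites)
    except ConnectionError:
        # A site or network failure during voting counts as a "no" vote.
        all_ready = False

    decision = "COMMIT" if all_ready else "ABORT"
    for site in sites:
        # Simplification: an unreachable site would learn the decision on recovery.
        if decision == "COMMIT":
            site.commit()
        else:
            site.abort()
    return decision


print(run_global_transaction([Site("London"), Site("Glasgow")]))
print(run_global_transaction([Site("London"), Site("Glasgow", reachable=False)]))
```

The key point is that no site commits until every site has reported success, so a single failure leads to a global abort.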

Performance Transparency
Must consider fragmentation, replication, and allocation schemas.
The distributed query processor (DQP) has to decide, for example:
– which fragment to access;
– which copy of a fragment to use;
– which location to use.

Performance Transparency
The DQP produces an execution strategy optimized with respect to some cost function.
Typically, costs associated with a distributed request include:
– I/O cost;
– Communication cost: WAN…

Performance Transparency - Example
Property(propNo, city) 10000 records in London
Client(clientNo, maxPrice) records in Glasgow
Viewing(propNo, clientNo) records in London
SELECT p.propNo
FROM Property p INNER JOIN (Client c INNER JOIN Viewing v ON c.clientNo = v.clientNo) ON p.propNo = v.propNo
WHERE p.city = 'Aberdeen' AND c.maxPrice > ;
This query selects properties in Aberdeen that have been viewed by clients whose maximum price is greater than £200,

Performance Transparency - Example
Assume:
– Each tuple in each relation is 100 characters long.
– 10 renters with maximum price greater than £200, viewings for properties in Aberdeen.
– The data transmission rate is 10,000 characters per second, and there is a 1 second access delay to send a message.
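With these assumptions, the communication time for shipping tuples between sites can be estimated. The sketch below assumes a simple cost model that the slides do not spell out: one access delay per message plus the volume of data divided by the transmission rate. The function name transfer_time is hypothetical. Only figures that appear on the slides are used (100-character tuples, 10,000 characters per second, 1 second delay, 10,000 Property records, 10 qualifying renters).

```python
TUPLE_SIZE = 100       # characters per tuple (from the slide)
RATE = 10_000          # characters per second (from the slide)
ACCESS_DELAY = 1.0     # seconds per message (from the slide)


def transfer_time(num_tuples, num_messages=1):
    """Estimated seconds to ship num_tuples between sites.

    Assumed cost model: one access delay per message, plus data volume / rate.
    """
    return num_messages * ACCESS_DELAY + num_tuples * TUPLE_SIZE / RATE


# Shipping the whole Property relation (10,000 tuples) in one message.
whole_property = transfer_time(10_000)

# Shipping only the 10 qualifying Client tuples in one message.
qualifying_clients = transfer_time(10)

print(whole_property, qualifying_clients)
```

Under this model, moving the 10 selected tuples takes about a second, while moving a whole 10,000-tuple relation takes over a minute and a half, which is why the choice of what to ship matters so much to the DQP.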

Performance Transparency - Example
Derive the following:

Parallel Data Management
The argument goes:
– If your main problem is that your queries run too slowly, use more than one machine at a time to make them run faster (parallel processing).
SMP (symmetric multiprocessing): all the processors share the same memory, and the O.S. runs and schedules tasks on more than one processor without distinction.
– In other words, all processors are treated equally in an effort to get the list of jobs done.
– However, SMP can suffer from bottleneck problems when all the CPUs attempt to access the same memory at once.
MPP (massively parallel processing): more varied in its design, but essentially consists of multiple processors, each running its own program on its own memory, i.e. memory is not shared between processors.
– The problem with MPP is to harness all these processors to solve a single problem.
– However, MPP systems do not suffer from the shared-memory bottleneck.

There are two possible approaches to dividing up the data: Static and Dynamic Partitioning.
– In Static Partitioning you break up the data into a number of sections. Each section is placed on a different processor with its own data storage and memory. The query is then run on each of the processors, and the results are combined at the end to give the entire picture. This is like joining a queue in a supermarket: you stay with it until you reach the check-out.
– The main problem with Static Partitioning is that you can't tell in advance how much processing the various sections need. If most of the relevant data is processed by one processor, you could end up waiting almost as long as if you didn't use parallel processing at all.
– In Dynamic Partitioning the data is stored in one place, and the data server takes care of splitting the query into multiple tasks, which are allocated to processors as they become available. This is like the single queue in a bank: as a counter position becomes free, the person at the head of the queue takes that position.
– With Dynamic Partitioning the performance improvement can be dramatic, but the partitioning is out of the user's hands.
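The supermarket and bank queue analogy can be made concrete with a small simulation. This is a sketch under simplifying assumptions: each task has a known processing cost, the cost values are invented, and the function names are hypothetical. It compares the time until the last processor finishes under each scheme.

```python
import heapq


def static_finish_time(sections):
    """Static: each processor works through its own pre-assigned section.

    The query is done only when the slowest processor finishes.
    """
    return max(sum(section) for section in sections)


def dynamic_finish_time(task_costs, num_processors):
    """Dynamic: tasks wait in one queue and go to whichever processor frees up first."""
    free_at = [0] * num_processors  # time at which each processor becomes free
    heapq.heapify(free_at)
    for cost in task_costs:
        start = heapq.heappop(free_at)          # earliest available processor
        heapq.heappush(free_at, start + cost)   # busy until the task completes
    return max(free_at)


# Hypothetical task costs: most of the work sits in the first half of the data.
tasks = [8, 7, 1, 1, 1, 1, 1, 1]

static_time = static_finish_time([tasks[:4], tasks[4:]])
dynamic_time = dynamic_finish_time(tasks, num_processors=2)

print(static_time, dynamic_time)
```

With this skewed workload the static split leaves one processor idle for most of the run, whereas the single shared queue keeps both processors busy.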

Sample exam-type question
Fragmentation, replication and allocation are three important characteristics of distributed database design. Discuss their importance in relation to distributed databases.