Properties of Data Replication


Properties of Data Replication: Replication can increase read performance. Replication can be used to integrate heterogeneous systems that use different databases. Replication can be used for both failure and disaster recovery. The costs of using replication are decreased update performance and the cost of managing consistency problems. (Relaxed ACID properties may be used in order to minimize inconsistency.)

Cases where replication designs and extended transactions are recommended: ERP systems, e-commerce, electronic health records (Danish: Elektronisk patientjournal), logistics, airline reservation systems, CSCW systems, distributed calendar systems, mobile integrated database applications, supply chain management, banking systems, and library systems.

Overview of different types of replication designs: Suppose all replication designs have n physical copies of the same logical table. In the n-safe replication design, all n copies are consistent and up to date. In the quorum-safe design, a quorum of the n copies are consistent and up to date. In the 1-safe design, only one of the n copies is consistent and up to date. In the 0-safe design, none of the copies are consistent and up to date. The inconsistencies can be managed by using countermeasures.
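
The overview above can be sketched as a small classifier. This is only an illustration, not part of the slides: the function name is hypothetical, and treating a quorum as a strict majority of the n copies is an assumption, since the slide does not define the quorum size.

```python
def classify_replication_design(n: int, up_to_date: int) -> str:
    """Name the replication design, given n physical copies of a logical
    table and the number of copies that are consistent and up to date."""
    if n < 2:
        raise ValueError("replication needs at least two copies")
    if not 0 <= up_to_date <= n:
        raise ValueError("up_to_date must be between 0 and n")
    if up_to_date == n:
        return "n-safe"
    if up_to_date == 0:
        return "0-safe"
    if up_to_date == 1:
        return "1-safe"
    # Assumption: a quorum is a strict majority of the copies.
    if up_to_date > n // 2:
        return "quorum-safe"
    return f"x-safe with x={up_to_date}"


for copies in (3, 2, 1, 0):
    print(copies, classify_replication_design(3, copies))
```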

Rules for using Data Replication in Distributed Databases with Relaxed ACID Properties
1. Use replicated data only when it is necessary for availability or economic reasons.
2. The 0-safe design with local commit is recommended when it is important to update in disconnected mode and it is possible to implement sufficient local countermeasures against the isolation anomalies.
3. The 0-safe design with deferred commit is recommended when it is important to update in disconnected mode and it is not possible to implement sufficient local countermeasures against the isolation anomalies.
4. The 1-safe designs are recommended in situations when real time update is not important in disconnected mode. Therefore, updates must wait until the primary copy location has committed the updates. Note that the basic 1-safe design is the cheapest replication method.

Design Rules for Replicating Data in Distributed Databases with Relaxed ACID Properties
5. The quorum-safe design is not recommended, as it is only used in DDBMSs (Distributed Database Management Systems).
6. The n-safe design, and especially the 2-safe design, is only recommended in practice when it is implemented in hardware or managed by the operating system. DDBMSs can also manage the n-safe designs, but DDBMSs are not used in practice as they are too complex.
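
Rules 1 to 4 can be read as a decision procedure. The sketch below is one possible encoding under that reading; the function and parameter names are hypothetical, and rules 5 and 6 (quorum-safe and n-safe) are deliberately left out because they depend on the platform rather than on the data.

```python
def recommend_design(replication_needed: bool,
                     must_update_disconnected: bool,
                     local_countermeasures_sufficient: bool) -> str:
    """Return the replication design suggested by rules 1 to 4."""
    # Rule 1: replicate only when availability or economics require it.
    if not replication_needed:
        return "no replication"
    if must_update_disconnected:
        # Rule 2: local countermeasures can handle the isolation anomalies.
        if local_countermeasures_sufficient:
            return "0-safe with local commit"
        # Rule 3: they cannot, so the commit is deferred.
        return "0-safe with deferred commit"
    # Rule 4: updates may wait for the primary copy location.
    return "1-safe"


print(recommend_design(True, True, False))
```

Encoding the rules as nested conditions keeps the priority explicit: the need for disconnected updates is checked before the cheaper 1-safe design is considered.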

The Basic 1-safe Design: The 1-safe designs are recommended in situations when real time update is not important in disconnected mode. Therefore, updates may wait until the primary copy location has committed the updates. The basic 1-safe design is the cheapest replication method.

Suppose the ERP system and the B2B e-commerce system are heterogeneous. Which ERP tables would you recommend replicating by using the basic 1-safe design?

Replication Example: The basic 1-safe design is recommended when it causes no major problems if updates are delayed for hours or even days. In the “Danish medicine card” system, all hospitals and private physicians must transfer their medicine prescriptions to a central database by using the 0-safe design with local commit. What replication design would you recommend for the Medicine types? Entities in the figure: Medicine prescriptions, Medicine types.

The basic 1-safe design and the 1-safe design with commutative updates: In the basic 1-safe replication design, lost transactions may occur when the secondary location takes over after a primary copy failure. Why can lost transactions not occur in the 1-safe design with commutative updates?
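
As a hint towards the question, the sketch below illustrates what commutativity means for updates. It is an illustration only, with hypothetical function names and made-up numbers: increments and decrements give the same final value in any order, so an update that arrives late can still be applied, whereas overwriting updates depend on their order.

```python
from itertools import permutations


def apply_deltas(balance, deltas):
    """Apply commutative updates (additions and subtractions) in the given order."""
    for delta in deltas:
        balance += delta
    return balance


def apply_overwrites(value, new_values):
    """Apply non-commutative updates (plain overwrites) in the given order."""
    for new_value in new_values:
        value = new_value
    return value


deltas = [50, -20, 5]
# Every ordering of the commutative updates ends in the same state.
delta_results = {apply_deltas(100, order) for order in permutations(deltas)}

overwrites = [150, 130, 135]
# The final state of the overwrites depends on which update came last.
overwrite_results = {apply_overwrites(100, order) for order in permutations(overwrites)}

print(delta_results, overwrite_results)
```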

The Basic 1-safe Design. Example: The basic 1-safe design is recommended when it causes no major problems if updates are delayed for hours or even days. Which of the 1-safe designs would you recommend for the Disease types table? Where would you recommend storing the backup of this table?

Figure label: Entities with different versions of the 0-safe design. Which of the 1-safe designs would you recommend?

The 0-safe design with local commit: The 0-safe design with local commit is recommended when it is important to update in disconnected mode and it is possible to implement sufficient local countermeasures against the isolation anomalies. Example.

The 0-safe design with deferred commit: The 0-safe design with deferred commit is recommended when it is important to update in disconnected mode and it is not possible to implement sufficient local countermeasures against the isolation anomalies. Example.

0-safe Design Example: In the “Danish medicine card” system, all hospitals and private physicians must transfer their medicine prescriptions to a central database by using the 0-safe design with local commit. Which type of 0-safe replication design would you recommend for the Medicine prescriptions? Entities in the figure: Medicine prescriptions, Medicine types.

Where would you recommend the different types of 0-safe design? Figure label: Entities with the 1-safe design.

Electronic Health Records. Figure labels: Deferred commit, Local commit.
3. The 0-safe design with local commit is recommended when it is important to operate in disconnected mode, and it is possible to implement sufficient local countermeasures against the isolation anomalies.
4. The 0-safe design with deferred commit is recommended when it is important to operate in disconnected mode, and when it is not possible to implement sufficient local countermeasures against the isolation anomalies.

Suppose the ERP system and the B2B e-commerce system are heterogeneous. Which tables would you recommend replicating by using one of the 0-safe designs?

Figure panels: the basic N-safe design, the basic 1-safe design, the basic 0-safe design.

Evaluation of replication designs

Electronic Health Records (EHR)

Overview of the most important EHR integration architectures
Evaluation criteria, in column order: local autonomy, read performance, software costs, anomaly problems.
Ratings per architecture, in slide order:
1. Integration by using SOA health services: Best, Worst
2. Central database solution: Worst, Best
3. Central database solution mixed with SOA integration: Average, Above worst, Average, Above worst
4. Distributed subscriber solution: Best, Worst, Average
5. Central subscriber offering SOA services to others: Best, Average, Worst, Average
6. Central database solution with central subscription and SOA services to others: Average
7. Central database solution mixed with distributed subscription on top of central subscription: Average, Best, Average

Architectures for integrating electronic health records
Evaluation criteria, in column order: local autonomy, read performance, software costs, anomaly problems.
Ratings per architecture, in slide order:
1. Integration by using SOA health services: Best, Worst
2. Central database solution: Worst, Best
3. Central database solution mixed with SOA integration: Average, Above worst, Average, Above worst


Architectures for integrating electronic health records
Evaluation criteria, in column order: local autonomy, read performance, software costs, anomaly problems.
Ratings per architecture, in slide order:
4. Distributed subscriber solution: Best, Worst, Average
5. Central subscriber offering SOA services to others: Best, Average, Worst, Average
6. Central solution with central subscription and SOA services to others: Average


Architectures for integrating electronic health records
Evaluation criteria, in column order: local autonomy, read performance, software costs, anomaly problems.
Ratings per architecture, in slide order:
5. Central subscriber offering SOA services to others: Best, Average, Worst, Average
6. Central database solution with central subscription and SOA services to others: Average
7. Central solution mixed with distributed subscription on top of central subscription: Average, Best, Average

Architectures for integrating electronic health records
Evaluation criteria, in column order: local autonomy, read performance, software costs, anomaly problems.
Ratings per architecture, in slide order:
1. Integration by using SOA health services: Best, Worst
2. Central database solution: Worst, Best
3. Central database solution mixed with SOA integration: Average, Above worst, Average, Above worst
4. Distributed subscriber solution: Best, Worst, Average
5. Central subscriber offering SOA services to others: Best, Average, Worst, Average
6. Central database solution with central subscription and SOA services to others: Average
7. Central database solution mixed with distributed subscription on top of central subscription: Average, Best, Average
Which EHR integration architecture would you recommend and why?


Concept definitions used in the logistics exercise: Pallet = a wooden frame on which packages may be stored in such a way that they can all be moved by a truck. Collie = all the packages that are stored on a pallet (Danish: palle). Leg = a route or subroute on which the transportation has no stops.

ER-diagram of a logistics management system. Entities and relationships in the figure: Transport Orders; Customers; Transport media such as ships, airplanes, and trucks; Physical containers; Scheduled routes and legs; Orderlines; Packages, Collies and Containers; Locations; Route-leg hierarchy; Package-Collie hierarchy; Routes and legs; Damage relationship; Container-routes relationships of order 3; Transport operator. Relationship labels: from, to.
How should the transport orders and sub-orders to sub-contractors be replicated in order to optimize the transports?

Global ER-diagram of integrated logistics management. Entities and relationships in the figure: Transport Orders; Customers; Transport media such as ships, airplanes, and trucks; Physical containers; Scheduled routes and legs; Orderlines; Packages, Collies and Containers; Locations; Route-leg hierarchy; Package-Collie hierarchy; Routes and legs; Damage relationship; Container-routes relationships of order 3; Transport operator. Relationship labels: from, to.
Describe the local databases in the central location of the transport company, the locations of the sub-contractors, and the mobile locations of the transport media. Design a workflow with focus on the integration of the local database locations.

Petri net: Workflow of a global e-commerce transaction where the stocks are in the locations of the different suppliers. Figure label: OR split.

Sub-Petri net of activity 2. Figure labels: AND split, OR split, AND join. May the suppliers be transport sub-contractors?

Global ER-diagram of integrated logistics management. Entities and relationships in the figure: Transport Orders; Customers; Transport media such as ships, airplanes, and trucks; Physical containers; Scheduled routes and legs; Orderlines; Packages, Collies and Containers; Locations; Route-leg hierarchy; Package-Collie hierarchy; Routes and legs; Damage relationship; Container-routes relationships of order 3; Transport operator. Relationship labels: from, to.
How would you recommend integrating and later merging the shipping companies Maersk and P&O Nedlloyd if Maersk had used the logistics architecture above?

Evaluation overview of operating system replication methods
Operating system replication methods, in column order: mirroring with disk volume ownership, mirroring without disk volume ownership, remote caching in a fast storage, local caching in the user location.
Ratings per evaluation criterion, in slide order:
Read performance/capacity: Average, Best, Average, Best
Write performance: Average, Above worst, Average
Ease of failure recovery: Average, Average for roll back recovery, Average, Not supported
Ease of disaster recovery: Below best, Not supported
The probability of lost data: Best, p^n, Worst
Availability: 1-q^n, 1-q^2
Atomicity: Best, Not supported, Best, Not supported
Consistency: Best, Not supported, Best, Not supported
Isolation: Best, Not supported, Best, Not supported
Durability: Best, Not supported, Best, Not supported
Development costs: Best

Evaluation overview of operating system replication methods
Operating system replication methods, in column order: mirroring with disk volume ownership, mirroring without disk volume ownership, remote caching in a fast storage, local caching in the user location.
Ratings per evaluation criterion, in slide order:
Read performance/capacity: Average, Best, Average, Best
Write performance: Average, Above worst, Average
Ease of failure recovery: Average, Average for roll back recovery, Average, Not supported
Ease of disaster recovery: Below best, Not supported
The probability of lost data: Best, p^n, Worst

Evaluation overview of operating system replication methods
Operating system replication methods, in column order: mirroring with disk volume ownership, mirroring without disk volume ownership, remote caching in a fast storage, local caching in the user location.
Ratings per evaluation criterion, in slide order:
Availability: 1-q^n, 1-q^2
Atomicity: Best, Not supported, Best, Not supported
Consistency: Best, Not supported, Best, Not supported
Isolation: Best, Not supported, Best, Not supported
Durability: Best, Not supported, Best, Not supported
Development costs: Best
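
The entries for the probability of lost data and for availability read naturally as the powers p^n and 1-q^n. The sketch below works through that arithmetic under assumptions the slides do not state: p is taken to be the probability that a single copy loses its data, q the probability that a single copy is unavailable, and failures of different copies are assumed to be independent. The function names and the sample numbers are hypothetical.

```python
def probability_of_lost_data(p: float, n: int) -> float:
    """Data is lost only if all n independent copies lose it."""
    return p ** n


def availability(q: float, n: int) -> float:
    """The data is available unless all n independent copies are down."""
    return 1 - q ** n


# Illustrative numbers only: each copy fails independently.
for copies in (1, 2, 3):
    print(copies,
          probability_of_lost_data(0.01, copies),
          availability(0.1, copies))
```

Under these assumptions each extra copy multiplies the probability of loss by p, which is why mirroring with many copies rates well on lost data while a single cached copy does not.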

Horizontal fragmentation: Figure labels: Global table, Fragment 1, Fragment 2. Fragments = the non-redundant and non-overlapping parts of a global distributed table. Fragments may be allocated in many different locations.

Vertical Fragmentation: Example: In an employee table, some attributes/fragments may be confidential and stored in a secure location.

Fragmentation rules: Vertical fragmentation supports distribution by function, where different functions use different attributes. Horizontal fragmentation supports geographical distribution, where different locations use different rows.

Mixed Fragmentation: Horizontal fragmentation on a vertical fragmentation.
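
The three kinds of fragmentation can be sketched on a small in-memory table. Everything below is illustrative: the employee rows, attribute names, and helper functions are hypothetical, and repeating the key in each vertical fragment (so that the fragments can be joined again) is an assumption rather than something the slides state.

```python
employees = [
    {"id": 1, "name": "Ann", "location": "Copenhagen", "salary": 5000},
    {"id": 2, "name": "Bo", "location": "Aarhus", "salary": 4200},
    {"id": 3, "name": "Cy", "location": "Copenhagen", "salary": 4700},
]


def horizontal_fragments(rows, attribute):
    """Split the rows into non-overlapping fragments by the value of one attribute."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[attribute], []).append(row)
    return fragments


def vertical_fragment(rows, key, attributes):
    """Keep only the key and the selected attributes of every row."""
    return [{name: row[name] for name in (key, *attributes)} for row in rows]


# Horizontal: different locations use different rows.
by_location = horizontal_fragments(employees, "location")

# Vertical: the confidential attribute is separated from the public ones.
public = vertical_fragment(employees, "id", ["name", "location"])
confidential = vertical_fragment(employees, "id", ["salary"])

# Mixed: horizontal fragmentation on a vertical fragment.
mixed = horizontal_fragments(public, "location")

print(by_location["Aarhus"], confidential, mixed["Copenhagen"])
```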

Allocation of fragments: Allocation is the physical placement of fragments in different locations. Allocated fragments may be redundant.

DDBMS fragmentation and allocation: Figure labels: Global table, Fragments, Physical tables. Legend: physical table in location i; fragment j; fragment j in location i. Example with 3 fragments and 3 locations:
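
An allocation of 3 fragments to 3 locations could be written down as a simple mapping. The particular placement below is made up for illustration and is not the one from the slide's figure; the location names, fragment names, and helper functions are hypothetical.

```python
# Hypothetical allocation: each location stores two of the three fragments,
# so every fragment is allocated redundantly.
allocation = {
    "location_1": {"fragment_1", "fragment_2"},
    "location_2": {"fragment_2", "fragment_3"},
    "location_3": {"fragment_3", "fragment_1"},
}


def copies_per_fragment(allocation):
    """Count in how many locations each fragment is physically stored."""
    counts = {}
    for fragments in allocation.values():
        for fragment in fragments:
            counts[fragment] = counts.get(fragment, 0) + 1
    return counts


def locations_of(allocation, fragment):
    """List the locations that hold a physical copy of the fragment."""
    return sorted(loc for loc, fragments in allocation.items() if fragment in fragments)


print(copies_per_fragment(allocation))
print(locations_of(allocation, "fragment_1"))
```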

Figure labels: Entities with different versions of the 0-safe design; Entities with the 1-safe design. Describe your recommendations for distributed table fragmentation.

A distributed ERP system = A set of local ERP systems integrated in such a way that each local system can use the resources/stocks managed by the other local ERP systems. Would you recommend the distributed ERP architecture for a mobile salesman?

Properties of Distributed ERP systems: The stocks of other locations may be used in case a product is sold out in its local location. That is, a mobile ERP system without its own stocks may use any stock location. An e-commerce location may also be viewed as an ERP location without its own stocks. In this situation the users are their own ERP salesmen. The development costs of such an e-commerce software product are reduced to almost nothing. The extra costs for developing a distributed ERP system are also very low if short duration locking is used anyway. Migration to a new ERP version is more flexible, as the conversion process does not need to take place overnight. (The reason is that the stocks of the old ERP system may be used by the new ERP system until the conversion process is over.)
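
The first two properties can be sketched as a stock lookup that falls back to other locations. This is a toy model with hypothetical names and data: it ignores transactions, locking, and countermeasures entirely, and picking the first other location with enough stock is an assumed policy, not one given in the slides.

```python
from typing import Dict, Optional

Stocks = Dict[str, Dict[str, int]]


def reserve(stocks: Stocks, local: str, product: str, quantity: int) -> Optional[str]:
    """Reserve from the local stock if possible, otherwise from another location.

    Returns the location that served the request, or None if no single
    location has enough stock.
    """
    candidates = [local] + [loc for loc in stocks if loc != local]
    for location in candidates:
        if stocks.get(location, {}).get(product, 0) >= quantity:
            stocks[location][product] -= quantity
            return location
    return None


stocks = {"Copenhagen": {"bike": 0}, "Aarhus": {"bike": 4}}

# The product is sold out locally, so another location's stock is used.
served_by = reserve(stocks, "Copenhagen", "bike", 1)

# A mobile or e-commerce location has no stock of its own.
mobile_served_by = reserve(stocks, "mobile", "bike", 2)

print(served_by, mobile_served_by, stocks)
```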

Which tables in a distributed ERP system would you recommend replicating?
1. Use replicated data only when it is necessary or convenient for economic reasons.
2. The 0-safe design with local commit is recommended when it is important to update in disconnected mode and it is possible to implement sufficient local countermeasures against the isolation anomalies.
3. The 0-safe design with deferred commit is recommended when it is important to update in disconnected mode and it is not possible to implement sufficient local countermeasures against the isolation anomalies.
4. The 1-safe designs are recommended in situations when real time update is not important in disconnected mode. Therefore, updates may wait until the primary copy location has committed the updates. Note that the basic 1-safe design is the cheapest replication method.

Should a traveling salesman have the primary copy property of the customers that are going to be visited?

Allocation of fragments: Is it possible to optimize the Product and Product stock tables by integrating them into a single table locally? Allocation is the physical placement of fragments in different locations. Allocated fragments may be replicated. Exercise: How are the tables fragmented and allocated in a distributed ERP system?

Replication in a distributed ERP system. Figure labels: Orders are fragmented and without replication. Fragmented and 0-safe with primary copy commit. Fragmented and without replication. Not fragmented and basic 1-safe. Local customers are fragmented and have the basic 1-safe design. Global customers are fragmented and have the 0-safe design with primary copy commit.
Should all attributes in a table have the same replication design? (Analyze the Orderlines or Products.)

Dynamic fragmentation and allocation: Dynamic allocation is the physical placement of dynamically created fragments. Exercise: What should happen before any sale, if a customer goes to a new sales location?

A distributed modular ERP system has relaxed ACID properties across the autonomous databases of the ERP modules.

If ERP modules only use services from each other, then they can be migrated/exchanged individually. That is, the ERP production module may be exchanged with hospital service modules, and the customers may be called patients. The Account, Procurement, and HRM modules may be used unchanged. The CRM module should change its name to EHR (Electronic Health Records), as EHR is more complex than CRM.

A data model for an integrated e-commerce/ERP system: Exercise. Suppose a distributed modular ERP system has Accounting, Sale, Procurement, and CRM modules. What tables/table fragments should these modules own?

Exercise: Describe and design the local databases for a distributed brewery with many different production, sale and depot locations. Can the earlier described distributed ERP system be used?

Design a Distributed Airline Database: Design an integrated distributed database that integrates the databases of different airline companies in a way that optimizes performance, availability and consistency of a common distributed airline system with local databases in the airline companies, airports, and “sales offices” at e.g. travel agents, hotels and e-commerce servers.

Exercise: Design the workflow and an integrated distributed database that integrates the databases of different airline companies and hotel chains in a way that optimizes performance, availability and consistency of a common distributed system with local databases in the airline companies, hotel chains, airports, car rental companies, and “sales offices” at e.g. travel agents, hotels and e-commerce servers. Describe the workflow of the integrated e-commerce system.

End of session Thank you !!!

ER-diagram. Entities and relationships in the figure: Transport Orders; Customers; Transport media such as ships, airplanes, and trucks; Physical containers; Scheduled routes and legs; Orderlines; Packages, Collies and Containers; Locations; Route-leg hierarchy; Package-Collie hierarchy; Routes and legs; Damage relationship; Container-routes relationships of order 3; Transport operator. Relationship labels: from, to.

Figure panels: the basic 1-safe design, the basic 0-safe design. The x-safe replication design: Suppose n is the number of replicated tables, where n is an integer greater than one, and x is an integer in the interval [0, n]. In the x-safe design, x out of the n replicated tables are consistent and up to date.

Countermeasures against missing isolation property

Properties of different EHR database design methods
Database designs for EHR systems, in column order: traditional normalized database design, XML based storing of variable health attributes, generalized subtypes used for storing variable health attributes.
Ratings per evaluation criterion, in slide order:
Flexibility towards new health record types: Worst, Best
Performance of overview queries: Worst, Best
Performance of queries that need variable health attributes: Best, Average, Worst
Storage consumption: Best, Average, Worst
Development costs for table driven applications: Worst, Best
Flexibility towards data analyses: Average, Worst, Best
Is normalization used?: Yes, No

EHR Datawarehouse:


”Health replication” methods