Properties of Data Replication

Presentation transcript:

Properties of Data Replication: Replication can increase read performance. Replication can be used to integrate heterogeneous systems that use different databases. Replication can be used for both failure and disaster recovery. The costs of using replication are decreased update performance and the cost of managing consistency problems. (Relaxed ACID properties may be used in order to minimize inconsistency.)

Cases where replication designs and extended transactions are recommended: ERP systems; E-commerce; electronic health records; logistics; airline reservation systems; CSCW systems; distributed calendar systems; mobile integrated database applications; supply chain management; banking systems; library systems.

Overview of different types of replication designs: Suppose all replication designs have n physical copies of the same logical table. In the n-safe replication design, all n copies are consistent and up to date. In the quorum-safe design, a quorum of the n copies is consistent and up to date. In the 1-safe design, only one of the n copies is consistent and up to date. In the 0-safe design, none of the copies is consistent and up to date. The inconsistencies can be managed by using countermeasures.
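
The four designs differ in how many of the n copies are consistent and up to date. A minimal Python sketch of that count follows. The function name `copies_up_to_date` is hypothetical, and the quorum is assumed here to be a simple majority (n // 2 + 1), which the slides do not specify.

```python
def copies_up_to_date(design: str, n: int) -> int:
    """Return how many of the n copies are consistent and up to date."""
    if n < 2:
        raise ValueError("replication needs at least two copies")
    if design == "n-safe":
        return n
    if design == "quorum-safe":
        return n // 2 + 1  # assumption: a quorum is a simple majority
    if design == "1-safe":
        return 1
    if design == "0-safe":
        return 0
    raise ValueError(f"unknown design: {design}")


for design in ("n-safe", "quorum-safe", "1-safe", "0-safe"):
    print(design, copies_up_to_date(design, 5))
```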

Rules for using Data Replication in Distributed Databases with Relaxed ACID Properties
1. Use replicated data only when it is necessary for availability or economic reasons.
2. The 0-safe design with local commit is recommended when it is important to update in disconnected mode and it is possible to implement sufficient local countermeasures against the isolation anomalies.
3. The 0-safe design with deferred commit is recommended when it is important to update in disconnected mode and it is not possible to implement sufficient local countermeasures against the isolation anomalies.
4. The 1-safe designs are recommended in situations where real-time update is not important in disconnected mode. Therefore, updates must wait until the primary copy location has committed the updates. Note that the basic 1-safe design is the cheapest replication method.

Design Rules for Replicating Data in Distributed Databases with Relaxed ACID Properties
5. The quorum-safe design is not recommended, as it is only used in DDBMSs (Distributed Database Management Systems).
6. The n-safe design, and especially the 2-safe design, are only recommended in practice when they are implemented in hardware or managed by the operating system. DDBMSs can also manage the n-safe designs, but DDBMSs are not used in practice as they are too complex.
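
Rules 1 to 4 can be read as a small decision procedure. The sketch below is one possible encoding in Python; the function and parameter names are hypothetical, and rules 5 and 6 (quorum-safe and n-safe) are deliberately left out because the slides do not recommend those designs at the application level.

```python
def recommend_design(replication_needed: bool,
                     needs_disconnected_updates: bool,
                     local_countermeasures_sufficient: bool) -> str:
    """Map the conditions of rules 1-4 to a recommended replication design."""
    # Rule 1: replicate only when availability or economy requires it.
    if not replication_needed:
        return "no replication"
    # Rule 4: if updating in disconnected mode is not important, 1-safe suffices.
    if not needs_disconnected_updates:
        return "1-safe"
    # Rule 2: disconnected updates with sufficient local countermeasures.
    if local_countermeasures_sufficient:
        return "0-safe with local commit"
    # Rule 3: disconnected updates without sufficient local countermeasures.
    return "0-safe with deferred commit"


print(recommend_design(True, True, False))
```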

The Basic 1-safe Design: The 1-safe designs are recommended in situations where real-time update is not important in disconnected mode. Therefore, updates may wait until the primary copy location has committed the updates. The basic 1-safe design is the cheapest replication method.

Suppose the ERP system and B2B E-commerce system are heterogeneous. Which ERP tables would you recommend replicating by using the basic 1-safe design?

Replication Example: The basic 1-safe design is recommended when it does not cause major problems if updates are delayed for hours or even days. In the “Danish medicine card” system, all hospitals and private physicians must transfer their medicine prescriptions to a central database by using the 0-safe design with local commit. What replication design would you recommend for the Medicine types table? (Tables shown: Medicine prescriptions, Medicine types.)

The basic 1-safe design and the 1-safe design with commutative updates: In the basic 1-safe replication design, lost transactions may occur when the secondary location takes over after a primary copy failure. Why can lost transactions not occur in the 1-safe design with commutative updates?
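
One property relevant to this question is that commutative updates yield the same final state in whatever order they are applied, so an update that reaches a copy late can still be applied. The Python sketch below illustrates only that property and is not a full answer to the exercise; the account balance, the amounts, and the function names are hypothetical.

```python
from itertools import permutations


def apply_increments(balance, deltas):
    """Apply relative updates (balance += delta); these commute."""
    for delta in deltas:
        balance += delta
    return balance


def apply_overwrites(balance, values):
    """Apply absolute updates (balance = value); these do not commute."""
    for value in values:
        balance = value
    return balance


deltas = [50, -20, 5]
values = [150, 80, 105]

# Collect the final states over every possible arrival order.
increment_results = {apply_increments(100, order) for order in permutations(deltas)}
overwrite_results = {apply_overwrites(100, order) for order in permutations(values)}

print("increments:", increment_results)
print("overwrites:", overwrite_results)
```

With increments every arrival order ends in the same balance, whereas with overwrites the final balance depends on which update happened to arrive last.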

The Basic 1-safe Design Example: The basic 1-safe design is recommended when it does not cause major problems if updates are delayed for hours or even days. Example: Which of the 1-safe designs would you recommend for the Disease types table? Where would you recommend storing the backup of this table?

Entities with different versions of 0-safe design Which of the 1-safe designs would you recommend?

The 0-safe design with local commit: The 0-safe design with local commit is recommended when it is important to update in disconnected mode and it is possible to implement sufficient local countermeasures against the isolation anomalies. Example.

The 0-safe design with deferred commit: The 0-safe design with deferred commit is recommended when it is important to update in disconnected mode and it is not possible to implement sufficient local countermeasures against the isolation anomalies. Example.

0-safe Design Example: In the “Danish medicine card” system, all hospitals and private physicians must transfer their medicine prescriptions to a central database by using the 0-safe design with local commit. Which type of 0-safe replication design would you recommend for the Medicine prescriptions table? (Tables shown: Medicine prescriptions, Medicine types.)

Where would you recommend the different types of 0-safe design? Entities with the 1-safe design

Electronic Health Records (figure labels: Deferred commit, Local commit)
3. The 0-safe design with local commit is recommended when it is important to operate in disconnected mode, and it is possible to implement sufficient local countermeasures against the isolation anomalies.
4. The 0-safe design with deferred commit is recommended when it is important to operate in disconnected mode, and when it is not possible to implement sufficient local countermeasures against the isolation anomalies.

Suppose the ERP system and B2B E-commerce system are heterogeneous. Which tables would you recommend replicating by using one of the 0-safe designs?

Figure panels: the basic n-safe design; the basic 1-safe design; the basic 0-safe design.

Evaluation of replication designs

Electronic Health Records (EHR)

Overview of the most important EHR integration architectures
Architectures for integrating electronic health records. Evaluation criteria, in column order: Local autonomy; Read performance; Software costs; Anomaly problems. Ratings are listed in slide order.
1. Integration by using SOA health services: Best, Worst
2. Central database solution: Worst, Best
3. Central database solution mixed with SOA integration: Average, Above worst, Average, Above worst
4. Distributed subscriber solution: Best, Worst, Average
5. Central subscriber offering SOA services to others: Best, Average, Worst, Average
6. Central database solution with central subscription and SOA services to others: Average
7. Central database solution mixed with distributed subscription on top of central subscription: Average, Best, Average

Which EHR integration architecture would you recommend and why?

Concept definitions used in the logistics exercise:
Pallet = a wooden frame on which packages may be stored in such a way that they can all be moved together by a truck.
Collie = all the packages that are stored on one pallet.
Leg = a route or subroute on which the transport makes no stops.

ER-diagram of a logistics management system. Entities and relationships shown (several with from/to roles): Orders; Customers; Transport media such as ships, airplanes, and trucks; Physical containers; Scheduled routes and legs; Orderlines; Packages and Collies; Locations; Route-leg hierarchy; Package-Collie hierarchy; Routes and legs; Damage relationship; Container-routes; Transport operator.
How should the transport orders and sub-orders be replicated in order to optimize the transports?

Global ER-diagram of integrated logistics management. Entities and relationships shown (several with from/to roles): Orders; Customers; Transport media such as ships, airplanes, and trucks; Physical containers; Scheduled routes and legs; Orderlines; Packages, Collies and Containers; Locations; Route-leg hierarchy; Package-Collie hierarchy; Routes and legs; Damage relationship; Container-routes; Transport operator.
Describe the local databases in the central location of the transport company, the locations of the integrated transport suppliers, and the mobile locations of the transport media. Design a workflow with focus on the integration of the local database locations.

Evaluation overview of operating system replication methods
Operating system replication methods, in column order: Mirroring with disk volume ownership; Mirroring without disk volume ownership; Remote caching in a fast storage; Local caching in the user location. Ratings are listed in slide order.
Read performance/capacity: Average, Best, Average, Best
Write performance: Average, Above worst, Average
Ease of failure recovery: Average, Average for roll back recovery, Average, Not supported
Ease of disaster recovery: Below best, Not supported
The probability of lost data: Best, p^n, Worst
Availability: 1-q^n, 1-q^2
Atomicity: Best, Not supported, Best, Not supported
Consistency: Best, Not supported, Best, Not supported
Isolation: Best, Not supported, Best, Not supported
Durability: Best, Not supported, Best, Not supported
Development costs: Best
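
The availability and lost-data entries in this overview are expressions in n. A small Python sketch of how they could be evaluated follows. The slides do not define the symbols, so the sketch assumes that q is the probability that a single copy is unavailable, that p is the probability that a single copy loses its data, and that copies fail independently; the function names are hypothetical.

```python
def availability(q: float, n: int) -> float:
    """Probability that at least one of n copies is available (1 - q^n).

    Assumption: q is the per-copy unavailability and failures are independent.
    """
    return 1 - q ** n


def probability_of_lost_data(p: float, n: int) -> float:
    """Probability that all n copies lose the data (p^n).

    Assumption: p is the per-copy loss probability and losses are independent.
    """
    return p ** n


# Hypothetical figures: each copy is unavailable 1% of the time.
for copies in (2, 3):
    print(copies, availability(0.01, copies), probability_of_lost_data(0.01, copies))
```

Under these assumptions, each extra copy multiplies the remaining unavailability by q, which is why the table entry with n copies can be better than the entry fixed at two copies.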

Horizontal fragmentation: A global table is divided into fragments (the figure shows Fragment 1 and Fragment 2). Fragments = the non-redundant and non-overlapping parts of a global distributed table. Fragments may be allocated in many different locations.

Vertical fragmentation: Example: In an employee table, some attributes/fragments may be confidential and stored in a secure location.

Fragmentation rules: Vertical fragmentation supports distribution by function, where different functions use different attributes. Horizontal fragmentation supports geographical distribution, where different locations use different rows.

Mixed Fragmentation: Horizontal fragmentation on a vertical fragmentation.
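
The three kinds of fragmentation can be sketched on a small in-memory table. The Python example below uses a hypothetical employee table and hypothetical helper names. It also assumes, as is common practice, that every vertical fragment repeats the primary key so that the global table can be rebuilt by joining on it.

```python
employees = [
    {"id": 1, "name": "Ann", "site": "Copenhagen", "salary": 50000},
    {"id": 2, "name": "Bo", "site": "Aarhus", "salary": 48000},
    {"id": 3, "name": "Cy", "site": "Copenhagen", "salary": 52000},
]


def horizontal_fragments(rows, key):
    """Split rows into fragments by the value of one attribute."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[key], []).append(row)
    return fragments


def vertical_fragment(rows, attributes, pk="id"):
    """Keep only the primary key and the listed attributes of every row."""
    return [{a: row[a] for a in (pk, *attributes)} for row in rows]


by_site = horizontal_fragments(employees, "site")         # horizontal
public = vertical_fragment(employees, ["name", "site"])   # vertical
confidential = vertical_fragment(employees, ["salary"])   # vertical, secure site
public_by_site = horizontal_fragments(public, "site")     # mixed

# Rebuild the global table: union of horizontal fragments, join of vertical ones.
rebuilt_h = sorted((r for f in by_site.values() for r in f), key=lambda r: r["id"])
salary_by_id = {r["id"]: r for r in confidential}
rebuilt_v = [{**p, **salary_by_id[p["id"]]} for p in public]

print(rebuilt_h == employees, rebuilt_v == employees)
```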

Allocation of fragments: Allocation is the physical placement of fragments in different locations. Allocated fragments may be redundant.

DDBMS fragmentation and allocation: global table, fragments, and physical tables. Figure legend: physical table in location "i"; fragment "j"; fragment "j" in location "i". Example with 3 fragments and 3 locations:
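
An allocation of 3 fragments over 3 locations can be written down as a simple mapping. The Python sketch below uses a hypothetical allocation, not the one drawn on the slide, and hypothetical helper names. It shows how to look up where a fragment is stored and which fragments are stored redundantly.

```python
# Hypothetical allocation: location -> fragments stored in its physical table.
allocation = {
    "location_1": {"fragment_1", "fragment_2"},
    "location_2": {"fragment_2", "fragment_3"},
    "location_3": {"fragment_3"},
}


def locations_of(fragment, allocation):
    """Return the sorted list of locations that store the given fragment."""
    return sorted(loc for loc, frags in allocation.items() if fragment in frags)


def replicated_fragments(allocation):
    """Return the sorted list of fragments stored in more than one location."""
    counts = {}
    for frags in allocation.values():
        for fragment in frags:
            counts[fragment] = counts.get(fragment, 0) + 1
    return sorted(f for f, count in counts.items() if count > 1)


print(locations_of("fragment_2", allocation))
print(replicated_fragments(allocation))
```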

Entities with different versions of 0-safe design Entities with 1-safe design Describe your recommendations for distributed table fragmentation?

A distributed ERP system = A set of local ERP systems integrated in such a way that each local system can use the resources/stocks managed by the other local ERP systems. Would you recommend the distributed ERP architecture for a mobile salesman?

Properties of Distributed ERP systems: The stocks of other locations may be used in case a product is sold out in its local location. That is, a mobile ERP system without its own stocks may use any stock location. An E-commerce location may also be viewed as an ERP location without its own stocks. In this situation the users are their own ERP salesmen. The development costs of such an E-commerce software product are reduced to almost nothing. The extra costs for developing a distributed ERP system are also very low if short duration locking is used anyway. Migration to a new ERP version is more flexible, as the conversion process does not need to take place overnight. (The reason is that the stocks of the old ERP system may be used by the new ERP system until the conversion process is over.)
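
The first property, using the stock of another location when a product is sold out locally, can be sketched as follows. The Python example uses hypothetical stock data and function names, and it ignores locking and concurrent sales entirely. A location without its own stocks, such as a mobile salesman, is simply a location name that does not appear in the stock table.

```python
# Hypothetical stock levels per ERP location.
stocks = {
    "Copenhagen": {"beer": 0, "soda": 12},
    "Aarhus": {"beer": 30, "soda": 0},
}


def find_stock_location(product, quantity, local, stocks):
    """Prefer the local stock, then any other location with enough stock."""
    candidates = [local] + [loc for loc in stocks if loc != local]
    for loc in candidates:
        if stocks.get(loc, {}).get(product, 0) >= quantity:
            return loc
    return None


def sell(product, quantity, local, stocks):
    """Reserve stock for a sale; return the supplying location or None."""
    loc = find_stock_location(product, quantity, local, stocks)
    if loc is not None:
        stocks[loc][product] -= quantity
    return loc


first = sell("beer", 5, "Copenhagen", stocks)   # sold out locally
second = sell("soda", 1, "mobile", stocks)      # location without own stocks
third = sell("wine", 1, "Aarhus", stocks)       # not stocked anywhere
print(first, second, third)
```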

Which tables in a distributed ERP system would you recommend replicating?
1. Use replicated data only when it is necessary or convenient for economic reasons.
2. The 0-safe design with local commit is recommended when it is important to update in disconnected mode and it is possible to implement sufficient local countermeasures against the isolation anomalies.
3. The 0-safe design with deferred commit is recommended when it is important to update in disconnected mode and it is not possible to implement sufficient local countermeasures against the isolation anomalies.
4. The 1-safe designs are recommended in situations where real-time update is not important in disconnected mode. Therefore, updates may wait until the primary copy location has committed the updates. Note that the basic 1-safe design is the cheapest replication method.

Allocation of fragments: Is it possible to optimize the Product and Product stock tables by integrating them into a single table locally? Allocation is the physical placement of fragments in different locations. Allocated fragments may be replicated. Exercise: How are the tables fragmented and allocated in a distributed ERP system?

Replication in a distributed ERP system. Figure labels: Orders are fragmented and without replication. Fragmented and 0-safe with primary copy commit. Fragmented and without replication. Not fragmented and basic 1-safe. Local customers are fragmented and have the basic 1-safe design. Global customers are fragmented and have the 0-safe design with primary copy commit.
Should all attributes in a table have the same replication design? (Analyze the Orderlines or Products.)

Dynamic fragmentation and allocation: Dynamic allocation is the physical placement of dynamically created fragments. Exercise: What should happen before any sale, if a customer goes to a new sales location?

A distributed modular ERP system has relaxed ACID properties across the autonomous databases of the ERP modules.

If ERP modules only use services from each other, then they can be migrated/exchanged individually. That is, the ERP production module may be exchanged with hospital service modules, and the customers may be called patients. The Account, Procurement, and HRM modules may be used unchanged. The CRM module should change its name to EHR (Electronic Health Records), as EHR is more complex than CRM.

A data model for an integrated E-commerce/ERP system: Exercise. Suppose a distributed modular ERP system has Accounting, Sale, Procurement, and CRM modules. What tables/table fragments should these modules own?

Exercise: Describe and design the local databases for a distributed brewery with many different production, sale and depot locations. Can the earlier described distributed ERP system be used?

Design a Distributed Airline Database: Design an integrated distributed database that integrates the databases of different airline companies in a way that optimizes performance, availability and consistency of a common distributed airline system with local databases in the airline companies, airports, and “sale offices” at e.g. travel agents, hotels and e-commerce servers.

Exercise: Design an integrated distributed database that integrates the databases of different airline companies and hotel chains in a way that optimizes performance, availability and consistency of a common distributed system with local databases in the airline companies, hotel chains, airports, and “sale offices” at e.g. travel agents, hotels and e-commerce servers.

End of session Thank you !!!

The x-safe replication design (figure panels: the basic 1-safe design, the basic 0-safe design): Suppose n is the number of replicated tables and x is an integer in the interval [0,n], where n is an integer greater than one. In the x-safe design, x out of n replicated tables are consistent and up to date.

Countermeasures against missing isolation property

Properties of different EHR database design methods
Database designs for EHR systems, in column order: Traditional normalized database design; XML based storing of variable health attributes; Generalized subtypes are used for storing variable health attributes. Ratings are listed in slide order.
Flexibility towards new health record types: Worst, Best
Performance of overview queries: Worst, Best
Performance of queries that need variable health attributes: Best, Average, Worst
Storage consumption: Best, Average, Worst
Development costs for table driven applications: Worst, Best
Flexibility towards data analyses: Average, Worst, Best
Is normalization used?: Yes, No

EHR Datawarehouse:

”Health replication” methods