Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or.

Slides:



Advertisements
Similar presentations
Database Systems: Design, Implementation, and Management
Advertisements

Distributed Database Management Systems
Database Architectures and the Web
Distributed Databases John Ortiz. Lecture 24Distributed Databases2  Distributed Database (DDB) is a collection of interrelated databases interconnected.
Distributed databases
Transaction.
MIS 385/MBA 664 Systems Implementation with DBMS/ Database Management Dave Salisbury ( )
Chapter 13 (Web): Distributed Databases
Chapter 12 (Online): Distributed Databases
1 Minggu 12, Pertemuan 23 Introduction to Distributed DBMS (Chapter , 22.6, 3rd ed.) Matakuliah: T0206-Sistem Basisdata Tahun: 2005 Versi: 1.0/0.0.
ABCSG - Distributed Database 1 Data Management Distributed Database Data Replication.
Distributed Database Management Systems
Chapter 9 : Distributed Database.
Overview Distributed vs. decentralized Why distributed databases
1 © Prentice Hall, 2002 Chapter 13: Distributed Databases Modern Database Management 6 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden.
Distributed Database Management Systems
©Silberschatz, Korth and Sudarshan19.1Database System Concepts Lecture-10 Distributed Database System A distributed database system consists of loosely.
Chapter 12 Distributed Database Management Systems
Chapter 13 (Web): Distributed Databases
©Silberschatz, Korth and Sudarshan18.1Database System Concepts Centralized Systems Run on a single computer system and do not interact with other computer.
Definition of terms Definition of terms Explain business conditions driving distributed databases Explain business conditions driving distributed databases.
DISTRIBUTED DATABASE MANAGEMENT SYSTEM CHAPTER 07.
Distributed databases
DATABASE MANAGEMENT SYSTEMS 2 ANGELITO I. CUNANAN JR.
Distributed Databases
Distributed Databases and DBMSs: Concepts and Design
ITEC 3220A Using and Designing Database Systems
Client/Server Databases and the Oracle 10g Relational Database
1 Distributed and Parallel Databases. 2 Distributed Databases Distributed Systems goal: –to offer local DB autonomy at geographically distributed locations.
12 1 Chapter 12 Distributed Database Management Systems Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel.
Database Design – Lecture 16
III. Current Trends: 1 - Distributed DBMSsSlide 1/32 III. Current Trends Part 1: Distributed DBMSs: Concepts and Design Lecture 12 (2 hours) Lecturer:
Database Systems: Design, Implementation, and Management Tenth Edition
Session-8 Data Management for Decision Support
10 1 Chapter 10 Distributed Database Management Systems Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel.
Database Systems: Design, Implementation, and Management Tenth Edition Chapter 12 Distributed Database Management Systems.
Database Systems: Design, Implementation, and Management Ninth Edition Chapter 12 Distributed Database Management Systems.
Week 5 Lecture Distributed Database Management Systems Samuel ConnSamuel Conn, Asst Professor Suggestions for using the Lecture Slides.
Chapter 10 Distributed Database Management System
10 1 Chapter 10 Distributed Database Management Systems Database Systems: Design, Implementation, and Management, Sixth Edition, Rob and Coronel.
The Evolution of Distributed DBMS 4Social and Technical Changes in the 1980’s u Business operations became more decentralized geographically. u Competition.
Kjell Orsborn UU - DIS - UDBL DATABASE SYSTEMS - 10p Course No. 2AD235 Spring 2002 A second course on development of database systems Kjell.
Distributed Databases
Chapter 12 Distributed Database Management Systems.
ASMA AHMAD 28 TH APRIL, 2011 Database Systems Distributed Databases I.
1 Distributed Databases BUAD/American University Distributed Databases.
Databases Illuminated
Chapter 10 Distributed Database Management System
Distributed DB CSE2132 Database Systems Week 12 Lecture Distributed Database.
Distributed database system
Topic Distributed DBMS Database Management Systems Fall 2012 Presented by: Osama Ben Omran.
MBA 664 Database Management Systems Dave Salisbury ( )
© 2011 Pearson Education, Inc. Publishing as Prentice Hall 1 Chapter 12 (Online): Distributed Databases Modern Database Management 10 th Edition Jeffrey.
Chapter 12 Distributed Data Bases. Learning Objectives What a distributed database management system (DDBMS) is and what its components are How database.
Introduction to Distributed Databases Yiwei Wu. Introduction A distributed database is a database in which portions of the database are stored on multiple.
 Distributed Database Concepts  Parallel Vs Distributed Technology  Advantages  Additional Functions  Distribution Database Design  Data Fragmentation.
1 Chapter 22 Distributed DBMS Concepts and Design CS 157B Edward Chen.
1 Information Retrieval and Use De-normalisation and Distributed database systems Geoff Leese September 2008, revised October 2009.
IT 5433 LM1. Learning Objectives Understand key terms in database Explain file processing systems List parts of a database environment Explain types of.
© 2009 Pearson Education, Inc. Publishing as Prentice Hall 1 Lecture 11 Distributed Databases Modern Database Management 9 th Edition Jeffrey A. Hoffer,
Distributed Databases
Distributed Databases
1 Chapter 22 Distributed DBMSs - Concepts and Design Simplified Transparencies © Pearson Education Limited 1995, 2005.
Distributed Database Concepts
Table General Guidelines for Better System Performance
Chapter 12 Distributed Database Management Systems
Chapter 19: Distributed Databases
Table General Guidelines for Better System Performance
Distributed Databases
Introduction of Week 14 Return assignment 12-1
Presentation transcript:

Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Chapter 12 Distributed Database Management Systems

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Learning Objectives  In this chapter, the student will learn:  About distributed database management systems (DDBMSs) and their components  How database implementation is affected by different levels of data and process distribution  How transactions are managed in a distributed database environment 2

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Learning Objectives  In this chapter, the student will learn:  How distributed database design draws on data partitioning and replication to balance performance, scalability, and availability  About the trade-offs of implementing a distributed data system 3

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Evolution Database Management Systems  Distributed database management system (DDBMS): Governs storage and processing of logically related data over interconnected computer systems  Data and processing functions are distributed among several sites  Centralized database management system  Required that corporate data be stored in a single central site  Data access provided through dumb terminals 4

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure Centralized Database Management System 5

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Factors Affecting the Centralized Database Systems  Globalization of business operation  Advancement of web-based services  Rapid growth of social and network technologies  Digitization resulting in multiple types of data  Innovative business intelligence through analysis of data 6

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Factors That Aided DDBMS to Cope With Technological Advancement Acceptance of Internet as a platform for business Mobile wireless revolutionUsage of application as a serviceFocus on mobile business intelligence 7

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Desirability of Distributed DBMS Over Centralized DBMS Performance degradation High costs Reliability problems Scalability problems Organizational rigidity 8

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Advantages and Disadvantages of DDBMS Advantages Data are located near greatest demand site Faster data access and processing Growth facilitation Improved communications Reduced operating costs User-friendly interface Less danger of a single- point failure Processor independence Disadvantages Complexity of management and control Technological difficulty Security Lack of standards Increased storage and infrastructure requirements Increased training cost Costs incurred due to the requirement of duplicated infrastructure 9

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Distributed Processing and Distributed Databases  Distributed processing: Database’s logical processing is shared among two or more physically independent sites via network Distributed database: Stores logically related database over two or more physically independent sites via computer network  Database fragments: Database composed of many parts in distributed database system 10

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Characteristics of Distributed Management Systems Application interface ValidationTransformation Query optimization MappingI/O interfaceFormattingSecurity Backup and recovery DB administration Concurrency control Transaction management 11

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Functions of Distributed DBMS  Receives the request of an application  Validates analyzes, and decomposes the request  Maps the request  Decomposes request into several I/O operations  Searches and validates data  Ensures consistency, security, and integrity  Validates data for specific conditions  Presents data in required format 12

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure A Fully Distributed Database Management System 13

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. DDBMS Components  Computer workstations or remote devices  Network hardware and software components  Communications media Transaction processor (TP): Software component of a system that requests data  Known as transaction manager (TM) or application processor (AP)  Data processor (DP) or data manager (DM)  Software component on a system that stores and retrieves data from its location 14

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Single-Site Processing, Single-Site Data (SPSD)  Processing is done on a single host computer  Data stored on host computer’s local disk  Processing restricted on end user’s side  DBMS is accessed by dumb terminals 15

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure Single-Site Processing, Single-Site Data (Centralized) 16

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Multiple-Site Processing, Single-Site Data (MPSD)  Multiple processes run on different computers sharing a single data repository  Require network file server running conventional applications  Accessed through LAN  Client/server architecture  Reduces network traffic  Processing is distributed  Supports data at multiple sites 17

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure Multiple-Site Processing, Single-Site Data 18

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Multiple-Site Processing, Single-Site Data (MPSD)  Fully distributed database management system  Support multiple data processors and transaction processors at multiple sites  Classification of DDBMS depending on the level of support for various types of databases  Homogeneous: Integrate multiple instances of same DBMS over a network  Heterogeneous: Integrate different types of DBMSs  Fully heterogeneous: Support different DBMSs, each supporting different data model 19

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Restrictions of DDBMS  Remote access is provided on a read-only basis  Restrictions on the number of remote tables that may be accessed in a single transaction  Restrictions on the number of distinct databases that may be accessed  Restrictions on the database model that may be accessed 20

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Distributed Database Transparency Features Distribution transparency Transaction transparency Failure transparency Performance transparency Heterogeneity transparency 21

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Distribution Transparency  Allows management of physically dispersed database as if centralized  Levels  Fragmentation transparency  Location transparency  Local mapping transparency 22

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Distribution Transparency  Unique fragment: Each row is unique, regardless of the fragment in which it is located  Supported by distributed data dictionary (DDD) or distributed data catalog (DDC)  DDC contains the description of the entire database as seen by the database administrator  Distributed global schema: Common database schema to translate user requests into subqueries 23

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Transaction Transparency  Ensures database transactions will maintain distributed database’s integrity and consistency  Ensures transaction completed only when all database sites involved complete their part  Distributed database systems require complex mechanisms to manage transactions 24

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Distributed Requests and Distributed Transactions Single SQL statement accesses data processed by a single remote database processor Remote request Accesses data at single remote site composed of several requests Remote transaction Requests data from several different remote sites on network Distributed transaction Single SQL statement references data at several DP sites Distributed request 25

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Distributed Concurrency Control  Concurrency control is important in distributed databases environment  Due to multi-site multiple-process operations that create inconsistencies and deadlocked transactions 26

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure The Effect of Premature COMMIT 27

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Two-Phase Commit Protocol (2PC)  Guarantees if a portion of a transaction operation cannot be committed, all changes made at the other sites will be undone  To maintain a consistent database state  Requires that each DP’s transaction log entry be written before database fragment is updated  DO-UNDO-REDO protocol: Roll transactions back and forward with the help of the system’s transaction log entries 28

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Two-Phase Commit Protocol (2PC)  Write-ahead protocol: Forces the log entry to be written to permanent storage before actual operation takes place  Defines operations between coordinator and subordinates  Phases of implementation  Preparation  The final COMMIT 29

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Performance and Failure Transparency  Performance transparency: Allows a DDBMS to perform as if it were a centralized database  Failure transparency: Ensures the system will operate in case of network failure  Considerations for resolving requests in a distributed data environment  Data distribution  Data replication  Replica transparency: DDBMS’s ability to hide multiple copies of data from the user 30

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Performance and Failure Transparency  Network and node availability  Network latency: delay imposed by the amount of time required for a data packet to make a round trip  Network partitioning: delay imposed when nodes become suddenly unavailable due to a network failure 31

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Distributed Database Design How to partition database into fragments Data fragmentation Which fragments to replicate Data replication Where to locate those fragments and replicas Data allocation 32

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Data Fragmentation  Breaks single object into many segments  Information is stored in distributed data catalog (DDC)  Strategies  Horizontal fragmentation: Division of a relation into subsets (fragments) of tuples (rows)  Vertical fragmentation: Division of a relation into attribute (column) subsets  Mixed fragmentation: Combination of horizontal and vertical strategies 33

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Data Replication  Data copies stored at multiple sites served by a computer network  Mutual consistency rule: Replicated data fragments should be identical  Styles of replication  Push replication  Pull replication  Helps restore lost data 34

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Data Replication Scenarios Stores multiple copies of each database fragment at multiple sites Fully replicated database Stores multiple copies of some database fragments at multiple sites Partially replicated database Stores each database fragment at a single site Unreplicated database 35

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Data Allocation Strategies Entire database stored at one site Centralized data allocation Database is divided into two or more disjoined fragments and stored at two or more sites Partitioned data allocation Copies of one or more database fragments are stored at several sites Replicated data allocation 36

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. The CAP Theorem  CAP stands for:  Consistency  Availability  Partition tolerance  Basically available, soft state, eventually consistent (BASE)  Data changes are not immediate but propagate slowly through the system until all replicas are consistent 37

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Table Distributed Database Spectrum Cengage Learning ©

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Key Assumptions of Hadoop Distributed File System High volume Write-once, read-many Streaming access Move computations to the data Fault tolerance 39

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. Figure Hadoop Distributed File System (HDFS) 40

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. C. J. Date’s Twelve Commandments for Distributed Databases  Local site independence  Central site independence  Failure independence  Location transparency  Fragmentation transparency  Replication transparency 41

©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part. C. J. Date’s Twelve Commandments for Distributed Databases  Distributed query processing  Distributed transaction processing  Hardware independence  Operating system independence  Network independence  Database independence 42