Download presentation
Presentation is loading. Please wait.
Published byJerome Leonard Modified over 9 years ago
1
Kjell Orsborn 2015-11-13 1 UU - DIS - UDBL DATABASE SYSTEMS - 10p Course No. 2AD235 Spring 2002 A second course on development of database systems Kjell Orsborn Uppsala Database Laboratory Department of Information Technology, Uppsala University, Uppsala, Sweden
2
Kjell Orsborn 2015-11-13 2 UU - DIS - UDBL Introduction to Distributed DBMSs (Elmasri/Navathe ch. 24) Distributed DBMS (ch. 24.4 and 24.5 are omitted) Kjell Orsborn Uppsala Database Laboratory, Department of Information Technology, Uppsala University, Uppsala, Sweden
3
Kjell Orsborn 2015-11-13 3 UU - DIS - UDBL Distributed DBMSs A distributed database (DDB) is a collection of several logically interrelated databases distributed over a computer network including a number of computers (nodes). A distributed database mangement system (DDBMS) is a software system that permits management of DDB’s and that makes the distribution transparent for the user. A DDB is not: –a collection of files (need structure and DB manager) –a client-server interface to a database data on one node, clients on other nodes in network (almost) every centralized DBMS has client-server interface
4
Kjell Orsborn 2015-11-13 4 UU - DIS - UDBL Background What is a Distributed System? A Distributed System is a number of autonomous computers communicating over a network with software for integrated tasks. Examples of Distributed Systems: SUN’s Network File System (NFS), distributed file system
5
Kjell Orsborn 2015-11-13 5 UU - DIS - UDBL Distributed DBMSs... Centralized database in a network Node 1 Node 5 Node 2 Node 3 Node 4 communication network Node 1 Node 5 Node 2 Node 3 Node 4 communication network Distributed database over several nodes in a network
6
Kjell Orsborn 2015-11-13 6 UU - DIS - UDBL Centralized Database Server Stream (row-by-row) based client-server interfaces DBMS specific interfaces Compiler integrated interfaces (embedded SQL) ODBC: SQL-based standardized subroutine call library (Microsoft) JDBC: ODBC for Java (not Microsoft)
7
Kjell Orsborn 2015-11-13 7 UU - DIS - UDBL Distributed Databases Database seen as one unit; queries and updates to ONE database. Data in database transparently distributed over many DB nodes. Manual partitioning or fragmentation of data tables. DBMS automatically optimizes queries and updates to distributed database.
8
Kjell Orsborn 2015-11-13 8 UU - DIS - UDBL Multi-Databases Database seen as several heterogeneous units Multi-database query language needed to combine data from the databases. Primitives needed to integrate (combine, fuse) data from the databases. Special query optimization techniques to deal with heterogneity and dynamism.
9
Kjell Orsborn 2015-11-13 9 UU - DIS - UDBL Example of Multi-Database Automatic Teller Machines, ATMs
10
Kjell Orsborn 2015-11-13 10 UU - DIS - UDBL Fragmentation of data data fragmentation (= data partitioning) division of data sets (e.g. a relation) into several pieces - fragments transparently stored on several different nodes increased accessability and performance several types of fragmentation: –horisontal fragmentation –vertical fragmentation –mixed fragmentation good when nodes far apart
11
Kjell Orsborn 2015-11-13 11 UU - DIS - UDBL Replication of data copies of the same data on several nodes increased reliability and access performance more complex updating, transactions handling, recovery. –updates must be propagated to each replica! –special procedures after failures to restore consistency –more problematic transaction synchronization! types of replication: –full replication (whole db at each node) –no replication (each fragment only at one node) –partial replication (certain fragments replicated) –not necessary to replicate all tables –full replication often not realistic!
12
Kjell Orsborn 2015-11-13 12 UU - DIS - UDBL Transparency in a DDBMS By transparency we here mean the hiding of basic implementation details from one abstraction level to another. Data independence –logical data independence –physical data independence Network transparency –protect user from operational details of network –hides the existence of a network –no machine names in database table references location transparency naming transparency Replication transparency –user should not experience data replicas –automatic handling of updates, such as replica propagation –automatic handling of node crasches Fragmentation transparency –hides the existence of fragments e.g. that a logical relation is horizontally fragmented into local physical tables –handling of transformation of global queries to fragmented queries
13
Kjell Orsborn 2015-11-13 13 UU - DIS - UDBL Advantages of Distributed DBMSs... Data sharing –uniform interface and sharing of data through the DDBMS –natural to distribute certain database applications Increased reliability –redundance increase security and accessability –crashes less severe (if application not dependent of non-local data) Local independence –allows sharing of data but keeps local control of data Improved performance –avoid unnecessary data transfer Expandibility –easy to add new nodes (not always linear scale up due to central directory) Local autonomy –local control –local policies
14
Kjell Orsborn 2015-11-13 14 UU - DIS - UDBL Problems with Distributed DBMSs... Complexity –database administration becomes more complex (such as recovery) –increased complexity of system design, implementation and maintenance Security –keep security in a network harder Networking a known problem Distributed administration –less control and more meetings Cost –hardware - software - development/maintenance
15
Kjell Orsborn 2015-11-13 15 UU - DIS - UDBL Problems with Distributed DBMSs... Distributed schema management –schema is accessed whenever SQL query issued! –global directory => Central Database becomes hot spot –local directories => Data replication –=> Since schema is not updated often but need to be accessed very often it is normally fully replicated by the DDBMS. Distributed concurrency control –consistency of replicas: mutual consistency Distributed deadlock management Reliability of DDBMS –consistency of replicas –bring up (fragmented) database at failed sites OS Support –multiple layers of network software
16
Kjell Orsborn 2015-11-13 16 UU - DIS - UDBL Additional functionality required by DDBMS Access of physically divided databases - schema management Handling of distribution and replication of data –which copy of data should for example be used Handling of consistency of replicated data Handling of distributed queries Handling of distributed transactions (over several network nodes) Handling of recovery/restart from crashes (of nodes) and new types of errors such as communication errrors/failures.
17
Kjell Orsborn 2015-11-13 17 UU - DIS - UDBL Distributed database design Goal: –to minimize the combined cost of maintaining data, recieve efficient communication and good performance for transactions. Problems: –where (on which node/nodes) shall data and applications be placed –partitioning of data (split data into distributed partitions) –replication of data (copies of data on several nodes) –NP-complete optimization problem. –distributed query processing automatically done by distributed query processor of DDBMS analyze query --> distributed execution plan factors: –data replication –data availability –communication costs
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.