Distributed Database Management Systems
Reading
  Textbook: Ch. 1, Ch. 3
  For next class: Ch. 4
Farkas, CSCE 824
Database Management System (DBMS)
  Collection of
    – interrelated data and
    – set of programs to access the data
  Convenient and efficient processing of data
  Database Application Software
Abstraction
  View level: different perspectives
    – application programs hide irrelevant data
  Logical level: data models
    – Logical representation of data
    – Different approaches: relational, hierarchical, network, object-oriented, semi-structured, etc.
    – Data independence principle
  Physical level: how data is stored
Motivation for DBMS
  Integrate related data
  Provide centralized and controlled access to data
Computer Network
  Distributed processing:
    – A number of autonomous processing elements that are interconnected by a computer network
    – They cooperate to perform their assigned tasks
What to distribute?
  Processing logic/element
  Functions
  Data
  Control of execution
Why distribute?
  Intuition
  Reliability
  Performance
Distributed Database Systems
  Distributed database:
    – Collection of multiple, logically interrelated databases that are distributed over a computer network
  Distributed DBMS: software system that
    – Permits the management of the distributed database and
    – Makes the distribution transparent to the user
Data Delivery
  Data storage and query processing
  Data delivery:
    – Delivery mode: push, pull, hybrid
    – Frequency: periodic, conditional, ad-hoc, irregular
    – Communication method: unicast, one-to-many
DDBMS Services
  Transparent data management
    – Distributed, replicated data
    – Transparency: network, replica, fragmentation
  Reliable access to data
    – Distributed transactions
    – Failure atomicity
  Improved performance
  Flexible expansion
Difficulties
  Everything that is present in traditional DBs
  Fragmentation and replica control
    – Data retrieval
    – Data update
  Dealing with failures
  Synchronization
DDBMS Issues
  Database design
  Directory management
  Query processing
  Concurrency control
  Deadlock management
  Reliability
  Replication
DDBMS Architecture
  Chapter 1.7 (read only)
    – Client/server
    – P2P
    – Multi-database
DISTRIBUTED DATABASE DESIGN
Design Issues
  Placing of data and programs (DBMS and application)
  Network issues
Level of Sharing
  No sharing
  Data sharing
  Data and program sharing
  Heterogeneous environment!
Access Pattern
  Static
  Dynamic
Level of Knowledge on Access Behavior
  Complete information
  Partial information
Top-Down Design
  Figure 3.2
Fragmentation
  Why fragment the data?
    – Application views
    – Limit replication while increasing availability
    – Increased concurrency
Fragmentation
  Types:
    – Horizontal
    – Vertical
    – Hybrid
  Degree:
    – From no fragmentation to individual tuples/attributes
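Correctness of Fragmentation
  1. Completeness: for $F_R = \{R_1, \ldots, R_n\}$, every data item in $R$ appears in at least one $R_i$
  2. Reconstruction: $R = \nabla R_i, \ \forall R_i \in F_R$, for some relational operator $\nabla$
  3. Disjointness:
    – Horizontal: there is no $d_j \in R_i$ such that $d_j \in R_k$, where $k \neq i$
    – Vertical: same as horizontal for non-primary-key attributes
  1 & 2: Lossless-join (normalization)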
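The three correctness rules can be checked mechanically for horizontal fragments. The following is a minimal sketch, assuming a relation and its fragments are modelled as Python sets of tuples and that reconstruction uses union; the function name and the EMP data are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def check_horizontal_fragmentation(relation, fragments):
    """Check the three correctness rules for horizontal fragments.

    relation: set of tuples (the relation R)
    fragments: list of sets of tuples (the fragments R_1..R_n)
    Assumption: the reconstruction operator is set union.
    """
    union = set().union(*fragments)
    return {
        # Every tuple of R appears in at least one fragment.
        "completeness": relation <= union,
        # The union of the fragments gives back exactly R.
        "reconstruction": union == relation,
        # No tuple appears in two different fragments.
        "disjointness": all(
            a.isdisjoint(b) for a, b in combinations(fragments, 2)
        ),
    }


# Hypothetical toy relation EMP(eno, name, city).
EMP = {(1, "Ann", "NYC"), (2, "Bob", "LA"), (3, "Eve", "NYC")}

good = [
    {t for t in EMP if t[2] == "NYC"},
    {t for t in EMP if t[2] == "LA"},
]
# Tuple 3 is missing and tuple 2 is stored twice.
bad = [{(1, "Ann", "NYC"), (2, "Bob", "LA")}, {(2, "Bob", "LA")}]

good_result = check_horizontal_fragmentation(EMP, good)
bad_result = check_horizontal_fragmentation(EMP, bad)
print(good_result)
print(bad_result)
```

The second fragmentation fails all three checks: one tuple is lost, so the union cannot rebuild EMP, and one tuple is duplicated across fragments.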
Allocation
  Replication or single copy?
    – Read-only transactions
  Issues: (Figure 3.6)
    – Query processing
    – Directory management
    – Concurrency control
    – Reliability
  Real world applications
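The replication versus single copy trade-off can be illustrated with a simple count of remote accesses. This is a rough sketch, assuming a read-one/write-all policy (a read uses a local copy when one exists, a write must reach every copy); the function name, the site names, and both workloads are hypothetical, and a real allocation model would also weigh storage, bandwidth, and concurrency control costs.

```python
def remote_accesses(workload, copies):
    """Count remote accesses for one fragment under a given allocation.

    workload: list of (site, op) pairs, op is "r" (read) or "w" (write)
    copies: set of sites that hold a copy of the fragment
    Assumed policy: read one copy (local if possible), write all copies.
    """
    total = 0
    for site, op in workload:
        if op == "r":
            # A read is remote only if the issuing site has no copy.
            total += 0 if site in copies else 1
        else:
            # A write must reach every copy held at another site.
            total += len(copies - {site})
    return total


read_heavy = [("A", "r")] * 4 + [("B", "r")] * 4 + [("C", "w")]
update_heavy = [("A", "w")] * 4 + [("B", "r")]

single_copy = {"A"}
full_replication = {"A", "B", "C"}

for name, workload in [("read-heavy", read_heavy), ("update-heavy", update_heavy)]:
    print(
        name,
        "single copy:", remote_accesses(workload, single_copy),
        "full replication:", remote_accesses(workload, full_replication),
    )
```

Under these assumptions replication pays off for the read-heavy workload and hurts the update-heavy one, which is why read-only transactions are the natural case for replication.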
Fragmentation Design
  Information need:
    – Database information
    – Application information
    – Communication network information
    – Computer system information
Horizontal Fragmentation
  Primary horizontal fragmentation: defined by a selection operation on the relations of a database schema, $R_i = \sigma_{F_i}(R)$
  Correctness:
    – Completeness
    – Reconstruction (union)
    – Disjointness
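Primary horizontal fragmentation can be sketched as one selection per predicate. This is an illustrative example only, assuming rows are Python dicts; the PROJ relation, its attributes, and the budget threshold are hypothetical.

```python
def select(relation, predicate):
    """Selection: keep the rows of the relation that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# Hypothetical relation PROJ(pno, budget, loc).
PROJ = [
    {"pno": "P1", "budget": 120000, "loc": "Columbia"},
    {"pno": "P2", "budget": 310000, "loc": "Atlanta"},
    {"pno": "P3", "budget": 250000, "loc": "Columbia"},
]

# One selection predicate F_i per fragment; together they cover every
# row and no row satisfies both.
predicates = {
    "PROJ1": lambda r: r["budget"] <= 200000,
    "PROJ2": lambda r: r["budget"] > 200000,
}

fragments = {name: select(PROJ, f) for name, f in predicates.items()}

# Reconstruction by union: concatenate the fragments and restore the order.
rebuilt = sorted(
    (row for frag in fragments.values() for row in frag),
    key=lambda r: r["pno"],
)

for name, frag in fragments.items():
    print(name, [row["pno"] for row in frag])
print("reconstructed:", rebuilt == PROJ)
```

Because the two predicates are mutually exclusive and jointly exhaustive, completeness and disjointness hold by construction.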
Vertical Fragmentation
  $R$ is fragmented into $F_R = \{R_1, \ldots, R_n\}$, where each $R_i$ ($i = 1, \ldots, n$) contains a primary key and some of the attributes in $R$
  More difficult than horizontal fragmentation: heuristics
    – Grouping
    – Splitting
Vertical Fragmentation
  Correctness:
    – Completeness
    – Reconstruction (join)
    – Disjointness
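The correctness rules for vertical fragments can be illustrated with projection and a join on the primary key. This is a minimal sketch, assuming rows are Python dicts and the key is a single attribute; the EMP relation and the helper names are hypothetical.

```python
def project(relation, attrs):
    """Projection: keep only the listed attributes of every row."""
    return [{a: row[a] for a in attrs} for row in relation]


def join_on_key(left, right, key):
    """Join two fragments on their shared primary-key attribute."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Hypothetical relation EMP(eno, name, dept, salary) with primary key eno.
EMP = [
    {"eno": 1, "name": "Ann", "dept": "R&D", "salary": 50000},
    {"eno": 2, "name": "Bob", "dept": "Sales", "salary": 42000},
]

attrs1 = ["eno", "name", "dept"]
attrs2 = ["eno", "salary"]
EMP1 = project(EMP, attrs1)
EMP2 = project(EMP, attrs2)

# Completeness: every attribute of EMP is in some fragment.
complete = set(attrs1) | set(attrs2) == set(EMP[0])
# Disjointness: the fragments share only the primary key.
disjoint = set(attrs1) & set(attrs2) == {"eno"}
# Reconstruction: joining on the key gives back EMP.
rebuilt = join_on_key(EMP1, EMP2, "eno")

print(complete, disjoint, rebuilt == EMP)
```

The primary key is repeated in every fragment on purpose: without it the join could not match the pieces of each tuple.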
Hybrid Fragmentation
  Horizontal or vertical fragmentation alone is not sufficient to satisfy the user application requirements
  Nested or mixed fragmentation
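Nested fragmentation can be sketched as a horizontal split followed by a vertical split of one of the resulting fragments. This is an illustrative example under the assumption that rows are Python dicts with a single-attribute key; the EMP relation, the fragment names, and the helpers are hypothetical.

```python
def select(relation, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """Projection: keep only the listed attributes of every row."""
    return [{a: row[a] for a in attrs} for row in relation]


def join_on_key(left, right, key):
    """Join two vertical fragments on their shared primary key."""
    index = {row[key]: row for row in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]


# Hypothetical relation EMP(eno, name, dept, salary) with primary key eno.
EMP = [
    {"eno": 1, "name": "Ann", "dept": "R&D", "salary": 50000},
    {"eno": 2, "name": "Bob", "dept": "Sales", "salary": 42000},
    {"eno": 3, "name": "Eve", "dept": "R&D", "salary": 61000},
]

# Step 1: horizontal fragmentation by department.
emp_rd = select(EMP, lambda r: r["dept"] == "R&D")
emp_other = select(EMP, lambda r: r["dept"] != "R&D")

# Step 2: vertical fragmentation of the R&D fragment only.
emp_rd_info = project(emp_rd, ["eno", "name", "dept"])
emp_rd_pay = project(emp_rd, ["eno", "salary"])

# Reconstruction runs in reverse: join the vertical pieces, then union.
rebuilt = join_on_key(emp_rd_info, emp_rd_pay, "eno") + emp_other
rebuilt.sort(key=lambda r: r["eno"])

print(rebuilt == EMP)
```

Reconstruction undoes the nesting from the inside out, so the order of operators matters: the join must be applied before the union.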
Data Directory
  Global vs. local conceptual schemas
    – How to search?
    – Where to store?
    – Single vs. multiple copies?
Current Research
  Allocation: new requirements, technology, etc.
  Where to store the fragments?
  Dynamic environment
    – Usage pattern
    – Application characteristics
    – Network changes
    – Security
Next Class
  Commit Protocols