Published by Jewel Hensley. Modified over 9 years ago.
1 Distributed and Parallel Databases
2 Distributed Databases
Distributed Systems goal:
  – to offer local DB autonomy at geographically distributed locations
Multiple CPUs: each has a DBMS, but the data is distributed
Loosely coupled
  – homogeneous
  – heterogeneous: different DBMSs, need ODBC and standard SQL
3 Advantages of DDBs
Distributed nature of some DB applications (e.g. bank branches)
Increased reliability and availability if a site fails; data can also be replicated at more than one site
Data sharing, but also local control
Improved performance: smaller DBs exist at each site
Easier expansion
4 [Figure: architectures labeled (a) federated DB, (b) client-server, (c) distributed DB]
5 Client-Server
Client-Server: (b) in figure
  – Client sends request for service (strict, fixed roles)
  – 3-tier architecture
      Presentation tier
      Logic tier
      Data tier
6 Distributed DBSs (DDBS)
Distributed DB: (c) in figure
  – WAN
  – Multiple CPUs: each has a DBMS, but the data is distributed
  – Lower communication rates
  – Heterogeneous machines
  – Homogeneous DDBS: same DBMSs
  – Heterogeneous DDBS: different DBMSs, need ODBC and standard SQL
7 Heterogeneous distributed DBSs (HDDBs)
Data is distributed and each site has its own DBMS
ORACLE at one site, DB2 at another, etc.
Need ODBC, standard SQL
Usually a transaction manager is responsible for cooperation among sites
Must coordinate distributed transactions
Need data conversion and a way to access data at other sites
8 P2P
  – Every site can act as a server that stores part of the DB and as a client that requests service
9 Federated DB (FDBS)
A federated DB is a multidatabase that is autonomous: (a) in figure
A collection of cooperating DBSs that are heterogeneous
Preexisting DBs form a new database
Each DB specifies an import/export schema (view)
  – keeps a partial view of the total schema
Each DB has its own local users, local transparency and DBA
  – appears centralized for local autonomous users
  – appears distributed for global users
10 DDBS
Issues in DDBSs are covered in the slides that follow
11 Replication
Full vs. partial replication
Which copy to access
Improves performance for global queries, but updates are a problem
Must ensure consistency of the replicated copies of data
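The trade-off on this slide (cheap reads, costly updates) can be sketched with one simple policy, often called read-one/write-all. The slides do not name a policy, so this choice, the `ReplicatedItem` class, and the site names are assumptions made for illustration only.

```python
class ReplicatedItem:
    """One logical data item stored as identical copies at several sites (hypothetical)."""

    def __init__(self, sites, value):
        # copies maps site name -> the value stored locally at that site
        self.copies = {site: value for site in sites}

    def read(self, preferred_site):
        # A read needs only one copy, so the caller can pick a nearby site.
        return self.copies[preferred_site]

    def write(self, value):
        # A write must reach every copy; otherwise the copies diverge.
        for site in self.copies:
            self.copies[site] = value

    def is_consistent(self):
        # All copies hold the same value.
        return len(set(self.copies.values())) == 1


balance = ReplicatedItem(["site1", "site2", "site3"], 100)
balance.write(250)
local_read = balance.read("site2")
consistent = balance.is_consistent()
```

The sketch ignores site failures and concurrent writers, which is where keeping copies consistent becomes hard in a real DDBS.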
12 Data fragments
Can distribute a whole relation at a site, or
Data fragments: logical units of the DB assigned for storage at various sites
  – horizontal fragmentation: a subset of the tuples in the relation (select)
  – vertical fragmentation: keeps only certain attributes of the relation (project); each fragment needs the PK
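As a rough illustration of the two fragmentation styles, the sketch below treats a relation as a list of dictionaries. The `EMPLOYEE` rows, the attribute names, and both helper functions are invented for this example and are not taken from the slides.

```python
# Hypothetical relation: one dict per tuple, "ssn" plays the role of the PK.
EMPLOYEE = [
    {"ssn": 1, "name": "Ann", "salary": 4000, "dno": 4},
    {"ssn": 2, "name": "Bob", "salary": 6000, "dno": 4},
    {"ssn": 3, "name": "Cy", "salary": 5500, "dno": 5},
]


def horizontal_fragment(relation, condition):
    # Select: keep whole tuples that satisfy the fragment's condition.
    return [row for row in relation if condition(row)]


def vertical_fragment(relation, attributes, pk="ssn"):
    # Project: keep only some attributes, always including the PK so the
    # fragments can later be joined back together.
    columns = [pk] + [a for a in attributes if a != pk]
    return [{c: row[c] for c in columns} for row in relation]


dept4 = horizontal_fragment(EMPLOYEE, lambda r: r["dno"] == 4)
pay = vertical_fragment(EMPLOYEE, ["salary"])
```

Here `dept4` holds the complete tuples for department 4, while `pay` holds only the PK and salary of every tuple.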
13 Fragments cont’d
Horizontal fragments:
  – disjoint: each tuple is a member of only 1 fragment (example condition: salary < 5000 and dno = 4)
  – complete: a set of fragments whose conditions include every tuple
Complete vertical fragmentation:
  – L1 ∪ L2 ∪ ... ∪ Ln = attributes of R
  – Li ∩ Lj = PK(R) for i ≠ j
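The disjointness and completeness conditions can be checked mechanically. The sketch below is a minimal version under stated assumptions: the sample rows and function names are hypothetical, and the horizontal checks only test the tuples currently stored, not every tuple the conditions could ever admit.

```python
from itertools import combinations

# Hypothetical sample instance of the relation.
EMPLOYEE = [
    {"ssn": 1, "salary": 4000, "dno": 4},
    {"ssn": 2, "salary": 6000, "dno": 4},
    {"ssn": 3, "salary": 5500, "dno": 5},
]


def matches(row, conditions):
    # Number of fragment conditions this tuple satisfies.
    return sum(1 for cond in conditions if cond(row))


def is_disjoint(relation, conditions):
    # Every tuple belongs to at most one horizontal fragment.
    return all(matches(row, conditions) <= 1 for row in relation)


def is_complete(relation, conditions):
    # Every tuple belongs to at least one horizontal fragment.
    return all(matches(row, conditions) >= 1 for row in relation)


def vertical_is_complete(attributes, pk, attribute_lists):
    # L1 U ... U Ln must equal the attributes of R, and any two lists
    # may share only the primary key.
    covers = set().union(*attribute_lists) == set(attributes)
    overlap_is_pk = all(
        set(a) & set(b) == set(pk) for a, b in combinations(attribute_lists, 2)
    )
    return covers and overlap_is_pk


good_split = [
    lambda r: r["dno"] == 4 and r["salary"] < 5000,
    lambda r: r["dno"] == 4 and r["salary"] >= 5000,
    lambda r: r["dno"] != 4,
]
bad_split = [lambda r: r["salary"] < 5000, lambda r: r["dno"] == 4]
```

With these rows, `good_split` passes both checks, while `bad_split` fails both: tuple 1 satisfies two conditions and tuple 3 satisfies none.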
14 Example replication/fragmentation
Example of fragments for company DB:
  site 1: company headquarters, gets the entire DB
  sites 2, 3: horizontal fragments based on dept. no.
16 Increased complexity
Additional functions needed:
global vs. local queries
keep track of data and replication
execution strategies if data is at more than one site
  – which copy to access
  – maintain consistency of copies
17 To process a query
Must use a data dictionary that includes info on data distribution among servers
Ensure atomicity
Parse user query
  – decomposed into independent site queries
  – each site query is sent to the appropriate server site
  – each site processes its local query and sends the result to the result site
  – result site combines the results of the subqueries
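The steps above can be sketched with in-memory SQLite databases standing in for server sites. Everything named here (`SITES`, `DATA_DICTIONARY`, `run_global_query`, the `employee` schema and rows) is a made-up illustration, and the sketch leaves out atomicity entirely.

```python
import sqlite3


def make_site(rows):
    # Each "site" is its own DBMS instance holding one horizontal fragment.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee (ssn INTEGER PRIMARY KEY, name TEXT, dno INTEGER)"
    )
    conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", rows)
    return conn


SITES = {
    "site2": make_site([(1, "Ann", 4), (2, "Bob", 4)]),
    "site3": make_site([(3, "Cy", 5)]),
}

# Data dictionary: which sites hold fragments of which relation.
DATA_DICTIONARY = {"employee": ["site2", "site3"]}


def run_global_query(table, site_sql, params=()):
    partial_results = []
    # Send the same site query to every site that stores a fragment.
    for site in DATA_DICTIONARY[table]:
        partial_results.extend(SITES[site].execute(site_sql, params).fetchall())
    # The result site combines the subquery results (here: a sorted union).
    return sorted(partial_results)


names = run_global_query(
    "employee", "SELECT name FROM employee WHERE dno >= ?", (4,)
)
```

A real system would also use the fragment conditions in the data dictionary to skip sites that cannot contribute any tuples.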
18 Architectures
Distributed Systems goal: to offer local DB autonomy at geographically distributed locations
versus
Parallel Systems goal: to construct a faster centralized computer
  – Improve performance through parallelization
  – Distribution of data governed by performance
  – Processing and I/O happen simultaneously
19 Parallel DBSs
Shared-memory multiprocessor
  – get N times as much work with N CPUs (ideally)
  – MIMD, SIMD: equal access to the same data, massively parallel
Parallel shared nothing
  – data is split among the sites, each site has its own CPU, work for transactions is divided, sites communicate over high-speed networks
LANs, homogeneous machines
CPU + memory: called a site
20 Query Parallelism
Decompose query into parts that can be executed in parallel at several sites
  – Intra-query parallelism
If shared nothing & horizontally fragmented:
    Select name, phone from account where age > 65
  – Decompose into K different queries
  – Result site accepts all and puts them together (order by, count)
What if there is a join and the table is fragmented?
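A minimal sketch of the select example, assuming K = 3 horizontal fragments of `account` and using worker threads as stand-ins for shared-nothing sites. The sample rows and the names `FRAGMENTS`, `site_query`, and `parallel_select` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical horizontal fragments of the account relation, one per site.
FRAGMENTS = [
    [
        {"name": "Ann", "phone": "555-0101", "age": 70},
        {"name": "Bob", "phone": "555-0102", "age": 40},
    ],
    [{"name": "Cy", "phone": "555-0103", "age": 66}],
    [{"name": "Di", "phone": "555-0104", "age": 30}],
]


def site_query(fragment):
    # Local version of: select name, phone from account where age > 65
    return [(r["name"], r["phone"]) for r in fragment if r["age"] > 65]


def parallel_select(fragments):
    # One task per fragment; each task touches only its own data.
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        partials = list(pool.map(site_query, fragments))
    # Result site: merge the partial results, then apply order by and count.
    merged = sorted(row for part in partials for row in part)
    total = sum(len(part) for part in partials)
    return merged, total


rows, count = parallel_select(FRAGMENTS)
```

The selection parallelizes cleanly because each tuple can be tested on its own; the ordering and the count still need a final step at the result site.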
21 Other issues
Distributed concurrency control using locking
New models
  – Cloud computing