Published by John Montgomery. Modified over 10 years ago.
Distributed databases 3
Term 2, 2004, Lecture 9, Distributed Databases
Marian Ursu, Department of Computing, Goldsmiths College
Outline
  1. generalities
  2. objectives
  3. problems
1. Generalities
Introduction
  [Figure: sites connected by a communication network. Labels: server, application, server, DBMS in its own right]
Introduction
  - distributed database = collection of connected sites
  - each site is a DB in its own right (1)
      - has its own DBMS and its own users
      - operations can be performed locally as if the DB were not distributed
  - the sites collaborate (transparently from the user's point of view)
  - the union of all DBs = the DB of the whole organisation (institution) (as opposed to (1))
  - physical or logical distribution
  - strict homogeneity (assumption)
Motivation
  - advantages
      - matches the structure of the organisation
          - example
      - efficiency of processing
          - data is stored close to where it is used
      - increased accessibility
          - remote DBs can be accessed
  - disadvantage
      - complexity
Implementations (systems)
  - commercial
      - ORACLE (Oracle Corporation)
      - INGRES/STAR (Ask Group Inc., Ingres Division)
      - DB2 (IBM)
  - they all provide some features for distributed databases
Fundamental principle
  - to the user, a distributed DB system should look exactly like a non-distributed DB system
2. Objectives
Objectives
  - local autonomy
  - no reliance on a central site
  - location independence
  - fragmentation independence
  - replication independence
  - distributed query processing
  - distributed transaction management
Objectives are:
  - not independent of each other
  - not exhaustive
  - sometimes contradictory
  - of different degrees of importance (for the user)
Local autonomy
  - all operations at a given site are fully controlled by that site
      - not achievable (why?)
      - therefore, autonomy should be achieved to the maximum extent possible
  - local data is locally owned and managed
      - local data belongs to the local server even if it is accessible from other servers
      - security, integrity, ... are the responsibility of the local server
No reliance on a central site
  - reasons
      - bottleneck
      - vulnerability
  - conclusion
      - all sites must be equal
Location independence
  - users should not have to know where data is physically stored
  - why do you think this is needed?
      - think of application programs
  - what does this objective look like?
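One way to picture this objective is a program that names only the relation and lets a catalogue resolve the site. The sketch below is illustrative only: CATALOGUE, SITES, read_relation and move_relation are hypothetical names, and plain dictionaries stand in for real sites and their DBMSs.

```python
# Hypothetical catalogue: logical relation name -> site that stores it.
CATALOGUE = {"Emp": "London", "Dept": "Leeds"}

# Hypothetical per-site storage; in a real system each site runs its own DBMS.
SITES = {
    "London": {"Emp": [{"emp_id": 1, "name": "Ann"}]},
    "Leeds": {"Dept": [{"dept_id": "Dev"}]},
}


def read_relation(name):
    """Return the rows of a relation; the caller never names a site."""
    site = CATALOGUE[name]
    return SITES[site][name]


def move_relation(name, new_site):
    """Relocate a relation; only the storage and the catalogue change."""
    old_site = CATALOGUE[name]
    SITES[new_site][name] = SITES[old_site].pop(name)
    CATALOGUE[name] = new_site


before = read_relation("Emp")
move_relation("Emp", "Leeds")
after = read_relation("Emp")  # same call, same rows, different site
```

The application code (the two read_relation calls) is unchanged by the move, which is the point of location independence for application programs.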
Data fragmentation
  - data fragmentation
      - a system supports data fragmentation if a relation can be divided into fragments for storage purposes
  - motivation: performance
      - data is stored where it is most frequently used
  - definition
      - fragment = any subrelation derivable via restriction or projection
Data fragmentation - example
    FRAGMENT Emp INTO
        Lo_Emp AT SITE London WHERE Dept_id = 'Sales'
        Le_Emp AT SITE Leeds WHERE Dept_id = 'Dev' ;
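A minimal sketch of the same idea in Python, assuming a small made-up Emp relation held as a list of dictionaries: each fragment is a restriction of Emp, and the original relation is recovered as the union of the fragments. The sample rows and the helper names restrict and union are assumptions for illustration, not part of the lecture.

```python
# Made-up sample rows for the Emp relation.
EMP = [
    {"emp_id": 1, "name": "Ann", "dept_id": "Sales"},
    {"emp_id": 2, "name": "Bob", "dept_id": "Dev"},
    {"emp_id": 3, "name": "Cy", "dept_id": "Sales"},
]


def restrict(relation, predicate):
    """Relational restriction: keep the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


# The two fragments from the slide, defined by restriction on dept_id.
lo_emp = restrict(EMP, lambda r: r["dept_id"] == "Sales")  # stored in London
le_emp = restrict(EMP, lambda r: r["dept_id"] == "Dev")    # stored in Leeds


def union(*fragments):
    """Union of fragments, deduplicated on emp_id and sorted by emp_id."""
    rows = {}
    for fragment in fragments:
        for row in fragment:
            rows[row["emp_id"]] = row
    return [rows[key] for key in sorted(rows)]


reconstructed = union(lo_emp, le_emp)
```

The reconstruction is complete only because every sample row satisfies one of the two predicates; a row with any other dept_id would belong to no fragment.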
Fragmentation independence / transparency
  - users should perceive data as if it were not fragmented
      - why?
  - it is the optimiser's responsibility to determine which fragments need to be physically accessed
  - similar to views
      - retrieving
      - updating (JOIN and UNION views)
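The optimiser's task of deciding which fragments to touch can be sketched as follows, assuming each fragment is described by the single Dept_id value that defines it. FRAGMENTS and fragments_to_access are hypothetical names; a real optimiser handles arbitrary predicates.

```python
# Hypothetical fragment descriptions: defining dept_id value and storage site.
FRAGMENTS = {
    "Lo_Emp": {"site": "London", "dept_id": "Sales"},
    "Le_Emp": {"site": "Leeds", "dept_id": "Dev"},
}


def fragments_to_access(dept_id=None):
    """Return the names of fragments that may hold matching rows.

    dept_id=None means the query does not restrict Dept_id,
    so every fragment has to be read.
    """
    return sorted(
        name
        for name, fragment in FRAGMENTS.items()
        if dept_id is None or fragment["dept_id"] == dept_id
    )
```

A query restricted to Dept_id = 'Dev' needs only Le_Emp, so nothing has to be fetched from London; the user still writes the query against Emp.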
Data replication
  - copies of the same fragment can exist at different sites
  - reasons
      - better availability
      - better performance
  - disadvantage
      - update propagation
Replication independence / transparency
  - users should not have to be aware of data replication
  - it is the optimiser's responsibility to choose which replica to use
  - commercial systems
      - no full support for replication independence (update problems) - primary copy
Distributed query processing
  - the system must have set-level operators
      - one record at a time - too many messages (traffic)
      - relational - indicated
  - optimisation is particularly relevant!
      - find the best way to move data across the network
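The traffic argument can be made concrete with a deliberately simple message count. The model is an assumption for illustration: every remote interaction costs one request and one reply, and message size is ignored.

```python
def messages_record_at_a_time(matching_rows):
    """One request and one reply for each record fetched from the remote site."""
    return 2 * matching_rows


def messages_set_level():
    """One request carrying the whole query, one reply carrying the whole result."""
    return 2


rows = 1000
record_cost = messages_record_at_a_time(rows)
set_cost = messages_set_level()
```

Under this model the record-at-a-time count grows linearly with the result size, while the set-level count stays constant, which is why a relational (set-level) interface is indicated.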
3. Problems
Problems
  - occur due to network utilisation
  - aim: minimise network utilisation
  - problem areas
      - query processing
      - catalogue management
      - update propagation
      - recovery control
      - concurrency control
Query processing
  - in a distributed environment
      - query execution is distributed
      - query optimisation is distributed
          - global optimisation
          - local optimisation
  - example
      - query on relation R issued at site X
      - part of R, say R_y, stored at Y
      - part of R, say R_z, stored at Z
      - where is the query going to be executed?
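One way to reason about the closing question is to compare the data each strategy moves across the network. The sketch below assumes a crude cost model (bytes transferred only) and made-up sizes; the function names and all numbers are hypothetical.

```python
def cost_ship_fragments(rows_y, rows_z, row_bytes):
    """Move R_y and R_z whole to site X and evaluate the query there."""
    return (rows_y + rows_z) * row_bytes


def cost_ship_results(matching_y, matching_z, row_bytes, query_bytes):
    """Send the query to Y and Z, evaluate locally, return only matching rows."""
    return 2 * query_bytes + (matching_y + matching_z) * row_bytes


def choose_strategy(rows_y, rows_z, matching_y, matching_z, row_bytes, query_bytes):
    """Return the cheaper strategy name together with both estimated costs."""
    costs = {
        "ship fragments to X": cost_ship_fragments(rows_y, rows_z, row_bytes),
        "execute at Y and Z": cost_ship_results(
            matching_y, matching_z, row_bytes, query_bytes
        ),
    }
    best = min(costs, key=costs.get)
    return best, costs


best, costs = choose_strategy(
    rows_y=10_000, rows_z=5_000,   # fragment sizes (made up)
    matching_y=100, matching_z=50, # rows satisfying the query (made up)
    row_bytes=100, query_bytes=200,
)
```

With these numbers, executing at Y and Z transfers 15,400 bytes against 1,500,000 for shipping the fragments. A global optimiser makes this kind of comparison, while each site still optimises its own local part.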
Catalogue management
  - what other data does the catalogue include?
      - fragmentation, replication, ...
  - where should the catalogue be stored?
      - centralised
      - fully replicated
          - loss of autonomy - update propagation!
      - partitioned
          - non-local operations - very expensive!
      - combination of first and third
Central catalogue
  - all updates, including local updates, have to be recorded in the central catalogue
  - disadvantages
      - bottleneck
      - conflicts with the "no reliance on a central site" objective
Fully replicated catalogue
  - the entire database catalogue (not only the local one) is stored at each site
  - every time an update is made, it has to be recorded at each site
  - disadvantages
      - loss of local autonomy
      - updates consume time and network traffic
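The trade-off between the catalogue placements can be sketched by counting remote round trips. The model is an assumption for illustration: n equal sites, a lookup of an object that is not local to the asking site, and a worst case for the partitioned scheme in which every other site has to be asked.

```python
def catalogue_costs(n_sites):
    """Remote round trips per catalogue operation for each placement.

    Assumed model: the operation is issued at a non-central site and
    concerns an object that is not described in that site's own catalogue.
    """
    return {
        # One trip to the central site for every lookup and every update.
        "centralised": {"lookup": 1, "update": 1},
        # Lookups are local; every update must reach all other sites.
        "fully replicated": {"lookup": 0, "update": n_sites - 1},
        # Updates are local; a non-local lookup may have to ask every other site.
        "partitioned": {"lookup": n_sites - 1, "update": 0},
    }


costs = catalogue_costs(5)
```

The counts mirror the slides: full replication makes updates expensive, partitioning makes non-local lookups expensive, and the centralised scheme is cheap per operation but funnels everything through one site.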
Update propagation
  - problems
      - because of replication, data might become less available
  - primary copy scheme
      - one copy is designated the primary copy (unique)
      - primary copies exist at different sites (distributed)
      - an update is logically complete once the primary copy has been updated
      - the site holding the primary copy then has to propagate the update
      - violation of local autonomy
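A minimal sketch of the primary copy scheme, assuming a single replicated item and a caller that says which sites are currently reachable. The class ReplicatedItem and its methods are hypothetical names for illustration, not a description of any commercial system.

```python
class ReplicatedItem:
    """One data item with a copy at each site and one designated primary copy."""

    def __init__(self, sites, primary, value):
        self.primary = primary
        self.copies = {site: value for site in sites}
        self.pending = []  # (site, value) updates not yet propagated

    def update(self, value):
        """The update is logically complete once the primary copy is written."""
        self.copies[self.primary] = value
        for site in self.copies:
            if site != self.primary:
                self.pending.append((site, value))

    def propagate(self, reachable):
        """The primary's site pushes pending updates to the reachable sites."""
        still_pending = []
        for site, value in self.pending:
            if site in reachable:
                self.copies[site] = value
            else:
                still_pending.append((site, value))
        self.pending = still_pending


item = ReplicatedItem(["London", "Leeds", "Paris"], primary="London", value=10)
item.update(20)              # succeeds even though Paris is unreachable
item.propagate({"Leeds"})    # Paris stays stale until it can be reached
```

The update does not have to wait for an unreachable site, which addresses the availability problem. The cost is that a site cannot update its own copy without going through the primary copy's site, which is the autonomy violation noted on the slide.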
Concurrency control
  - locking
      - overhead - increased number of messages
  - primary copy strategy
      - locking only the primary copy
      - the primary copy's site will propagate the update
      - loss of autonomy (severe)
  - global deadlock
      - two interlocked sites (each waiting for the other)
      - cannot be detected using a site's local wait-for graph - therefore, communication overhead
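The global deadlock case can be illustrated with two hypothetical transactions, T1 and T2, and one wait-for graph per site. In this sketch each local graph is acyclic, and the cycle only appears when the edges from both sites are combined, which requires the sites to exchange information. The function has_cycle and the sample edges are assumptions for illustration.

```python
def has_cycle(edges):
    """Detect a cycle in a wait-for graph given as (waiter, holder) pairs."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)
        graph.setdefault(holder, set())

    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a node already on the current path: cycle
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph[node])
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in graph)


# At London, T1 waits for a lock held by T2; at Leeds, T2 waits for T1.
london_edges = [("T1", "T2")]
leeds_edges = [("T2", "T1")]

local_london = has_cycle(london_edges)
local_leeds = has_cycle(leeds_edges)
global_deadlock = has_cycle(london_edges + leeds_edges)
```

Neither site sees a deadlock on its own; only the combined graph reveals it, and building that combined graph is the communication overhead the slide refers to.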
Conclusion
  - generalities
  - objectives - in brief
  - problems - in brief