Published by Roderick Poole. Modified over 9 years ago.
1
IMS 4212: Distributed Databases
Dr. Lawrence West, Management Dept., University of Central Florida (lwest@bus.ucf.edu)

Distributed Databases
• Business needs for distributed databases
• Introduction to distributed databases
• Subscriber / Publisher Model
• Snapshots
• Transactional Replication
• Merge Replication
• Dissimilar Databases
• Implementing Distributed DB
• Design Implications
• Advantages & Disadvantages
2
Business Needs for Distributed Databases
• The concept of a central database to handle all of the organization’s needs has several potential limitations
  – A geographically dispersed organization requires extensive database traffic
      • A large organization creates congestion at the server
      • Large volumes of data must be moved across the network
  – The entire organization can be vulnerable to a problem with a single server
  – Data communications interruptions can disrupt the entire organization’s operations
3
Business Needs for Distributed Databases (cont.)
• Central database limitations (cont.)
  – Dissimilar operating units create differing data access needs
      • Local units require autonomy over the design and implementation of DB systems
      • Information sharing across the organization still requires connectivity
      • Local unit DB designers will not be allowed to design against the entire DB
4
Business Needs for Distributed Databases (cont.)
• Central database limitations (cont.)
  – Mergers and acquisitions create ad-hoc integration of dissimilar DB systems
      • Different business units may have fully developed DBs and applications on dissimilar platforms, DBMS, etc.
      • The organization still requires information sharing for organizational effectiveness
      • Rewriting the whole system in a single DB is impractical (or may take time to implement)
5
Distributed Databases
• Distributed databases are characterized by decisions made regarding:
  – Distribution of data schema
      • All nodes share the same schema or not
  – Update rights on objects (especially table data)
  – Latency / concurrency requirements
  – Commonality of DBMS
6
Subscriber/Publisher Model
• A subscriber / publisher model is often used to describe database updates
• Nodes allowed to change data & objects are publishers
• Nodes needing to be aware of changes are subscribers
• Decisions are made on methods for making subscribers aware of changes and for getting changes to them
  – Near real time
  – On demand
  – Batch
  – On schedule
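The publisher and subscriber roles above can be sketched in a few lines of Python. This is only an in-memory illustration, not how any particular DBMS implements replication: the class names `Publisher` and `Subscriber`, the `mode` values, and the sample price data are all assumptions made for the example. It contrasts near-real-time delivery with batch delivery.

```python
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    """A node that needs to be aware of changes (hypothetical illustration)."""
    name: str
    mode: str = "near_real_time"   # assumed values: "near_real_time" or "batch"
    data: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)

    def notify(self, key, value):
        if self.mode == "near_real_time":
            self.data[key] = value              # apply the change immediately
        else:
            self.pending.append((key, value))   # hold it until the next batch run

    def apply_batch(self):
        for key, value in self.pending:
            self.data[key] = value
        self.pending.clear()


@dataclass
class Publisher:
    """A node allowed to change data; it pushes each change to its subscribers."""
    data: dict = field(default_factory=dict)
    subscribers: list = field(default_factory=list)

    def update(self, key, value):
        self.data[key] = value
        for sub in self.subscribers:
            sub.notify(key, value)


hq = Publisher()
store_a = Subscriber("store_a", mode="near_real_time")
store_b = Subscriber("store_b", mode="batch")
hq.subscribers += [store_a, store_b]

hq.update("widget_price", 9.99)
store_b_before = dict(store_b.data)   # store_b has not seen the change yet
store_b.apply_batch()                 # now it has
```

The only difference between the two subscribers is when a published change becomes visible locally, which is the decision the slide asks the designer to make.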
7
Snapshots
• Distribution of databases (except when connecting existing databases) usually starts with a snapshot of all or part of a DB
  – Copy of structures, data, stored procedures (SPs), triggers, etc.
• The snapshot is distributed to all nodes
  – Different nodes may receive different snapshots
8
A Scenario
• Corporate HQ is the central site
• Regional HQ or even ‘retail’ locations are remote sites
• Remote sites execute frequent transactions
• Q: What data is needed in each location for the organization’s business needs?
9
Transactional Replication
• In transactional replication, as each transaction is executed on any node, it is ‘published’ to all subscribing nodes, which also execute the transaction
• Data integrity rules are checked at each node
• Violation of a data integrity rule at any node can roll back the transaction at all nodes
• Data is kept relatively current at all nodes
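The "checked at each node, rolled back at all nodes" behaviour can be sketched as follows. This is a simplified in-memory illustration under stated assumptions, not SQL Server's mechanism: the `Node` class, the `replicate_transaction` helper, and the "balance may not go negative" rule are all hypothetical.

```python
class Node:
    """One database node holding account balances (hypothetical example data)."""

    def __init__(self, name, balances):
        self.name = name
        self.balances = dict(balances)

    def check(self, account, delta):
        # Integrity rule assumed for illustration: balances may not go negative.
        return self.balances.get(account, 0) + delta >= 0

    def apply(self, account, delta):
        self.balances[account] = self.balances.get(account, 0) + delta


def replicate_transaction(nodes, account, delta):
    """Apply the change on every node, or on none if any node rejects it."""
    applied = []
    for node in nodes:
        if not node.check(account, delta):
            # Undo the change on the nodes that already applied it.
            for done in applied:
                done.apply(account, -delta)
            return False
        node.apply(account, delta)
        applied.append(node)
    return True


hq = Node("hq", {"acct1": 100})
branch = Node("branch", {"acct1": 40})   # deliberately different, for illustration

first_ok = replicate_transaction([hq, branch], "acct1", -30)    # accepted everywhere
second_ok = replicate_transaction([hq, branch], "acct1", -50)   # rejected at branch
```

In the second call the change passes the rule at `hq` but fails at `branch`, so the update already made at `hq` is undone and both nodes keep their earlier balances.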
10
Transactional Replication (cont.)
• Application (“business”) needs control the urgency and frequency of updates
• Some data is read-only at some nodes
  – A price schedule might be set centrally and only read locally
  – Sales transactions are probably executed locally and rolled up centrally
11
Transactional Replication
• When is transactional replication appropriate?
  – Higher interaction between actions at nodes (easier to cause conflicts with out-of-date data)
  – Decision making requires updated information
  – Frequent changes can cause concurrency problems
  – Connectivity is not an issue
• Detected problems can result in near-real-time rollbacks
12
Merge Replication
• In merge replication, subscribers may receive a partition of the data
  – Certain rows
      • Only customers or employees in their region
  – Certain columns
      • Employee contact info but not salary info
• Subscribers may add, update, or delete rows to which they have write access
• Changes made at the subscribers are committed (published) in a batch and merged back into the central (publisher) DB
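A row-and-column partition like the one described above can be sketched in Python. The employee records, the column names, and the helper `make_partition` are made up for illustration; a real system would define the filter in the replication tool rather than in application code.

```python
# Hypothetical central employee table.
employees = [
    {"id": 1, "name": "Ana", "region": "East", "phone": "555-0101", "salary": 72000},
    {"id": 2, "name": "Ben", "region": "West", "phone": "555-0102", "salary": 65000},
    {"id": 3, "name": "Cy", "region": "East", "phone": "555-0103", "salary": 81000},
]


def make_partition(rows, row_filter, columns):
    """Return only the rows passing row_filter, keeping only the named columns."""
    return [{col: row[col] for col in columns} for row in rows if row_filter(row)]


# The East region's subscriber gets its own employees, with contact info but no salary.
east_partition = make_partition(
    employees,
    row_filter=lambda row: row["region"] == "East",
    columns=["id", "name", "phone"],
)
```

The row filter limits the subscriber to its own region, and the column list keeps salary data at the central site.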
13
Merge Replication (cont.)
• The system is able to detect when a remote site’s copy of the data has changed (including new records)
• Changed data is marked for updating in the central copy during the merge
14
Merge Replication (cont.)
• When is merge replication appropriate?
  – Few chances for node operations to create conflicts
      • Highly autonomous activities
      • Different lines of business
  – Infrequent changes requiring immediate awareness by all subscribers
  – Physical connectivity issues
• May create more complex problems when a conflict does occur
  – Rolling back already committed transactions
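The merge step, where changed rows are marked at each remote site and later folded into the central copy, can be sketched as below. This is an assumption-laden illustration: the `RemoteSite` class, the `merge` function, and the policy of leaving conflicting rows untouched for later resolution are all hypothetical, and real products offer configurable conflict resolvers.

```python
class RemoteSite:
    """A subscriber that starts from a snapshot and tracks its own changes."""

    def __init__(self, name, snapshot):
        self.name = name
        self.rows = dict(snapshot)
        self.changed = set()   # keys inserted or updated since the last merge

    def write(self, key, value):
        self.rows[key] = value
        self.changed.add(key)


def merge(central, sites):
    """Copy each site's marked rows into the central copy.

    Returns the keys changed at more than one site. Those rows are left
    untouched in the central copy so a person or policy can resolve them.
    """
    writers = {}
    for site in sites:
        for key in site.changed:
            writers.setdefault(key, []).append(site)
    conflicts = sorted(key for key, who in writers.items() if len(who) > 1)
    for key, who in writers.items():
        if len(who) == 1:
            central[key] = who[0].rows[key]
    for site in sites:
        site.changed &= set(conflicts)   # only unresolved rows stay marked
    return conflicts


central = {"c1": "Acme", "c2": "Bolt"}
east = RemoteSite("east", central)
west = RemoteSite("west", central)

east.write("c3", "Crane")        # new record at east
east.write("c1", "Acme Corp")    # both sites edit c1 ...
west.write("c1", "ACME Inc")     # ... which is a conflict
west.write("c4", "Dyno")         # new record at west

conflicts = merge(central, [east, west])
```

Here the two new records merge cleanly, while the row edited at both sites is reported as a conflict. Both edits were already committed locally, which is why resolving such conflicts is harder than rejecting a transaction up front.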
15
Dissimilar Databases
• Distributed DB nodes may be dissimilar on two dimensions
  – DB architecture (table structure, field data types/names, etc.)
  – DBMS and OS (may not even be relational data)
• “Messages” sent between nodes to inform them of updates must be translated somewhere
• Imposes new layers of complexity for connectivity
• SQL Server provides support for this process
• Many third-party products exist for logical integration
16
Implementing DB Distribution
• SQL Server comes with a wealth of distributed DB management tools
  – Specify publication schedules, rights, update frequencies, etc.
  – Manage conflicts when they occur and notify clients
  – Perform translations between DBMSs
  – Perform translations between structures
17
Design Implications
• Some DB designs may change when the DB is replicated
  – Relationships may not be enforced in remote nodes because matching parent rows may not exist
  – GUID attributes may be needed for PKs, since independently generated Identity attributes could conflict when rolled up
  – Triggers or constraints may be different
      • May violate locally but be OK globally
      • Vice versa
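The Identity-versus-GUID point can be demonstrated with a short Python sketch. It uses `itertools.count` to stand in for an auto-increment Identity column and the standard-library `uuid.uuid4` for GUIDs; the two "stores" and the number of orders are illustrative assumptions.

```python
import itertools
import uuid


def identity_keys():
    """Mimic an auto-increment identity column that starts at 1 on each node."""
    return itertools.count(1)


# Two remote nodes generate order keys independently of each other.
store_a_ids = identity_keys()
store_b_ids = identity_keys()
a_orders = [next(store_a_ids) for _ in range(3)]
b_orders = [next(store_b_ids) for _ in range(3)]

# Rolled up centrally, every identity key from store A collides with one from store B.
identity_collisions = set(a_orders) & set(b_orders)

# Random GUIDs generated independently are, for practical purposes, unique.
a_guids = [uuid.uuid4() for _ in range(3)]
b_guids = [uuid.uuid4() for _ in range(3)]
guid_collisions = set(a_guids) & set(b_guids)
```

Both stores issue keys 1, 2, and 3, so a central roll-up cannot use those values as a primary key without remapping, whereas the GUID sets do not overlap.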
18
Database Distribution Advantages & Tradeoffs
• Key advantages of distributed DB
  – Increased reliability
  – Local access and control
  – Modular growth
  – Lower communication costs
  – Faster response
• What are the mechanisms that give rise to these advantages?
19
Database Distribution Advantages & Tradeoffs (cont.)
• Disadvantages of distributed DB
  – Software cost & complexity
      • Keeping data current
      • Maintaining data integrity
      • Integrating multiple sites and applications
  – Processing overhead
  – Data integrity
  – Slow response from poor design
20
Distributed DBMS (cont.)
• A distributed DBMS attempts to achieve “Location Transparency”
  – The user or application does not need to know that the query is going to multiple nodes
  – The user has one integrated DB schema
  – The distributed DBMS performs all network operations
• It also seeks to achieve “Replication Transparency”
  – Replication operations are performed automatically
  – It manages multiple updates against different copies of replicated data
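Location transparency can be illustrated with a small Python sketch in which the caller queries one object and never names a node. The `DistributedTable` class, the node names, and the customer rows are hypothetical; a real distributed DBMS would also handle networking, failures, and query planning, none of which is modelled here.

```python
class DistributedTable:
    """Presents rows stored on several nodes as one table (illustration only)."""

    def __init__(self, nodes):
        self._nodes = nodes   # node name -> list of row dicts held at that node

    def select(self, predicate):
        # The caller never says which node to ask; every node is consulted here.
        matches = []
        for rows in self._nodes.values():
            matches.extend(row for row in rows if predicate(row))
        return sorted(matches, key=lambda row: row["id"])


customers = DistributedTable({
    "orlando": [{"id": 1, "name": "Acme", "state": "FL"}],
    "atlanta": [
        {"id": 2, "name": "Bolt", "state": "GA"},
        {"id": 3, "name": "Crane", "state": "FL"},
    ],
})

# One query against one schema; the result draws on rows from both nodes.
florida = customers.select(lambda row: row["state"] == "FL")
```

The application code would look the same if all rows lived on a single server, which is the property the slide calls location transparency.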