1
Information Resources Management
April 17, 2001
2
Agenda
- Administrivia
- Database Architectures
3
Administrivia
- Homework #8
4
Database Architectures
- Centralized
- Client-Server
- Parallel - single site
- Distributed - multiple sites
5
Database Architectures
[Diagram labels: Centralized, (Parallel), Distributed, Client-Server; axes: Function, Data]
6
Centralized
- PC, Mini, or Mainframe
- Single Database
- Single Database Manager
- One or More Users
- Data and Function in One Place
7
Client-Server
- PCs to Mainframes to Minis
- PC to PC
- Mainframe to Mainframe
- Use Desktop Processing Power
- Better User Interface
- Greater Functionality
- Retain Centralized Control of Data
8
Client-Server: Basic Model
[Diagram: Client sends Request to Server; Server returns Result]
9
Servers
- Supercomputer
- Mainframe
- Mini
- PC Server
- All retain all data
10
Client-Server Architecture
[Diagram labels: Data, Function, Server (Back-End), Client (Front-End), Thin Client, Fat Client]
11
Functionality
- Presentation
- I/O Processing
- Validation
- Business Rules
- Application Logic
- Data Management
- Validation
- Error Handling
12
“Thin” Client
- Presentation Services Only
- Accept Input
- Format Output
- Display
- Server does all processing
13
“Fat” Client
- Presentation
- Validation
- Application Logic - Programs
- Data Management
- Send SQL to Server
- Server is just DBMS
14
“In Between” Client
- Client
  - Presentation
  - Some Application Logic
- Server
  - Some Application Logic
  - Data Management and Services
15
Benefits of Client-Server
- Use Local Processing Power
- Better User Interface
- Some Functionality if System Down
- Use Sunk Costs of PCs
- Support Reengineering
- Support Intranets
- Flexibility, Scalability, Customizability
16
Challenges of Client-Server
- Cost of (Upgraded) PCs
- Network Reliance
- Distributing Application Updates
- Management of Complex System
- Problem Identification & Resolution
- Application Partitioning
17
Other Client-Server Architectures
- Traditional is Two-Tiered (client-server)
- Three-Tiered
  - Client - Application Server - DB Server
  - (PC - Mini - Mainframe)
  - (PC - PC Server - Mainframe)
- Beyond Three
  - PC - PC Server - Web Server - Mini - Mainframe
18
Client-Server vs. Distributed
- Client-Server: Application Distribution
- Distributed: Data Distribution
Often, “client-server” is used to refer to either application distribution or data distribution or both.
19
Middleware
- What if
  - Multiple databases (sources) need to be accessed from a single client?
  - Different kinds of clients?
  - Mix of clients and servers?
  - Want to take advantage of existing base of applications (legacy systems)?
20
Middleware
- Fat Clients just send SQL transactions
- Other types of transactions may be needed based on the server (system)
21
Middleware
Software that shields applications from the complexity of the operating environment.
[Diagram: Client - Middleware - System (Legacy), System (Legacy)]
22
Types of Middleware
- Transaction Processing (TP) Monitor
- Database Middleware
- Remote Procedure Call (RPC)
- Message-Oriented Middleware (MOM)
- Object Request Brokers (CORBA - ORB)
23
TP Monitor
- Synchronous - sender must wait
- Queuing
- Message Delivery
- Ensured Delivery
- Either Direction
24
Database Middleware
- Variety of Clients/Platforms
- Variety of Servers/DBMSs/Platforms
- Specific to DB transactions (SQL)
25
Message-Oriented Middleware (MOM)
- Asynchronous - clients do not wait
- Queues & Queue Management/Recovery
- Message Delivery
- Ensured Delivery
- Either Direction
(like email or EDI, only transactions)
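The asynchronous, queue-based behavior on the MOM slide can be pictured with a small sketch. This is only an illustration, not any particular middleware product: the `MessageQueue` class and the message fields are hypothetical names, and a real MOM would add persistence and recovery.

```python
from collections import deque


class MessageQueue:
    """Toy in-memory queue standing in for message-oriented middleware."""

    def __init__(self):
        self._pending = deque()

    def send(self, message):
        # The sender returns immediately; it does not wait for the receiver.
        self._pending.append(message)

    def receive(self):
        # The receiver picks up messages later, in arrival order.
        return self._pending.popleft() if self._pending else None


queue = MessageQueue()
queue.send({"op": "debit", "amount": 50})    # client continues right away
queue.send({"op": "credit", "amount": 50})

processed = []
while (message := queue.receive()) is not None:
    processed.append(message["op"])
```

By contrast, a synchronous TP monitor call would make the sender wait for each reply before issuing the next request.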
26
Advantages of Middleware
- Leverage sunk costs (legacy systems)
- Reduce development cost
- Reduce development time
- Increase responsiveness
- Improve overall systems management
- Consolidate diffuse information
27
Challenges of Middleware
- Cost
- Session management - Transaction state
- Security
- Network reliance
- Diversity of systems - lack of standards
- Constant technology change
- Availability of talent
- Middleware Management
28
Parallel and Distributed
- Client-Server is an attempt to improve performance
- Reduce time to execute a transaction
  - Parallel
- Reduce time to get the data
  - Distributed
29
Parallel Systems
- Single site for data
- Very Large databases
- Operations performed simultaneously
30
Parallel Database Architectures
- Shared Memory
- Shared Disk
- Shared Nothing
- Hierarchical
31
Shared Memory
[Diagram: three processors (P), one memory (M)]
32
Shared Memory
- Advantages
  - Extremely efficient communications
- Disadvantages
  - Max of 32/64 processors
  - Bus becomes bottleneck
33
Shared Disk
[Diagram: three processors (P), three memories (M)]
34
Shared Disk
- Advantages
  - No bus bottleneck
  - Fault tolerance provided
- Disadvantages
  - Disk access becomes bottleneck
35
Shared Nothing
[Diagram: three processors (P), three memories (M)]
36
Shared Nothing
- Advantages
  - No disk bottleneck
  - Highly scalable
- Disadvantages
  - High communication overhead/cost
    - Between processors
    - To another processor’s data
37
Hierarchical
[Diagram: five processors (P), three memories (M)]
38
Hierarchical
- Advantages
  - Best of all worlds
- Disadvantages
  - Worst of all worlds
  - Some high communication overhead/cost
    - Between subsystems
  - Complexity
39
Distributed Databases
- Client-Server - distribute functionality
- What about distributing data?
40
Distributed Databases
- Overview
- Distributed Storage
- Distributed Queries
- Distributed Transactions
- Multidatabase (Middleware)
41
Distributed Databases
- Multiple locations
- Single logical database
- Several physical databases
- Network connections
42
Advantages
- Sharing across locations
- Local control
- Availability
43
Challenges
- Development costs
- People & Equipment
- Testing
- Problem identification & resolution
- Technical expertise
- Network dependence
- Increased processing overhead
44
Distributed Data Storage
- Replication
- Fragmentation
- Both
45
Replication
- Data is repeated
- Spectrum of options available
  - Temporary replication of specific rows
  - Replicate infrequently changed data
  - Replicate by site
    - Central site - all / each local site - their data only
  - Full replication
    - Everything everywhere
46
Concerns with Replication
- Availability needed
- Amount of parallelism in reads
- Overhead of updates
- Keeping replicas updated
- Conflicting updates
47
Fragmentation
- Partitioning
- Divide data into subsets based on need
- Have to be able to pull back together to get original tables
48
Fragmentation
- Horizontal
  - by rows
  - specified conditions
- Vertical
  - by column
  - each requires primary key (or created key)
- Mixed
  - by row and column
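A small sketch may make horizontal and vertical fragmentation, and the requirement that fragments can be pulled back together, more concrete. The table contents, column names, and helper functions below are invented for illustration only.

```python
# Hypothetical employee table; every row carries the primary key "id".
employees = [
    {"id": 1, "name": "Ann", "site": "NY", "salary": 70000},
    {"id": 2, "name": "Bob", "site": "Pgh", "salary": 65000},
    {"id": 3, "name": "Cy", "site": "NY", "salary": 80000},
]


def horizontal_fragments(rows, column):
    """Split by rows: one fragment per distinct value of `column`."""
    fragments = {}
    for row in rows:
        fragments.setdefault(row[column], []).append(row)
    return fragments


def vertical_fragment(rows, key, columns):
    """Split by columns: each fragment keeps the primary key."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


def rejoin_vertical(left, right, key):
    """Rebuild full rows by joining two vertical fragments on the key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


by_site = horizontal_fragments(employees, "site")
public = vertical_fragment(employees, "id", ["name", "site"])
private = vertical_fragment(employees, "id", ["salary"])

# Horizontal fragments recombine by union; vertical ones by a key join.
rebuilt_h = sorted(
    (row for fragment in by_site.values() for row in fragment),
    key=lambda row: row["id"],
)
rebuilt_v = rejoin_vertical(public, private, "id")
```

The vertical fragments can only be rejoined because both keep `id`, which is why each vertical fragment requires the primary key (or a created key).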
49
Fragmentation & Replication
- Repeat as necessary:
  - Replicate fragments
  - Fragment replicas
- Don’t lose track of what you have and where it is!
50
Network Transparency
- Distributing data should not require that the user know where or how it’s been distributed.
- The database should be seen as a single entity no matter how fragmented and replicated it becomes.
51
Network Transparency
- Some DBMSs are starting to provide this level of functionality so transparency exists even at the program level, but in many cases this “transparency” must be programmed into the applications.
- It must always be designed into the database.
52
Distributed Queries
- How do you query data that is everywhere?
53
Efficiency vs. Overhead
- Splitting the query apart
- Keeping track of the data/locations
- Making sure everything gets executed
- Putting the results back together
- Generating network traffic
- Handling partial results
54
Distributed Queries
- Full replication can avoid the overhead
- Huge increase in update overhead
- Parallel execution no longer possible
- Additional costs of replication
55
Example
- 5 sites - NY, Pgh, Chicago, Dallas, Los Angeles
- Data fragmented by site - no replication
- Query (in Pgh): SELECT Name, MAX(Salary) FROM Employee
56
Option 1 - High Bandwidth
1. Have all sites send their full employee tables to Pgh.
2. Build a temporary employee table.
3. Run the query against this table.
57
Option 2 - Not so High Bandwidth
1. Examine the query and determine it can be run separately at each location and the results combined.
2. Submit just the query to each location.
3. Wait for the results from each city.
4. As results return, build a temporary table (5 rows only).
5. Find the max using the temporary table.
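The steps of Option 2 can be sketched roughly as follows. The employee names and salaries are made up for illustration, and the "sites" are plain in-memory lists rather than remote databases.

```python
# Hypothetical per-site fragments of Employee as (name, salary) rows.
site_employees = {
    "NY": [("Alvarez", 91000), ("Baker", 78000)],
    "Pgh": [("Chen", 84000), ("Dunn", 69000)],
    "Chicago": [("Evans", 88000)],
    "Dallas": [("Foster", 95000), ("Gupta", 73000)],
    "Los Angeles": [("Hill", 90000)],
}


def local_max(rows):
    """Run the query at one site: return that site's highest-paid row."""
    return max(rows, key=lambda row: row[1])


# Steps 2-4: each site answers with a single row, giving a 5-row temp table.
temp_table = [local_max(rows) for rows in site_employees.values()]

# Step 5: the final answer comes from the small temporary table.
top_name, top_salary = max(temp_table, key=lambda row: row[1])
```

Only five rows cross the network here, whereas Option 1 ships every employee row to Pgh.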
58
Distributed Transactions
- Transaction Types
- Coordinators
- Commit Protocols
- Concurrency Controls
- Deadlocks
59
Transaction Types
- Local - transaction only needs local data
- Global - transaction uses non-local data
- My global becomes someone else’s local
- Either type of transaction must still have ACID properties - global is the concern
60
System Structure
Things to do:
1. Process local transactions (transaction manager)
2. Process and track global transactions (transaction coordinator)
61
Global Processing
1. Recognize as global
2. Break up transaction
3. Distribute pieces
4. Assemble results
5. Coordinate termination
6. Handle problems
62
Coordinator of Coordinators
- Coordinate among sites
- Detect problems
- Attempt to fix
- Share status with others
63
Coordinator Failure
- Backup Coordinator
  - receives all messages - maintains state
  - monitors coordinator
  - automatically takes over if coordinator down
  - avoids delays - increases overhead
- Election
  - highest pre-assigned number
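The election rule (highest pre-assigned number) is simple enough to sketch. The function name and the way liveness is supplied are assumptions made for this example; the slide does not say how sites discover which others are still up.

```python
def elect_coordinator(site_numbers, live_sites):
    """Pick the live site with the highest pre-assigned number."""
    candidates = [number for number in site_numbers if number in live_sites]
    if not candidates:
        raise RuntimeError("no live site available to act as coordinator")
    return max(candidates)


# Site 5 was the coordinator and has failed; sites 1, 2, and 4 respond.
new_coordinator = elect_coordinator([1, 2, 3, 4, 5], live_sites={1, 2, 4})
```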
64
Commit Protocols
- Two-Phase
- Three-Phase
- All sites must commit or all sites have to rollback
- Replicated data only
65
Two-Phase Commit
- Phase 1
  - Send PREPARE to all sites
  - Sites respond READY or ABORT
- Phase 2
  - If all sites READY,
    - COMMIT locally - Send COMMITs
  - If not READY or time expires
    - ROLLBACK locally - Send ROLLBACK
66
Two-Phase Commit
[Diagram: Coordinator and Site] Site requests commit
67
Two-Phase Commit - Phase 1
[Diagram: Coordinator and Site] Send PREPARE - all sites
68
Two-Phase Commit - Phase 1
[Diagram: Coordinator and Site] Sites respond READY
69
Two-Phase Commit - Phase 2
[Diagram: Coordinator and Site] COMMIT locally
70
Two-Phase Commit - Phase 2
[Diagram: Coordinator and Site] Send COMMIT - all sites
71
Two-Phase Commit - Phase 1
[Diagram: Coordinator and Site] Site responds ABORT or does not respond
72
Two-Phase Commit - Phase 2
[Diagram: Coordinator and Site] ROLLBACK locally
73
Two-Phase Commit - Phase 2
[Diagram: Coordinator and Site] Send ROLLBACK - all sites
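The two-phase commit walkthrough above can be condensed into a minimal coordinator sketch. The `Site` class and its method names are hypothetical, a missing reply is modeled as `None` rather than a real timeout, and logging and failure recovery are left out.

```python
from enum import Enum


class Vote(Enum):
    READY = "READY"
    ABORT = "ABORT"


class Site:
    """Hypothetical participant that answers PREPARE with a fixed vote."""

    def __init__(self, name, vote):
        self.name = name
        self._vote = vote      # None models a site that never responds
        self.outcome = None

    def prepare(self):
        return self._vote

    def finish(self, decision):
        self.outcome = decision


def two_phase_commit(sites):
    # Phase 1: send PREPARE to all sites and collect the responses.
    votes = [site.prepare() for site in sites]
    # Phase 2: COMMIT only if every site answered READY; otherwise ROLLBACK.
    all_ready = all(vote is Vote.READY for vote in votes)
    decision = "COMMIT" if all_ready else "ROLLBACK"
    for site in sites:
        site.finish(decision)
    return decision


happy_path = [Site("NY", Vote.READY), Site("Pgh", Vote.READY)]
silent_site = [Site("NY", Vote.READY), Site("Dallas", None)]
happy_decision = two_phase_commit(happy_path)
silent_decision = two_phase_commit(silent_site)
```

A single ABORT or a missing response is enough to roll back everywhere, which is what gives the all-or-nothing outcome.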
74
Site Failure - Recovery
- COMMIT and ROLLBACK as normal
- If READY only
  - Check with coordinator or other sites
  - Either COMMIT or ROLLBACK
  - If no one found, ROLLBACK
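The recovery rules for a restarted site can be written as one small decision function. This is a sketch under assumptions: the log record values and the list of outcomes gathered from the coordinator or peers are hypothetical representations, and the fallback to ROLLBACK when nobody answers follows the slide as written.

```python
def recover_site(last_log_record, outcomes_from_others):
    """Decide what a restarted site does with an in-flight transaction.

    last_log_record: "COMMIT", "ROLLBACK", or "READY" (assumed log values).
    outcomes_from_others: outcomes reported by the coordinator or other
    sites that could be reached; empty if no one was found.
    """
    if last_log_record in ("COMMIT", "ROLLBACK"):
        # The decision was already logged: finish it as normal.
        return last_log_record
    if last_log_record == "READY":
        # The site voted but never learned the outcome: ask around.
        for outcome in outcomes_from_others:
            if outcome in ("COMMIT", "ROLLBACK"):
                return outcome
        return "ROLLBACK"  # no one found
    # Other log states are not covered by the slide.
    raise ValueError(f"unexpected log record: {last_log_record!r}")
```

For example, `recover_site("READY", ["COMMIT"])` returns `"COMMIT"`, while `recover_site("READY", [])` returns `"ROLLBACK"`.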
75
Coordinator Failure
- Ask the sites
  - If one has COMMIT, then REDO
  - If one has ROLLBACK, then UNDO
  - If one doesn’t have READY, UNDO
  - If all READY only
    - Coordinator must decide
    - Sites must wait and locks are held
    - “Blocking” occurs
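The rules for resolving a transaction after the coordinator fails can be sketched as a function over the sites' last known states. The state strings, including `"NONE"` for a site with no READY record, are assumed encodings for this example.

```python
def resolve_after_coordinator_failure(site_states):
    """Apply the slide's rules to the last logged state of each site.

    Each state is "COMMIT", "ROLLBACK", "READY", or "NONE" (no READY record).
    """
    if "COMMIT" in site_states:
        return "REDO"       # the decision was COMMIT; finish it everywhere
    if "ROLLBACK" in site_states:
        return "UNDO"       # the decision was ROLLBACK
    if any(state != "READY" for state in site_states):
        return "UNDO"       # some site never voted READY, so no commit happened
    return "BLOCKED"        # all READY only: wait for the coordinator


blocked = resolve_after_coordinator_failure(["READY", "READY", "READY"])
```

The `"BLOCKED"` case is the weakness of two-phase commit: every site holds its locks until the coordinator returns.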
76
Three-Phase Commit
- Phase 1
  - Send PREPARE
  - Sites respond READY or ABORT
- Phase 2
  - If all sites READY, send PRECOMMIT
  - Else, ROLLBACK
  - Sites must ACKNOWLEDGE
- Phase 3
  - If at least K sites ACKNOWLEDGE, send COMMIT
77
Coordinator Failure
- Three-Phase Commit prevents blocking
- If coordinator fails
  - New coordinator is selected
  - Sites queried to determine status
  - New coordinator resumes
78
Network Partitioning
- Network split creates two separate networks
- Each “half” selects a coordinator
- Coordinators make independent decisions
- Result could be different decisions
- Resolution of network problem may create need to resolve database problems
79
Concurrency Control
- Single Lock Manager
- Multiple Lock Managers
80
Single Lock Manager
- One site for all locking
- All other sites must go to it
- Can read from anywhere
- Updates must be to all copies
- Advantages: Simple, Easy deadlock detection
- Disadvantages: Bottleneck, Vulnerability
81
Simple Multiple Lock Mgrs
- Each site locks a unique partition of the data
  - non-replicated data
- Advantages: Fairly simple, reduced bottlenecks
- Disadvantages: Complicated deadlock detection
82
Majority Protocol
- Each site locks its own data
  - replication possible
- Request owner for lock on data that isn’t local
- When multiple owners, n/2 + 1 (majority) must provide the lock
- Advantages: No bottlenecks
- Disadvantages: More messages sent, Complicated deadlock detection, More deadlocks (each gets 1/2)
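The majority rule can be checked with a few lines of arithmetic. The function names are invented for this sketch, and n/2 is read as integer division, which is the usual interpretation of "majority".

```python
def majority_needed(replica_count):
    """Locks required when a data item has `replica_count` owners."""
    return replica_count // 2 + 1


def holds_majority(grants):
    """`grants` has one boolean per replica: True if that site granted the lock."""
    return sum(grants) >= majority_needed(len(grants))


# Two transactions race for a 4-way replicated item and each wins 2 sites.
transaction_a = [True, True, False, False]
transaction_b = [False, False, True, True]
a_locked = holds_majority(transaction_a)
b_locked = holds_majority(transaction_b)
```

Neither transaction reaches the 3 locks it needs, which is the "each gets 1/2" deadlock case named under the disadvantages.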
83
Biased Protocol
- Reduced form of Majority Protocol
- For a READ, only need any single lock
- For a WRITE, need all locks
- Advantages: No bottlenecks, Reduced traffic
- Disadvantages: Update traffic, Deadlocks
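The read-one, write-all rule of the biased protocol is easy to state in code. As in the other sketches, the function names are hypothetical and each replica's answer is reduced to a boolean.

```python
def can_read(grants):
    """A READ proceeds once any single replica grants its lock."""
    return any(grants)


def can_write(grants):
    """A WRITE proceeds only when every replica grants its lock."""
    return len(grants) > 0 and all(grants)


replica_grants = [True, False, True]
read_ok = can_read(replica_grants)
write_ok = can_write(replica_grants)
```

Reads get cheaper than under the majority protocol, while every write has to reach all copies.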
84
Primary Copy
- Site designated to hold “primary” copy
  - Multiple sites
  - Replicated Data
- All locks through that site
- Advantages: Fairly simple, reduced bottlenecks
- Disadvantages: Vulnerability, Complicated deadlock detection
85
Other Than Locking
- Timestamps
  - Centralized generation
  - Local generation
- Timestamp tests determine ability to read or write
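One common form of the timestamp test is the basic timestamp-ordering rule, sketched below on the assumption that this is what the slide refers to. Each data item remembers the largest timestamps that have read and written it; the `Item` class and function names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    read_ts: int = 0    # largest timestamp that has read the item
    write_ts: int = 0   # largest timestamp that has written the item


def try_read(item, ts):
    """Allow the read unless a younger transaction already wrote the item."""
    if ts < item.write_ts:
        return False    # the transaction would be rolled back
    item.read_ts = max(item.read_ts, ts)
    return True


def try_write(item, ts):
    """Allow the write unless a younger transaction already read or wrote it."""
    if ts < item.read_ts or ts < item.write_ts:
        return False
    item.write_ts = ts
    return True


balance = Item()
results = [
    try_write(balance, 5),   # accepted: nothing newer has touched the item
    try_read(balance, 3),    # rejected: written at 5, reader is older
    try_read(balance, 8),    # accepted: read timestamp becomes 8
    try_write(balance, 6),   # rejected: already read at 8
]
```

No locks are held, so there is nothing to deadlock on; the cost is that rejected transactions must restart with a new timestamp.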
86
Deadlocks & Distributed Data
- Centralized
  - One Site
- Distributed
- Centralized - same advantages and disadvantages as other centralized control (database or locking)
87
Distributed Deadlock Detection
- Each site tracks all transactions accessing its own data
- Dummy transaction for transactions that originated here but are executing elsewhere
- If deadlock found that includes dummy transaction
  - Must send deadlock information to other sites
  - They check for deadlock
  - May have to pass on to another site
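Local deadlock detection is usually described as finding a cycle in a wait-for graph, so the sketch below assumes that representation. The node name `EXTERNAL` stands in for the dummy transaction, and the graph contents are invented for illustration.

```python
DUMMY = "EXTERNAL"  # hypothetical name for the dummy transaction node


def find_cycle(wait_for):
    """Return one cycle in the wait-for graph as a list of nodes, or None."""
    path, finished = [], set()

    def visit(node):
        if node in finished:
            return None
        if node in path:
            return path[path.index(node):]   # found a cycle
        path.append(node)
        for blocker in wait_for.get(node, ()):
            cycle = visit(blocker)
            if cycle:
                return cycle
        path.pop()
        finished.add(node)
        return None

    for start in list(wait_for):
        cycle = visit(start)
        if cycle:
            return cycle
    return None


# T1 waits for T2, T2 waits on work running at another site, which waits for T1.
local_graph = {"T1": ["T2"], "T2": [DUMMY], DUMMY: ["T1"]}
cycle = find_cycle(local_graph)

# A cycle through the dummy node is only a possible deadlock, so the
# information has to be sent to the other site for checking.
must_forward = cycle is not None and DUMMY in cycle
```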
88
Homework #9
- Continuing with the Carnegie Library
  - Client/Server
  - Distributed Database