Scaleable Replicated Databases. Jim Gray (Microsoft), Pat Helland (Microsoft), Dennis Shasha (Columbia), Pat O'Neil (U.Mass)

Presentation transcript:

1 Scaleable Replicated Databases
Jim Gray (Microsoft), Pat Helland (Microsoft), Dennis Shasha (Columbia), Pat O'Neil (U.Mass)

2 Outline
- Replication strategies
  - Lazy and Eager
  - Master and Group
- How centralized databases scale
  - Deadlocks rise non-linearly with
    - transaction size
    - concurrency
- Replication systems are unstable on scaleup
- A possible solution

3 Scaleup, Replication, Partition
- \( N^2 \) more work

4 Why Replicate Databases?
- Give users a local copy for
  - Performance
  - Availability
  - Mobility (they are disconnected)
- But... what if they update it?
- Must propagate updates to other copies

5 Propagation Strategies
- Eager: send update right away
  - (part of the same transaction)
  - N times larger transactions
- Lazy: send update asynchronously
  - separate transaction
  - N times more transactions
- Either way
  - N times more updates per second per node
  - \( N^2 \) times more work overall
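The \( N^2 \) figure on this slide can be checked with a short piece of arithmetic. This is only a sketch: it assumes every node originates the same update rate \( u \) (a symbol introduced here, not on the slide) and that every update must be applied at all \( N \) copies.

```latex
\[
\underbrace{N \cdot u}_{\text{updates originated per second, all nodes}}
\times
\underbrace{N}_{\text{copies each update reaches}}
= N^2 u
\quad\text{update applications per second overall,}
\]
\[
\text{so each node applies } \frac{N^2 u}{N} = N u \text{ updates per second.}
\]
```

With \( N = 10 \), each node handles 10 times the update rate of a single unreplicated node, and the system as a whole does 100 times the work.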

6 Update Control Strategies
- Master
  - Each object has a master node
  - All updates start with the master
  - Broadcast to the subscribers
- Group
  - Object can be updated by anyone
  - Update broadcast to all others
- Everyone wants Lazy Group:
  - update anywhere, anytime, anyway

7 Quiz Questions: Name One
- Eager
  - Master: N-Plexed disks
  - Group: ?
- Lazy
  - Master: Bibles, Bank accounts, SQLserver
  - Group: Name servers, Oracle, Access...
- Note: Lazy contradicts Serializable
  - If two lazy updates collide, then... reconcile
    - discard one transaction (or use some other rule)
    - ask for human advice
- Meanwhile, nodes disagree =>
  - Network DB state diverges: System Delusion

8 Anecdotal Evidence
- Update Anywhere systems are attractive
- Products offer the feature
- It demos well
- But when it scales up
  - Reconciliations start to cascade
  - Database drifts out of sync (System Delusion)
- What's going on?

9 Outline
- Replication strategies
  - Lazy and Eager
  - Master and Group
- How centralized databases scale
  - Deadlocks rise non-linearly
- Replication is unstable on scaleup
- A possible solution

10 Simple Model of Waits
- TPS transactions per second
- Each transaction
  - picks Actions records uniformly from a set of DB_size records
  - then commits
- Active transactions:
  \[ \text{Transactions} = \text{TPS} \times \text{Actions} \times \text{Action\_Time} \]
- About \( \text{Transactions} \times \text{Actions} / 2 \) resources are locked
- Chance a request waits is
  \[ \frac{\text{Transactions} \times \text{Actions}}{2 \times \text{DB\_size}} \]
- Action rate is \( \text{TPS} \times \text{Actions} \)
- Wait rate = Action rate x Chance a request waits
  \[ \text{Wait rate} = \frac{\text{TPS}^2 \times \text{Actions}^3 \times \text{Action\_Time}}{2 \times \text{DB\_size}} \]
- 10x more transactions, 100x more waits
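The wait model on this slide can be written as a few lines of Python to check the "10x more transactions, 100x more waits" claim numerically. This is a minimal sketch: the function name and the sample parameter values are illustrative assumptions, not figures from the talk.

```python
def wait_rate(tps, actions, action_time, db_size):
    """Total lock waits per second under the slide's simple model."""
    # Transactions in flight at any instant.
    active_transactions = tps * actions * action_time
    # Each active transaction holds about actions/2 locks, so this is the
    # fraction of the database that is locked when a new request arrives.
    chance_request_waits = active_transactions * actions / (2 * db_size)
    # Lock requests issued per second across the whole system.
    action_rate = tps * actions
    return action_rate * chance_request_waits


# Hypothetical workload, chosen only for illustration.
base = wait_rate(tps=100, actions=5, action_time=0.01, db_size=1_000_000)
scaled = wait_rate(tps=1000, actions=5, action_time=0.01, db_size=1_000_000)
ratio = scaled / base  # about 100: waits grow with the square of TPS
print(ratio)
```

Raising `actions` by 10x instead of `tps` gives a ratio of about 1,000, matching the cubic dependence on transaction size.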

11 Simple Model of Deadlocks
- A deadlock is a wait cycle
- Cycle of length 2:
  - Wait rate x Chance waitee waits for waiter
  - Wait rate x (P(wait) / Transactions)
  \[ \frac{\text{TPS}^2 \times \text{Actions}^3 \times \text{Action\_Time}}{2 \times \text{DB\_size}} \times \frac{\text{TPS} \times \text{Actions}^3 \times \text{Action\_Time}}{2 \times \text{DB\_size}} \times \frac{1}{\text{TPS} \times \text{Actions} \times \text{Action\_Time}} = \frac{\text{TPS}^2 \times \text{Actions}^5 \times \text{Action\_Time}}{4 \times \text{DB\_size}^2} \]
- Cycles of length 3 are \( PW^3 \), so ignored
- 10x bigger transactions = 100,000x more deadlocks
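The deadlock estimate can be sketched the same way, following the slide's "wait rate x (P(wait) / Transactions)" step. The function name and sample values below are assumptions made for illustration.

```python
def deadlock_rate(tps, actions, action_time, db_size):
    """Length-2 wait cycles per second under the slide's simple model."""
    transactions = tps * actions * action_time
    # Total waits per second (the wait-rate formula from the waits model).
    wait_rate = tps**2 * actions**3 * action_time / (2 * db_size)
    # Probability that a given transaction waits at some point in its life.
    p_wait = transactions * actions**2 / (2 * db_size)
    # A wait becomes a deadlock when the waitee is itself waiting for the
    # waiter, which is one particular transaction out of `transactions`.
    return wait_rate * p_wait / transactions


# Hypothetical workload, chosen only for illustration.
small = deadlock_rate(tps=100, actions=4, action_time=0.01, db_size=1_000_000)
large = deadlock_rate(tps=100, actions=40, action_time=0.01, db_size=1_000_000)
growth = large / small  # about 100,000: fifth power of transaction size
print(growth)
```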

12 Summary So Far
- Even centralized systems are unstable
- Waits:
  - Square of concurrency
  - 3rd power of transaction size
- Deadlock rate:
  - Square of concurrency
  - 5th power of transaction size
[Figure: axes labeled Trans Size and Concurrency]

13 Outline
- Replication strategies
- How centralized databases scale
- Replication is unstable on scaleup
  - Eager (master & group)
  - Lazy (master & group & disconnected)
- A possible solution

14 Eager Transactions are FAT
- If N nodes, eager transaction is N x bigger
  - Takes N x longer
  - 10 x nodes, 1,000 x deadlocks
  - (derivation in paper)
- Master slightly better than group
- Good news:
  - Eager transactions only deadlock
  - No need for reconciliation

15 Lazy Master & Group
- Use optimistic concurrency control
  - Keep transaction timestamp with record
  - Updates carry old + new timestamp
  - If record has old timestamp
    - set value to new value
    - set timestamp to new timestamp
  - If record does not match old timestamp
    - reject lazy transaction
  - Not SNAPSHOT isolation (stale reads)
- Reconciliation:
  - Some nodes are updated
  - Some nodes are being reconciled
[Figure: A Lazy Transaction. Labels: TRID, Timestamp; OID, old time, new value; NewTimestamp; Write A, Write B, Write C, Commit (repeated for each replica)]
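The timestamp check described on this slide might look roughly like the following. It is a sketch under assumptions: the `Record` and `LazyUpdate` types, their field names, and the use of integer timestamps are all invented here for illustration and are not taken from any product the talk mentions.

```python
from dataclasses import dataclass


@dataclass
class Record:
    value: object
    timestamp: int  # timestamp of the transaction that last wrote the record


@dataclass
class LazyUpdate:
    oid: str             # object identifier
    old_timestamp: int   # timestamp the originating node saw before writing
    new_timestamp: int   # timestamp of the originating transaction
    new_value: object


def apply_lazy_update(replica, update):
    """Apply one replicated update; return False if it must be reconciled."""
    record = replica[update.oid]
    if record.timestamp != update.old_timestamp:
        # Someone else changed the record first: reject the lazy update.
        return False
    record.value = update.new_value
    record.timestamp = update.new_timestamp
    return True


# Two nodes both updated object "A" starting from timestamp 1.
replica = {"A": Record(value=100, timestamp=1)}
first_ok = apply_lazy_update(replica, LazyUpdate("A", 1, 2, 150))
second_ok = apply_lazy_update(replica, LazyUpdate("A", 1, 3, 90))
print(first_ok, second_ok, replica["A"])
```

The second update arrives carrying a stale old timestamp, so it is rejected and left for reconciliation, while the replica keeps the first update's value.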

16 Reconciliation
- Reconciliation means System Delusion
  - Data inconsistent with itself and reality
- How frequent is it?
- Lazy transactions are not fat
  - but N times as many
  - Eager waits become Lazy reconciliations
  - Rate is:
    \[ \frac{\text{TPS}^2 \times (\text{Actions} \times \text{Nodes})^3 \times \text{Action\_Time}}{2 \times \text{DB\_size}} \]
  - Assuming everyone is connected
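One way to read this rate is as the centralized wait-rate formula with the transaction size Actions replaced by Actions x Nodes, since every update now has to be applied at every node. The worked comparison below is a sketch of that reading, not a derivation taken from the slides.

```latex
\[
\text{Wait rate}
  = \frac{\text{TPS}^2 \times \text{Actions}^3 \times \text{Action\_Time}}{2 \times \text{DB\_size}}
\;\xrightarrow{\;\text{Actions} \,\to\, \text{Actions} \times \text{Nodes}\;}\;
\text{Reconciliation rate}
  = \frac{\text{TPS}^2 \times (\text{Actions} \times \text{Nodes})^3 \times \text{Action\_Time}}{2 \times \text{DB\_size}}
\]
\[
\frac{\text{Reconciliation rate at } 10 \times \text{Nodes}}{\text{Reconciliation rate at Nodes}} = 10^3 = 1000
\]
```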

17 Eager & Lazy: Disconnected
- Suppose mobile nodes are disconnected for a day
- When they reconnect:
  - get all incoming updates
  - send all delayed updates
- Incoming is: Nodes x TPS x Actions x Disconnect_Time
- Outgoing is: TPS x Actions x Disconnect_Time
- Conflicts are the intersection of these two sets:
  \[ \frac{\text{Disconnect\_Time} \times (\text{TPS} \times \text{Actions} \times \text{Nodes})^2}{\text{DB\_size}} \]
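The conflict estimate for disconnected nodes can be rebuilt step by step from the incoming and outgoing set sizes. This is a sketch under assumptions that go beyond the slide: updates are taken to hit records uniformly at random, the expected overlap of two small random sets is approximated by the product of their sizes divided by DB_size, and the function name and sample values are invented for illustration.

```python
def disconnected_reconciliation_rate(tps, actions, nodes, disconnect_time, db_size):
    """System-wide conflicting updates per second for disconnected nodes."""
    # Updates one node accumulates locally while disconnected.
    outgoing = tps * actions * disconnect_time
    # Updates made everywhere else that it must absorb on reconnect.
    incoming = nodes * outgoing
    # Expected overlap of the two update sets (assumed uniform access).
    conflicts_per_reconnect = incoming * outgoing / db_size
    # One reconnect per disconnect period, at every node.
    per_node_rate = conflicts_per_reconnect / disconnect_time
    return nodes * per_node_rate


# Hypothetical workload: nodes disconnected for one day (86,400 seconds).
one = disconnected_reconciliation_rate(
    tps=0.001, actions=4, nodes=10, disconnect_time=86_400, db_size=1_000_000)
two = disconnected_reconciliation_rate(
    tps=0.001, actions=4, nodes=20, disconnect_time=86_400, db_size=1_000_000)
node_growth = two / one  # about 4: the rate grows with the square of Nodes
print(node_growth)
```

The result agrees with the closed form Disconnect_Time x (TPS x Actions x Nodes)^2 / DB_size, so doubling the number of nodes quadruples the conflict rate, and doubling the disconnect period doubles it.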

18 Outline
- Replication strategies (lazy & eager, master & group)
- How centralized databases scale
- Replication is unstable on scaleup
- A possible solution
  - Two-tier architecture: Mobile & Base nodes
  - Base nodes master objects
  - Tentative transactions at mobile nodes
    - Transactions must be commutative
  - Re-apply transactions on reconnect
  - Transactions may be rejected

19 Safe Approach
- Each object mastered at a node
- Update transactions only read and write master items
- Lazy replication to other nodes
- Allow reads of stale data (on user request)
- PROBLEMS:
  - doesn't support mobile users
  - deadlocks explode with scaleup
- How do banks work?

20 Two Tier Replication
- Two kinds of nodes:
  - Base nodes: always connected, always up
  - Mobile nodes: occasionally connected
- Data mastered at base nodes
- Mobile nodes
  - have stale copies
  - make tentative updates

21 Mobile Node Makes Tentative Updates
- Updates local database while disconnected
- Saves transactions
- When the mobile node reconnects:
  - Tentative transactions are re-done as Eager-Master (at original time??)
- Some may be rejected
  - (replaces reconciliation)
- No System Delusion

22 Tentative Transactions
- Must be commutative with others
  - "Debit 50$" rather than "Change 150$ to 100$"
- Must have acceptance criteria
  - Account balance is positive
  - Ship date no later than quoted
  - Price is no greater than quoted
[Figure: the mobile node runs tentative transactions at its local DB and sends tentative Xacts; it receives updates & rejects and transactions from others]
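The interplay of commutative transactions and acceptance criteria can be illustrated with a small sketch of what happens when a mobile node reconnects. Everything named here (`debit`, `reapply_tentative`, the account figures) is hypothetical, and the acceptance test reads "account balance is positive" as "the balance must not go below zero".

```python
def debit(amount):
    """A commutative update: subtract an amount, whatever the current balance."""
    def transaction(balance):
        return balance - amount
    return transaction


def balance_not_negative(balance):
    # Assumed reading of the slide's "account balance is positive" criterion.
    return balance >= 0


def reapply_tentative(master_balance, tentative, accept):
    """Re-run saved tentative transactions against the master copy."""
    accepted, rejected = [], []
    for name, transaction in tentative:
        new_balance = transaction(master_balance)
        if accept(new_balance):
            master_balance = new_balance
            accepted.append(name)
        else:
            # The user is told about the reject at reconnect time.
            rejected.append(name)
    return master_balance, accepted, rejected


# The mobile copy showed 150 when it disconnected; other transactions have
# since brought the master balance down to 100.
final_balance, accepted, rejected = reapply_tentative(
    master_balance=100,
    tentative=[("t1", debit(50)), ("t2", debit(80))],
    accept=balance_not_negative,
)
print(final_balance, accepted, rejected)
```

Because each tentative transaction is expressed as "debit 50" rather than "set the balance to 100", it still makes sense against a master balance the mobile node never saw. The second debit would overdraw the account, so it is rejected instead of being silently reconciled.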

23 Refinement: Mobile Node Can Master Some Data
- Mobile node can master private data
  - Only the mobile node updates this data
  - Others only read that data
- Examples:
  - Orders generated by salesman
  - Mail generated by user
  - Documents generated by Notes user

24 Virtue of 2-Tier Approach
- Allows mobile operation
- No system delusion
- Rejects detected at reconnect (know right away)
- If commutativity works,
  - No reconciliations
  - Even though work rises as \( (\text{Mobile} + \text{Base})^2 \)

25 Outline
- Replication strategies (lazy & eager, master & group)
- How centralized databases scale
- Replication is unstable on scaleup
- A possible solution (two-tier architecture)
  - Tentative transactions at mobile nodes
  - Re-apply transactions on reconnect
  - Transactions may be rejected & reconciled
- Avoids system delusion