1
Tashkent: Uniting Durability & Ordering in Replicated Databases
Sameh Elnikety, EPFL
Steven Dropsho, EPFL
Fernando Pedone, USI
2
Write-Many Replicated Database
All replicas agree on
– which update tx commit
– their commit order
Total order
– determined by middleware
– followed by each replica
[Diagram: Replicas 1, 2, and 3, each with durability; transactions Tx A and Tx B]
3
Order Determined Outside DB
[Diagram: Tx A and Tx B enter the replication MW (global ordering), which fixes the order A B; Replicas 1, 2, and 3, each with durability, follow the order A B]
4
Enforce External Commit Order
[Diagram: middleware commit order A B; at the replica, the proxy submits Tx A and Tx B through the SQL interface; inside the database (durability on), Task A and Task B could finish as B A]
Cannot commit A & B concurrently!
Must serialize
5
Enforce Order = Serial Commit
[Diagram: middleware commit order A B; at the replica, the proxy submits Tx A and Tx B through the SQL interface; inside the database (durability on), Task A and Task B commit as A B]
Serialization slow
6
Commit Serialization is Slow
[Diagram: middleware order A B C; the proxy issues Commit A, Commit B, Commit C one at a time; the database (durability on) alternates CPU work with a durability write for A, then B, then C, and returns Ack A, Ack B, Ack C]
Root cause: durability & ordering separated → serial disk writes
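The root cause above can be put into rough numbers. The sketch below is a back-of-the-envelope model only: it assumes a hypothetical 10 ms synchronous disk write (a figure not given in the slides), ignores CPU time, and uses an illustrative function name.

```python
def max_commit_rate(flush_ms, txs_per_flush=1):
    """Upper bound on commits per second when each flush is a
    synchronous disk write that takes flush_ms milliseconds."""
    return txs_per_flush * 1000 / flush_ms


FLUSH_MS = 10  # assumed disk write latency, not taken from the slides

# Serial commit: every transaction pays for its own disk write.
serial_rate = max_commit_rate(FLUSH_MS)

# Group commit: eight transactions (an arbitrary group size) share one write.
grouped_rate = max_commit_rate(FLUSH_MS, txs_per_flush=8)

print(serial_rate, grouped_rate)
```

Under these assumptions the bound grows linearly with the number of transactions that share a disk write, which is why the rest of the talk focuses on making group commit possible.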
7
Solution: Unite Durability & Ordering
1- Pass order info to DB
[Diagram: middleware (ordering) passes the order to the replicas, which keep durability]
2- Move durability to MW
[Diagram: middleware (ordering) also holds durability; replicas run with durability OFF]
8
1- Unite Dur. & Ord. in Database
[Diagram: middleware order A B C; the proxy issues "Commit A at 1", "Commit B at 2", "Commit C at 3"; the database (durability on) makes A B C durable together and returns Ack A, Ack B, Ack C]
Solution 1: pass order info to DB
Durability & ordering in database → group commit
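A minimal sketch of the idea behind Solution 1, assuming the database accepts a sequence number with each commit request. The class and method names are hypothetical and do not reproduce the Tashkent-API or PostgreSQL interfaces; the "flush" is only counted, not performed.

```python
class OrderedGroupCommitter:
    """Toy database commit path that knows the required commit order."""

    def __init__(self):
        self.next_seq = 1        # next sequence number allowed to commit
        self.pending = {}        # seq -> transaction waiting to commit
        self.durable_order = []  # transactions in the order they became durable
        self.flushes = 0         # number of simulated disk writes

    def request_commit(self, tx, seq):
        # Requests may arrive concurrently and in any order.
        self.pending[seq] = tx

    def flush(self):
        """Commit every transaction whose predecessors are all present,
        using a single simulated disk write for the whole group."""
        group = []
        while self.next_seq in self.pending:
            group.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        if group:
            self.durable_order.extend(group)
            self.flushes += 1
        return group


db = OrderedGroupCommitter()
db.request_commit("B", 2)
early = db.flush()         # B must wait: sequence number 1 has not arrived
db.request_commit("A", 1)
db.request_commit("C", 3)
group = db.flush()         # A, B, C become durable with one write

print(early, group, db.flushes)
```

Because the order travels with the commit request, the proxy no longer has to wait for one commit to finish before submitting the next.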
9
Solution: Unite Durability & Ordering
1- Pass order info to DB
[Diagram: middleware (ordering) passes the order to the replicas, which keep durability]
2- Move durability to MW
[Diagram: middleware (ordering) also holds durability; replicas run with durability OFF]
10
2- Unite D. & O. in Middleware
[Diagram: middleware order A B C, with the middleware making A B C durable; the proxy issues Commit A, Commit B, Commit C to the database, which runs with durability OFF, uses only CPU, and returns Ack A, Ack B, Ack C]
Solution 2: move durability to MW
Durability & ordering in middleware → group commit
11
Roadmap
Durability & ordering
– Separated → serial commit → slow
– United → group commit → fast
Two implementations
– Tashkent-API: united in DB
– Tashkent-MW: united in MW
Tashkent-MW
– Implementation
– Recovery
– Performance
12
Tashkent-MW
[Diagram: Tx A, Tx B, and Tx C enter the replication MW (global ordering), which holds durability for A B C; Replicas 1, 2, and 3 run with durability OFF and apply A B C]
13
Tashkent-MW: Durability & Ordering in Middleware
Middleware logs tx effects
– Durability of update tx guaranteed in middleware
– Turn durability off at database
Middleware performs durability & ordering
– United → group commit → fast
Database commits update tx serially
– Commit = quick main-memory operation
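The points above can be illustrated with a small sketch of a middleware-side log that records the effects (writesets) of a group of ordered transactions and forces them to disk with one `fsync`. The class name, the JSON record layout, and the file name are assumptions made for illustration; the slides do not describe the actual log format.

```python
import json
import os
import tempfile


class MiddlewareLog:
    """Hypothetical append-only log kept by the replication middleware."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")
        self.fsync_count = 0

    def commit_group(self, ordered_txs):
        """Log (sequence number, writeset) pairs in order, with one fsync."""
        for seq, writeset in ordered_txs:
            record = {"seq": seq, "writeset": writeset}
            self._file.write(json.dumps(record) + "\n")
        self._file.flush()             # move Python's buffer to the OS
        os.fsync(self._file.fileno())  # force the OS to write to disk
        self.fsync_count += 1
        # Only now are these transactions durable and safe to acknowledge.
        return [seq for seq, _ in ordered_txs]

    def close(self):
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "mw.log")
    log = MiddlewareLog(log_path)
    acked = log.commit_group([(1, {"x": 10}), (2, {"y": 20}), (3, {"x": 30})])
    fsyncs = log.fsync_count
    log.close()
    with open(log_path, encoding="utf-8") as f:
        logged_seqs = [json.loads(line)["seq"] for line in f]

print(acked, fsyncs, logged_seqs)
```

With durability handled here, the database commits at each replica need no synchronous disk write of their own, which matches the "quick main-memory operation" bullet.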
14
Recovery in Tashkent-MW
[Diagram: replication MW (global ordering) with durability; Replicas 1, 2, and 3 with durability OFF]
15
Standard Database I/O
Log flushed for
1- Durability
2- Allowing dirty data pages to be cleaned (physical integrity)
[Diagram: database with data in memory and data plus log on disk; Tx A updates A; labels: Crash!, DB recovery, bad]
16
Database I/O with Durability=off
[Diagram: database with data in memory and data plus log on disk; Tx A updates A; middleware order A B C with durability for A; labels: Crash!, DB recovery, bad]
Simple solution: recover from a data dump (checkpoint)
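A minimal sketch of the simple recovery approach, assuming that after restoring the dump the replica re-applies, in commit order, the logged updates that are newer than the dump. The replay step, the function name, and the data layout (a dictionary for database state, sequence-numbered writesets for the middleware log) are illustrative assumptions rather than details stated on the slide.

```python
def recover(checkpoint, checkpoint_seq, middleware_log):
    """Rebuild a replica's state from a data dump plus the middleware log.

    checkpoint      -- database state captured by the dump
    checkpoint_seq  -- sequence number of the last tx included in the dump
    middleware_log  -- iterable of (sequence number, writeset) pairs
    """
    state = dict(checkpoint)  # restore the dump
    for seq, writeset in sorted(middleware_log, key=lambda entry: entry[0]):
        if seq > checkpoint_seq:  # skip updates already in the dump
            state.update(writeset)
    return state


checkpoint = {"x": 1, "y": 2}  # dump taken after tx 2
middleware_log = [(1, {"x": 1}), (2, {"y": 2}), (3, {"x": 30}), (4, {"z": 4})]

recovered = recover(checkpoint, 2, middleware_log)
print(recovered)
```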
17
Roadmap
Durability & ordering
– Separated → serial commit → slow
– United → group commit → fast
Two implementations
– Tashkent-API: united in DB
– Tashkent-MW: united in MW
Tashkent-MW
– Implementation
– Recovery
– Performance
18
Performance - Setup
Metrics:
– Throughput
– Response time
Workload:
– AllUpdates: tx = {1 update}, mix = 100% updates
– TPC-B: tx = {4 updates, 1 read}, mix = 100% updates
– TPC-W: mix of long & short txs
System configuration:
– Linux cluster running PostgreSQL
19
AllUpdates Throughput
20
AllUpdates Throughput
21
AllUpdates Throughput
22
AllUpdates Response Time
23
In the Paper
Design & implementation
– Tashkent-API
Performance results
– TPC-B & TPC-W
– Recovery times
– Another I/O subsystem
24
Conclusions
Durability & ordering
– Separated → serial commit → slow
– United → group commit → fast
Two implementations
– Tashkent-API: united in DB
– Tashkent-MW: united in MW
Tashkent-MW system
– Pure middleware replication
– Significant performance improvement
27
Concurrency Control
Generalized Snapshot Isolation (GSI)
Conclusions valid whenever replicas agree
1- on which update transactions commit
2- on their commit order
Example (bank database)
– T1: set balance = $1000
– T2: set balance = $2000
– Replica 1: sees T1 then T2 → balance = $2000
– Replica 2: sees T2 then T1 → balance = $1000
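The bank example can be replayed in a few lines. The sketch below models each transaction as a blind write of a new balance, as on the slide; the function name and the starting balance of 0 are assumptions made for illustration.

```python
def apply_in_order(initial_balance, writes):
    """Apply 'set balance' transactions in the given commit order."""
    balance = initial_balance
    for _tx_name, new_balance in writes:
        balance = new_balance  # each tx overwrites the balance
    return balance


T1 = ("T1", 1000)
T2 = ("T2", 2000)

replica1_balance = apply_in_order(0, [T1, T2])  # sees T1 then T2
replica2_balance = apply_in_order(0, [T2, T1])  # sees T2 then T1

print(replica1_balance, replica2_balance)
```

Both replicas commit the same two transactions, yet they end with different balances, which is why agreeing on the set of committed transactions is not enough without agreeing on their order.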
28
Durability and Ordering 1/2
[Diagram: Replica 1 (proxy and database) and the certifier; transactions T4 and T9]
Cert. Log: T4 T9
DB1 Log: T4 T9
Scalability problem: one write per trans.
29
Durability and Ordering 2/2
[Diagram: the certifier with Replica 1 (proxy and database; T4, T9), Replica 2 (proxy and database; T3, T8), and further replicas (...) contributing other Ti's]
Cert. Log: T1 T2 T3 T4 T5 T6 T7 T8 T9
DB1 Log: T1 T2 T3 T4 T5 T6 T7 T8 T9
One disk write
DB1 Log: T1,T2,T3 | T4 | T5,T6,T7,T8 | T9
Scalability problem: two writes per trans.
30
AllUpdates 1-Replica Throughput
Low replication overhead: 1-replica system == standalone DB
31
AllUpdates Response Time
32
TPC-B Throughput
Low replication overhead: 1-replica system == standalone DB
Performance scales with multiple replicas