Published by Tyrell Bowdle. Modified over 9 years ago.
1
Exploiting Distributed Version Concurrency in a Transactional Memory Cluster
Kaloian Manassiev, Madalin Mihailescu and Cristiana Amza
University of Toronto, Canada
2
Transactional Memory Programming Paradigm
Each thread executing a parallel region:
  Announces the start of a transaction
  Executes operations on shared objects
  Attempts to commit the transaction
    If there is no data race, the commit succeeds and the operations take effect
    Otherwise the commit fails, the operations are discarded, and the transaction is restarted
Simpler than locking!
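The begin, execute, commit, and restart cycle above can be sketched on a single thread. This is only an illustration under stated assumptions: `SharedCell`, `atomically`, and the `injected_conflicts` parameter are hypothetical names invented here, not part of the DTM interface, and the "data race" is simulated by bumping a version counter in the middle of the transaction.

```cpp
#include <cassert>

// Hypothetical shared object: a value plus a version that grows on every commit.
struct SharedCell {
    int value = 0;
    unsigned version = 0;
};

// Runs `update` as one optimistic transaction on `cell` and returns the number
// of attempts. `injected_conflicts` simulates that many competing commits
// landing mid-transaction, so the retry path can be exercised on one thread.
template <typename Update>
int atomically(SharedCell& cell, Update update, int injected_conflicts = 0) {
    assert(injected_conflicts >= 0);
    int attempts = 0;
    for (;;) {
        ++attempts;
        const unsigned seen = cell.version;        // announce start of transaction
        const int tentative = update(cell.value);  // operate on a private result
        if (injected_conflicts > 0) {              // simulated competing commit
            --injected_conflicts;
            ++cell.value;
            ++cell.version;
        }
        if (cell.version == seen) {                // no data race: commit succeeds
            cell.value = tentative;
            ++cell.version;
            return attempts;
        }
        // Data race detected: the tentative result is discarded and we restart.
    }
}
```

Calling `atomically(cell, [](int v) { return v + 10; }, 2)` on a fresh cell takes three attempts, because the first two commits are invalidated by the simulated competing writers.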
3
Transactional Memory
Used in multiprocessor platforms
Our work: the first TM implementation on a cluster
  Supports both SQL and parallel scientific applications (C++)
4
TM in a Multiprocessor Node
Multiple physical copies of data
High memory overhead
[Diagram: object A and a copy of A; T1: Read(A) and T2: Write(A); T1 and T2 both active]
5
TM on a Cluster
Key Idea 1: Distributed Versions
Different versions of data arise naturally in a cluster
Create a new version on a different node; the others read their own versions
[Diagram labels: write, read]
6
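Exploiting Distributed Page Versions
[Diagram: nodes 0 to N, each with local memory (mem0 … memN) and a transaction (txn0 … txnN), connected by a network to form the Distributed Transactional Memory (DTM); page versions labeled v0, v1, v2, v3]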
7
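Key Idea 2: Concurrent “Snapshots” Inside Each Node
[Diagram: Txn0 (v1) and Txn1 (v2) read inside the same node; version labels v1, v2]
8
Key Idea 2: Concurrent “Snapshots” Inside Each Node
[Diagram: Txn0 (v1) and Txn1 (v2) read inside the same node; version labels v1, v2 shown twice]
9
Key Idea 2: Concurrent “Snapshots” Inside Each Node
[Diagram: Txn0 (v1) and Txn1 (v2) read inside the same node; version labels v1, v2 shown twice]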
10
Distributed Transactional Memory
A novel fine-grained distributed concurrency control algorithm
  Low memory overhead
  Exploits distributed versions
  Supports multithreading within the node
  Provides 1-copy serializability
11
Outline
Programming Interface
Design
  Data access tracking
  Data replication
  Conflict resolution
Experiments
Related work and Conclusions
12
Programming Interface
  init_transactions()
  begin_transaction()
  allocate_dtmemory()
  commit_transaction()
Need to declare TM variables explicitly
13
Data Access Tracking
DTM traps reads and writes to shared memory by either one of:
  Virtual memory protection
    Classic page-level memory protection technique
  Operator overloading in C++
    Trapping reads: conversion operator
    Trapping writes: assignment ops (=, +=, …) & increment/decrement (++/--)
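A minimal sketch of the operator-overloading option, assuming a hypothetical `TrackedInt` wrapper and `AccessLog` counter (neither name comes from the talk; the deck's own type is `dtm_int`, whose definition is not shown). The conversion operator intercepts reads, and the assignment and increment/decrement operators intercept writes. A real runtime would record the touched address in the transaction's read or write set instead of counting.

```cpp
#include <cassert>

// Hypothetical per-transaction log; a real runtime would record addresses.
struct AccessLog {
    int reads = 0;
    int writes = 0;
};

class TrackedInt {
public:
    explicit TrackedInt(AccessLog& log, int initial = 0)
        : log_(&log), value_(initial) {
        assert(log_ != nullptr);
    }

    // Read trap: every use of the object as a plain int goes through here.
    operator int() const { ++log_->reads; return value_; }

    // Write traps: assignment, compound assignment, increment, decrement.
    TrackedInt& operator=(int v)  { ++log_->writes; value_ = v;  return *this; }
    TrackedInt& operator+=(int v) { ++log_->writes; value_ += v; return *this; }
    TrackedInt& operator++()      { ++log_->writes; ++value_;    return *this; }
    TrackedInt& operator--()      { ++log_->writes; --value_;    return *this; }

    // Assigning one tracked variable to another is a read plus a write.
    TrackedInt& operator=(const TrackedInt& other) {
        return *this = static_cast<int>(other);
    }

private:
    AccessLog* log_;
    int value_;
};
```

With this wrapper, `int y = x;` logs one read and `x += 3;` logs one write, without any change to the application code beyond the variable's declared type.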
14
Data Replication
[Diagram: two replicated copies of Page 1, Page 2, …, Page n, with update transaction T1(UPDATE)]
15
Twin Creation
[Diagram: T1(UPDATE) writes Page 1 (Wr p1); a twin of P1 is created]
16
Twin Creation
[Diagram: T1(UPDATE) writes Page 2 (Wr p2); twins of P1 and P2 now exist]
17
Diff Creation
[Diagram: T1(UPDATE) and the two copies of Page 1, Page 2, …, Page n]
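The twin and diff steps can be sketched as follows. This assumes byte-granularity diffs and uses hypothetical names (`Page`, `DiffEntry`, `make_twin`, `make_diff`, `apply_diff`); the slides do not state the encoding DTM actually uses. The twin is a copy of the page taken before the first write, and the diff lists only the bytes that changed relative to the twin.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Page = std::vector<std::uint8_t>;

struct DiffEntry {
    std::size_t offset;   // position of the modified byte within the page
    std::uint8_t value;   // new contents of that byte
};
using Diff = std::vector<DiffEntry>;

// Twin creation: snapshot the page before the transaction's first write to it.
Page make_twin(const Page& page) { return page; }

// Diff creation: compare the modified page against its twin.
Diff make_diff(const Page& twin, const Page& page) {
    assert(twin.size() == page.size());
    Diff diff;
    for (std::size_t i = 0; i < page.size(); ++i) {
        if (twin[i] != page[i]) diff.push_back({i, page[i]});
    }
    return diff;
}

// Applying a diff brings a replica of the page up to the writer's contents.
void apply_diff(Page& page, const Diff& diff) {
    for (const DiffEntry& e : diff) {
        assert(e.offset < page.size());
        page[e.offset] = e.value;
    }
}
```

Only the diff has to travel over the network, which is typically much smaller than the page when a transaction touches a few bytes.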
18
Broadcast of the Modifications at Commit
[Diagram: T1(UPDATE) broadcasts its diff (vers 8); version labels v1, v2; Latest Version = 7]
19
Other Nodes Enqueue Diffs
[Diagram: diff broadcast (vers 8); version labels v1, v2, v8 and v1; Latest Version = 7]
20
Update Latest Version
[Diagram: version labels v1, v2, v8 and v1; Latest Version = 7 and Latest Version = 8]
21
Other Nodes Acknowledge Receipt
[Diagram: Ack (vers 8); version labels v1, v2, v8 and v1, v8; Latest Version = 7 and Latest Version = 8]
22
T1 Commits
[Diagram: version labels v1, v2, v8 and v1, v8; Latest Version = 8]
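The commit sequence on slides 18 to 22 (broadcast, enqueue, update latest version, acknowledge, commit) can be modeled in one process. This is a sketch under assumptions: `Replica`, `receive`, and `commit_update` are hypothetical names, messages are plain function calls that never get lost, and the diff payload is omitted so that only version numbers are tracked.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <vector>

struct Replica {
    unsigned latest_version = 0;
    // Page id -> versions of diffs received but not yet applied to the page.
    std::map<int, std::deque<unsigned>> queued;

    // Enqueue the diff for each written page, advance the latest version,
    // and return the version as the acknowledgment.
    unsigned receive(unsigned version, const std::vector<int>& pages) {
        assert(version > latest_version);
        for (int page : pages) queued[page].push_back(version);
        latest_version = version;
        return version;
    }
};

// The committing node tags its diff with the next version, sends it to every
// replica, and commits only after all of them have acknowledged.
unsigned commit_update(unsigned& committer_latest,
                       std::vector<Replica>& replicas,
                       const std::vector<int>& written_pages) {
    const unsigned version = committer_latest + 1;
    std::size_t acks = 0;
    for (Replica& r : replicas) {
        if (r.receive(version, written_pages) == version) ++acks;
    }
    assert(acks == replicas.size());  // this sketch models no message loss
    committer_latest = version;       // the update transaction commits
    return version;
}
```

Starting from Latest Version = 7 everywhere, one call leaves every node at version 8 with a version-8 diff queued on each written page, which matches the labels on the slides.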
23
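Lazy Diff Application
[Diagram: Page 1 at V0 with queued diffs V1, V2, V8; Page 2 at V0 with queued diffs V1, V8; Page N at V3 with queued diffs V4, V5; T2(V2): Rd(…, P1, P2); Latest Version = 8]
24
Lazy Diff Application
[Diagram: animation frame between the previous and next slides, diffs in flight; T2(V2): Rd(…, P1, P2); Latest Version = 8]
25
Lazy Diff Application
[Diagram: Page 1 at V2 with queued diff V8; Page 2 at V1 with queued diff V8; Page N at V3 with queued diffs V4, V5; T2(V2): Rd(…, P1, P2); Latest Version = 8]
26
Lazy Diff Application
[Diagram: Page 1 at V2 with queued diff V8; Page 2 at V1 with queued diff V8; Page N at V3 with queued diffs V4, V5; T3(V8): Rd(PN); T2(V2): Rd(…, P1, P2); Latest Version = 8]
27
Lazy Diff Application
[Diagram: Page 1 at V2 with queued diff V8; Page 2 at V1 with queued diff V8; Page N at V5; T3(V8): Rd(PN); T2(V2): Rd(…, P1, P2); Latest Version = 8]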
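The lazy application rule shown across slides 23 to 27 can be sketched as: when a transaction tagged with version V reads a page, apply the queued diffs whose version is at most V and leave newer ones queued. `LazyPage` and `read_page` are hypothetical names, and the page contents are left out so that only the version bookkeeping is shown.

```cpp
#include <cassert>
#include <deque>

struct LazyPage {
    unsigned version;             // version of the materialized page contents
    std::deque<unsigned> queued;  // versions of unapplied diffs, ascending
};

// Bring `page` forward to at most `txn_version` and return the version the
// reader observes. Diffs newer than the reader's version stay queued.
unsigned read_page(LazyPage& page, unsigned txn_version) {
    while (!page.queued.empty() && page.queued.front() <= txn_version) {
        assert(page.queued.front() > page.version);
        page.version = page.queued.front();  // a real system patches bytes here
        page.queued.pop_front();
    }
    return page.version;
}
```

Replaying the slides: T2(V2) reading Page 1 (V0, queue V1, V2, V8) leaves it at V2 with V8 still queued, reading Page 2 (V0, queue V1, V8) leaves it at V1, and T3(V8) reading Page N (V3, queue V4, V5) leaves it at V5 with an empty queue.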
28
Waiting Due to Conflict
[Diagram: T3(V8): Rd(PN, P2); Page 1 at V2 with queued diff V8; Page 2 at V1 with queued diff V8; Page N at V5; T2(V2): Rd(…, P1, P2); T3 must wait until T2 commits; Latest Version = 8]
29
Transaction Abort Due to Conflict
[Diagram: animation frame, diffs in flight; T3(V8): Rd(P2); T2(V2): Rd(…, P1, P2); Latest Version = 8]
30
Transaction Abort Due to Conflict
[Diagram: Page 2 now at V8 after T3(V8): Rd(P2); T2(V2): Rd(…, P1, P2) hits CONFLICT!; Latest Version = 8]
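One way to read the two conflict cases on slides 28 to 30 is as a three-way decision made when a transaction touches a page. The rule below is an interpretation of the diagrams, not the paper's stated algorithm, and `PageState`, `Access`, and `decide_read` are hypothetical names. A reader aborts if the page has already moved past its version, and waits if it would have to upgrade a page that an older active transaction is still using.

```cpp
#include <cassert>
#include <deque>

enum class Access { Proceed, Wait, Abort };

struct PageState {
    unsigned version;             // version of the materialized page contents
    std::deque<unsigned> queued;  // versions of unapplied diffs, ascending
    int active_readers;           // transactions still using the current contents
};

Access decide_read(const PageState& page, unsigned txn_version) {
    assert(page.active_readers >= 0);
    // The page was already upgraded past this transaction's snapshot.
    if (page.version > txn_version) return Access::Abort;
    // Upgrading would pull the page out from under an older active reader.
    const bool needs_upgrade =
        !page.queued.empty() && page.queued.front() <= txn_version;
    if (needs_upgrade && page.active_readers > 0) return Access::Wait;
    return Access::Proceed;
}
```

On slide 28, T3(V8) asks for Page 2 (V1, queue V8) while T2 is using it, so the result is `Wait`; on slide 30, T2(V2) asks for Page 2 after it has reached V8, so the result is `Abort`.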
31
Write-Write Conflict Resolution
Can be done in two ways:
  Executing all updates on a master node, which enforces serialization order
  OR
  Aborting the local update transaction upon receiving a conflicting diff flush
More on this in the paper
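The second option can be sketched as a write-set check performed when a diff arrives. The slide defers the details to the paper, so this assumes conflicts are detected at page granularity, which is a guess; `conflicts_with_local_update` is a hypothetical name.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Returns true if an incoming diff touches any page that the local, still
// uncommitted update transaction has written, in which case the local
// transaction would be aborted and restarted.
bool conflicts_with_local_update(const std::set<int>& local_write_set,
                                 const std::vector<int>& diff_pages) {
    for (int page : diff_pages) {
        assert(page >= 0);  // page ids are assumed to be non-negative
        if (local_write_set.count(page) != 0) return true;
    }
    return false;
}
```

A local transaction that wrote pages 1 and 4 is aborted by a diff covering pages 4 and 9, and is unaffected by a diff covering pages 2 and 3.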
32
Experimental Platform
Cluster of dual AMD Athlon computers
  512 MB RAM
  1.5 GHz CPUs
  RedHat Fedora Linux OS
33
Benchmarks for Experiments
TPC-W e-commerce benchmark
  Models an on-line book store
  Industry-standard workload mixes
    Browsing (5% updates)
    Shopping (20% updates)
    Ordering (50% updates)
  Database size of ~600 MB
Hash-table micro-benchmark (in paper)
34
Application of DTM for E-Commerce
35
We use a Transactional Memory Cluster as the DB Tier
36
Cluster Architecture
37
Implementation Details
We use MySQL’s in-memory HEAP tables
  RB-Tree main-memory index
  No transactional properties
    Provided by inserting TM calls
Multiple threads running on each node
38
Baseline for Comparison
State-of-the-art conflict-aware protocol for scaling e-commerce on clusters
  Coarse-grained (per-table) concurrency control (USITS’03, Middleware’03)
39
Throughput Scaling
40
Fraction of Aborted Transactions
# of slaves   Ordering   Shopping   Browsing
1             1.15%      1.44%      0.63%
2             0.35%      2.27%      1.34%
4             0.07%      1.70%      2.37%
6             0.02%      0.41%      2.07%
8             0.00%      0.22%      1.59%
41
Comparison (browsing)
42
Comparison (shopping)
43
Comparison (ordering)
44
Related Work
Distributed concurrency control for database applications
  Postgres-R(SI), Wu and Kemme (ICDE’05)
  Ganymed, Plattner and Alonso (Middleware’04)
Distributed object stores
  Argus (’83), QuickStore (’94), OOPSLA’03
Distributed Shared Memory
  TreadMarks, Keleher et al. (USENIX’94)
  Tang et al. (IPDPS’04)
45
Conclusions
New software-only transactional memory scheme on a cluster
  Both strong consistency and scaling
Fine-grained distributed concurrency control
  Exploits distributed versions, low memory overheads
Improved throughput scaling for e-commerce web sites
46
Questions?
47
Backup slides
48
Example Program
    #include <…>

    typedef struct Point {
        dtm_int x;
        dtm_int y;
    } Point;

    init_transactions();
    for (int i = 0; i < 10; i++) {
        begin_transaction();
        Point * p = allocate_dtmemory();
        p->x = rand();
        p->y = rand();
        commit_transaction();
    }
49
Query weights
50
Decreasing the fraction of aborts
51
Micro benchmark experiments
52
Micro benchmark experiments (with read-only optimization)
53
Fraction of aborts
# of machines   1   2      4      6      8      10
% aborts        0   0.57   1.69   2.94   4.05   5.08