Published by Dennis Glenn. Modified over 9 years ago.
1
OLTP Through the Looking Glass, and What We Found There (SIGMOD 2008 paper)
Authors: Stavros Harizopoulos (HP), Daniel J. Abadi (Yale), Samuel Madden (MIT), Michael Stonebraker (MIT)
Supervisor: Dr Benjamin Kao
Presenter:
2
What is Online Transaction Processing (OLTP)?
OLTP refers to a class of systems that facilitate and manage transaction-oriented applications, typically for data entry and retrieval transaction processing (Wikipedia).
OLTP databases were optimized for the computer technology of the late 1970s, 30 years ago.
2B3L
3
Main features of OLTP: 2B3L (two Bs, three Ls)
Buffer management to facilitate data transfer between memory and disk
B-tree for on-disk data storage
Logging for recovery
Locking to support concurrency
Latching for access to shared data structures
4
Motivation of the study
Is the OLTP database still well optimized today, given the advances in hardware?
Requests from outside the DB community for alternative DB architectures
5
Motivation of the study: hardware advancement
                                      | 30 years ago      | Nowadays
DB cost                               | In millions       | Few thousands
Storage size                          | DB size >> memory | Memory > DB size
Processing time for most transactions | \                 | In microseconds
6
Motivation of the study: requests from outside the DB community
"Database-like" storage system proposals from operating systems and networking conferences
– varying forms of concurrency, consistency, reliability, replication, and queryability
7
Trends in OLTP – (1/5) Cluster Computing
8
Trend in OLTP-(2/5) Memory resident Databases Buffer
9
Trend in OLTP-(2/5) Memory resident Databases Data doesn’t grow as fast as memory size
10
Trend in OLTP-(3/5) Single Threading in OLTP System
11
Trend in OLTP (3/5): Single threading in OLTP systems
A step backward from multithreading to a single thread?
Why multithreading?
– Prevents the CPU from idling while waiting for data from disk
– Prevents long-running transactions from blocking short transactions
Not valid for a memory-resident DB
– No disk wait
– Long-running transactions run in data warehouses
12
Trend in OLTP (3/5): Single threading in OLTP systems
What about multiprocessors?
– Achieve shared-nothing processors through virtual machines
What about network disks?
– Feasible to partition transactions so that each runs "single-site"
13
Trend in OLTP (4/5): High availability vs. logging
24x7 service is achieved by using multiple sets of hardware.
Recovery is performed by copying the missing state from other database replicas.
The log for recovery can therefore be avoided.
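The idea of recovering from a replica instead of a log can be sketched as follows. This is only an illustration under stated assumptions: each replica is modelled as an in-memory dictionary with a per-key version number, and the `Replica` class and its `recover_from` method are hypothetical names, not taken from the slides or from any real system.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory replica: key -> (version, value)."""
    data: dict = field(default_factory=dict)

    def write(self, key, value):
        version, _ = self.data.get(key, (0, None))
        self.data[key] = (version + 1, value)

    def recover_from(self, peer):
        """Copy every entry that is missing here or older than the peer's copy."""
        copied = 0
        for key, (version, value) in peer.data.items():
            if key not in self.data or self.data[key][0] < version:
                self.data[key] = (version, value)
                copied += 1
        return copied


primary, standby = Replica(), Replica()
for replica in (primary, standby):
    replica.write("stock:42", 10)
primary.write("stock:42", 9)      # standby misses this update while it is down
primary.write("order:1", "new")   # ... and this one

copied = standby.recover_from(primary)  # state transfer, no log replay
```

After the call, the standby holds the same state as the primary without either side having kept a recovery log.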
14
Trend in OLTP (4/5): High availability vs. logging
[Figure: production server and standby server; labels "Recovery from" and "24x7 service"]
15
Trend in OLTP (5/5): Transaction variants
Why transaction variants?
– The two-phase commit protocol harms the performance of large-scale distributed DB systems
– Two-phase commit consists of a commit-request phase and a commit phase, and every server must participate in both
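To make the cost of two-phase commit concrete, here is a minimal sketch of the two phases. It assumes a single coordinator and ignores timeouts, failures, and logging; the `Participant` class and `two_phase_commit` function are hypothetical names introduced for illustration.

```python
class Participant:
    """Hypothetical server taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1 (commit-request): vote yes or no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, commit):
        # Phase 2 (commit): apply the coordinator's decision.
        self.state = "committed" if commit else "aborted"


def two_phase_commit(participants):
    """Return (decision, requests_sent); every server is contacted twice."""
    votes = [p.prepare() for p in participants]   # round 1: all servers
    decision = all(votes)
    for p in participants:                        # round 2: all servers
        p.finish(decision)
    return decision, 2 * len(participants)


servers = [Participant("A"), Participant("B"), Participant("C")]
committed, requests_sent = two_phase_commit(servers)

with_failure = [Participant("A"), Participant("B", can_commit=False)]
aborted, _ = two_phase_commit(with_failure)
```

The number of coordinator requests grows with the number of servers, and a single "no" vote aborts the transaction everywhere, which is the overhead the slide refers to.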
16
Trend in OLTP (5/5): Transaction variants
Trade consistency for performance
Eventual consistency: all writes eventually propagate among the database servers.
17
Trend in OLTP (5/5): Transaction variants
Eventual consistency example: Amazon
Sx, Sy, Sz are different servers
D1, D2 add items to the cart
D3 deletes an item from the cart
D4 adds another item to the cart
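The cart example can be sketched as follows. The original figure is not available, so this is a guess at the scenario rather than a reconstruction: the item names, which writes each server sees first, and the rule of replaying writes in write-id order are all assumptions made for illustration.

```python
def cart_from(ops):
    """Replay the writes a server has seen, in write-id order."""
    cart = set()
    for _, kind, item in sorted(ops):
        if kind == "add":
            cart.add(item)
        else:
            cart.discard(item)
    return cart


# Writes D1..D4 from the slide; the item names are made up for illustration.
D1 = (1, "add", "book")
D2 = (2, "add", "pen")
D3 = (3, "delete", "book")
D4 = (4, "add", "lamp")

# Assumption: each server initially sees only some of the writes.
seen = {"Sx": {D1, D2}, "Sy": {D1, D3}, "Sz": {D2, D4}}
before = {name: cart_from(ops) for name, ops in seen.items()}

# Propagation: every server eventually receives every write.
all_writes = set().union(*seen.values())
after = {name: cart_from(all_writes) for name in seen}
```

Before propagation the three servers report different carts; once every write has reached every server, they all agree on the same cart.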
18
Research groups interested in new DB architectures: Amazon, HP, NYU, MIT
19
Trend in OLTP - Summary Eventual consistency
20
DBMS modification by OLTP trends (1) A memory-resident DB can get rid of buffer management
21
DBMS modification by OLTP trends (2) Single-threaded execution can avoid locking and latching
22
DBMS modification by OLTP trends (3) Cluster computing avoids locking: instead of one multithreaded processor, each processor is responsible for its own thread
23
DBMS modification by OLTP trends (4) High availability can avoid using the log for recovery
[Figure: production server and standby server; label "Recovery from"]
24
DBMS modification by OLTP trends (5) Transaction-less operation avoids bookkeeping, i.e. logging
25
Case study of a DBMS: Shore
Shore was developed at the University of Wisconsin in the early 1990s.
It was designed to be a typed, persistent object system borrowing from both file system and object-oriented database technologies.
http://www.cs.wisc.edu/shore
26
Basic components of Shore 2B3L
27
Removing Shore’s components
28
Remove Shore's logging (1)
Increase the log buffer size so that it is never flushed to disk
29
Remove Shore's locking (2)
Configure the lock manager to always grant locks
Remove the code that handles ungranted lock requests
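The "always grant" modification can be illustrated with a toy lock manager. This is not Shore's code (Shore is written in C++); the `LockManager` class and its `always_grant` flag are hypothetical, and the sketch only handles exclusive locks.

```python
class LockManager:
    """Toy lock manager; `always_grant` mimics the modification on the slide."""

    def __init__(self, always_grant=False):
        self.always_grant = always_grant
        self.owners = {}  # resource -> transaction id holding an exclusive lock

    def acquire(self, txn, resource):
        if self.always_grant:
            return True   # no lock-table lookup, no waiting
        holder = self.owners.setdefault(resource, txn)
        return holder == txn  # False stands in for "request not granted"

    def release(self, txn, resource):
        if self.owners.get(resource) == txn:
            del self.owners[resource]


normal = LockManager()
stripped = LockManager(always_grant=True)

normal_results = [normal.acquire("T1", "row:7"), normal.acquire("T2", "row:7")]
stripped_results = [stripped.acquire("T1", "row:7"), stripped.acquire("T2", "row:7")]
```

Granting every request is only safe when transactions never overlap, which is why this change goes together with single-threaded execution.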
30
Remove Shore's latching (3)
Add an if-else statement to avoid requesting latches
Replace the original latching-intensive B-tree with a latch-free B-tree
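The if-else bypass can be sketched on a toy shared structure. Again this is an illustration, not Shore's implementation: the `Page` class, the `use_latch` flag, and the request counter are hypothetical, and a `threading.Lock` stands in for a latch.

```python
import threading
from contextlib import nullcontext


class Page:
    """Toy shared structure protected by a latch (a short-term mutex)."""

    def __init__(self, use_latch=True):
        self.use_latch = use_latch
        self._latch = threading.Lock()
        self.latch_requests = 0
        self.slots = []

    def _guard(self):
        # The if-else from the slide: only request the latch when enabled.
        if self.use_latch:
            self.latch_requests += 1
            return self._latch
        return nullcontext()

    def insert(self, record):
        with self._guard():
            self.slots.append(record)


latched, latch_free = Page(use_latch=True), Page(use_latch=False)
for page in (latched, latch_free):
    for record in range(3):
        page.insert(record)
```

Both pages end up with the same contents, but the latch-free one never touches the mutex, which is acceptable only when a single thread accesses the structure.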
31
Remove Shore's buffer management (4)
Replace buffer management by directly invoking malloc for memory allocation
32
Shore after components removal
33
Benchmark for comparison (TPC-C)
TPC-C is an industry-standard benchmark for measuring OLTP performance, modelled on an order-entry workload.
TPC-C is designed to represent any industry that must manage, sell, or distribute a product or service.
Vendors include Microsoft, Oracle, IBM, Sybase, Sun, HP, Dell, etc.
http://www.tpc.org/tpcc/default.asp
34
Benchmark for comparison
1 warehouse (~100 MB) serves 10 districts, and each district serves 3,000 customers.
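The scaling figures above imply the following arithmetic. The function name `tpcc_scale` is made up for this sketch, and the 100 MB per warehouse is the rough figure quoted on the slide, not an exact value.

```python
DISTRICTS_PER_WAREHOUSE = 10
CUSTOMERS_PER_DISTRICT = 3000
APPROX_MB_PER_WAREHOUSE = 100  # rough figure quoted on the slide


def tpcc_scale(warehouses):
    """Return (districts, customers, approximate size in MB)."""
    districts = warehouses * DISTRICTS_PER_WAREHOUSE
    customers = districts * CUSTOMERS_PER_DISTRICT
    return districts, customers, warehouses * APPROX_MB_PER_WAREHOUSE


one = tpcc_scale(1)    # a single warehouse
five = tpcc_scale(5)   # a hypothetical five-warehouse configuration
```

A single warehouse therefore covers 30,000 customers, and the database size grows linearly with the number of warehouses.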
35
Benchmark for comparison
5 concurrent transactions in TPC-C
– New Order transaction
– Payment
– Deliver order
– Check status of order
– Monitor stock level of warehouse
36
Experiment setup and measurement
Single-core Pentium 4, 3.2 GHz, with 1 MB L2 cache, hyper-threading disabled, 1 GB RAM, running Linux 2.6.
40,000 transactions of types New Order and Payment are run.
Results measured in
– 1) Throughput (# transactions completed / time)
– 2) Instruction count
37
Results after removing the components (throughput)
Memory-resident Shore provided 640 transactions per second.
Stripped-down Shore provided 12,700 transactions per second.
The stripped-down Shore gave roughly a 20-fold improvement in throughput.
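The quoted improvement can be checked with a short calculation. The two rates come from the slide; the run times are derived values that assume the whole 40,000-transaction run proceeded at those rates, which the slides do not state explicitly.

```python
def throughput(transactions_completed, elapsed_seconds):
    """Transactions per second."""
    return transactions_completed / elapsed_seconds


baseline_tps = 640      # memory-resident Shore, from the slide
stripped_tps = 12_700   # stripped-down Shore, from the slide

speedup = stripped_tps / baseline_tps

# Assumed: time to finish a 40,000-transaction run at each rate.
baseline_seconds = 40_000 / baseline_tps
stripped_seconds = 40_000 / stripped_tps
```

The ratio is a little under 20, which matches the "20 times" figure once rounded.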
38
Results after removing the components (instruction count)
Instructions doing useful work account for less than 2% of the instructions executed by the memory-resident DB.
39
Effect of removing different components for payment (1/6)
40
Effect of removing different components for payment (2/6)
41
Effect of removing different components for payment (3/6)
42
Effect of removing different components for payment (4/6)
43
Effect of removing different components for payment (5/6)
44
Effect of removing different components for payment (6/6)
45
Effect of removing different components for both order
46
Instruction and cycle comparison
47
Implications for future OLTP engines: concurrency control
Single-threaded transactions allow concurrency control to be turned off.
But many DBMS applications are not sufficiently well behaved for single-threaded execution.
Dynamic locking was experimentally the best concurrency control protocol for disk-based systems.
What concurrency control protocol is best?
48
Implications for future OLTP engines: multi-core support
Virtualization: each core is treated as a single-threaded machine
Intra-query parallelism: each processor runs part of a single query
49
Implications for future OLTP engines: replication management
Active-passive replication scheme with log
– The replica may not be consistent with the primary unless a two-phase commit protocol is used
– A log is required
Active-active replication scheme with transactions
– Two-phase commit introduces large latency for distributed replication
50
Implications for future OLTP engines: weak consistency
Eventual consistency: data is not immediately propagated across all nodes.
To be studied: the effect of different workload levels on the consistency level of the data.
51
Implications for future OLTP engines: cache-conscious B-trees
Cache misses in the B-tree code may well be the new bottleneck for the stripped-down system.
52
Conclusion
The most significant contributors to overhead are buffer management and locking operations, followed by logging and latching.
A fully stripped-down system performs orders of magnitude better than an unmodified system.
53
Thank you !