1 Chapter 2. Universal Tuning Considerations Spring 2002 Prof. Sang Ho Lee School of Computing, Soongsil University



2 Tuning the Guts
Concurrency control --- how to minimize lock contention
Recovery and logging --- how to minimize logging and dumping overhead
Operating system --- how to optimize buffer size, process (thread) scheduling, and so on
Hardware --- how to allocate disks, random access memory, and processors

3 Common Underlying Components of All Database Systems
Concurrency Control
Recovery Subsystem
Operating System
Processor(s), Disk(s), Memory

4 Concurrency Control Goals
Performance goal:
Reduce blocking --- one transaction waits for another to release its locks (should be occasional at most)
Reduce deadlock --- a set of transactions in which each transaction is waiting for another one in the set to release its locks (should be rare at most)

5 Implications of Performance Goal
Ideal transactions (units of work):
Acquire few locks and prefer read locks over write locks (reduce conflicts)
Acquire locks of 'fine granularity' (i.e. lock few bytes, to reduce conflicts)
Hold locks for little time (reduce waiting)

6 Correctness Goal
Goal: guarantee that each transaction appears to execute in isolation (degree 3 isolation)
Underlying assumption: if each transaction appears to execute alone, then execution will be correct
Implication: there is a tradeoff between performance and correctness

7 Isolation degree

8 Example: Simple Purchases
Consider the code for a purchase application of item i for price p:
If cash < p, then roll back transaction
inventory(i) := inventory(i) + p
cash := cash - p

9 Purchase Example – Two Applications
Purchase application program executions are P1 and P2.
P1 has item i with price 50; P2 has item j with price 75. Cash is 100.
If the purchase application is a single transaction, one of P1 and P2 will roll back.

10 Purchase Application – Concurrent Anomalies
By contrast, if each step of the purchase code on slide 8 is a separate transaction, then the following bad execution can occur:
P1 checks that cash ≥ 50. It is.
P2 checks that cash ≥ 75. It is.
P1 completes. Cash = 50
P2 completes. Cash = -25
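The bad execution above can be reproduced with a minimal sketch (plain Python standing in for unsynchronized transactions; names are illustrative):

```python
# Sketch of the anomaly, assuming each numbered step runs as its own
# transaction with no locking between the check and the update.
cash = 100
inventory = {"i": 0, "j": 0}

def check(price):          # step 1 of each purchase: read-only test
    return cash >= price

def apply(item, price):    # step 2: update inventory and cash
    global cash
    inventory[item] += price
    cash -= price

# Interleaving: both checks pass before either update runs.
ok1 = check(50)   # P1: cash >= 50 -> True
ok2 = check(75)   # P2: cash >= 75 -> True
if ok1:
    apply("i", 50)   # P1 completes: cash = 50
if ok2:
    apply("j", 75)   # P2 completes: cash = -25 (inconsistent!)
print(cash)  # -25
```

The check and the update see different states of cash, which is exactly what isolation is supposed to prevent.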

11 Purchase Application – Solution
Orthodox solution: make the whole program a single transaction.
Implication: cash becomes a bottleneck.
Chopping solution: find a way to rearrange and then chop up the program without violating degree 3 isolation properties.

12 Orthodox Solution
Begin transaction
If cash < p, then roll back transaction;
inventory(i) := inventory(i) + p;
cash := cash - p;
Commit transaction
The transaction should acquire an exclusive lock on "cash" from the beginning to avoid deadlock.
If "cash" were locked for the duration of the transaction and each transaction required one disk access (say 20 ms), then throughput would be limited to approximately 50 transactions per second.
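As a sanity check on the 50 transactions-per-second figure, the arithmetic is just the reciprocal of the lock hold time:

```python
# Back-of-envelope for the throughput bound: if "cash" is exclusively locked
# for the whole transaction and each transaction needs one 20 ms disk access,
# all transactions touching "cash" serialize behind that access.
disk_access_s = 0.020            # one disk access, 20 ms (the slide's figure)
max_tps = 1 / disk_access_s      # upper bound on transactions per second
print(max_tps)
```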

13 Purchase Application – Chopping Solution
Rewritten purchase application; each bullet is a separate transaction:
If cash < p, then roll back application
cash := cash - p
inventory(i) := inventory(i) + p
"cash" is no longer a bottleneck.
When can chopping work? (See the appendix of the textbook for details)
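A minimal sketch of the chopped purchase, assuming a `threading.Lock` stands in for the DBMS's record lock on cash (names are illustrative). The point is that the check and the cash decrement now form one short transaction, so cash is locked only briefly and can never go negative:

```python
import threading

cash = 100
inventory = {"i": 0, "j": 0}
cash_lock = threading.Lock()   # stand-in for the DBMS lock on "cash"

def purchase(item, price):
    global cash
    # Transaction 1: test-and-decrement cash under one short lock.
    with cash_lock:
        if cash < price:
            return False        # "roll back": nothing has been written
        cash -= price
    # Transaction 2: update the item's inventory; no lock on cash is held.
    inventory[item] += price
    return True

p1 = purchase("i", 50)   # succeeds, cash = 50
p2 = purchase("j", 75)   # fails the check, cash unchanged
print(cash, p1, p2)
```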

14 When to Chop Transactions (1)
Whether or not a transaction T may be broken up into smaller transactions depends partially on what is concurrent with T:
Will the transactions that are concurrent with T cause T to produce an inconsistent state or to observe an inconsistent value if T is broken up?
Will the transactions that are concurrent with T be made inconsistent if T is broken up?
A rule of thumb for chopping transactions: if transaction T accesses data X and Y, but any other transaction T' accesses at most one of X or Y, then T can often be divided into two transactions.
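The rule of thumb can be expressed as a tiny checker (a hypothetical helper, not from the textbook):

```python
# Rule of thumb: T touching items X and Y can often be split if every
# concurrent transaction touches at most one of T's items.
def choppable(t_items, concurrent_access_sets):
    """t_items: set of items T accesses.
    concurrent_access_sets: one set of items per concurrent transaction.
    Returns True if each concurrent transaction overlaps T on <= 1 item."""
    return all(len(t_items & other) <= 1 for other in concurrent_access_sets)

print(choppable({"X", "Y"}, [{"X"}, {"Y"}]))   # slide 15's case: True
print(choppable({"X", "Y"}, [{"X", "Y"}]))     # T' sees both: False
```

Note this is only the "often" direction of the heuristic; the textbook's appendix gives the precise conditions.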

15 When to Chop Transactions (2)
[Diagram: T1 accesses X and Y; T2 accesses X; T3 accesses Y]
Splitting T1 into two transactions, one accessing X and one accessing Y, will not disturb the consistency of T2 or T3.

16 Recovery Hacks for Chopping
Must keep track of which transaction piece has completed in case of a failure.
Suppose each user C has a table UserX.
As part of the first piece, perform insert into UserX(i, p, 'piece 1'), where i is the inventory item and p is the price.
As part of the second piece, perform insert into UserX(i, p, 'piece 2').
Recovery requires re-executing the second pieces of inventory transactions whose first pieces have finished.
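The bookkeeping can be sketched against SQLite standing in for the real DBMS (table and column names here are illustrative):

```python
import sqlite3

# Sketch of the recovery bookkeeping for chopped transactions.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE UserX (item INT, price INT, piece TEXT)")

# The first piece commits its marker together with its updates.
db.execute("INSERT INTO UserX VALUES (?, ?, 'piece 1')", (7, 50))
db.commit()
# ... crash here: the second piece never ran for item 7 ...

# Recovery: find first pieces with no matching second piece, re-run those.
orphans = db.execute("""
    SELECT item, price FROM UserX a
    WHERE piece = 'piece 1'
      AND NOT EXISTS (SELECT 1 FROM UserX b
                      WHERE b.piece = 'piece 2' AND b.item = a.item)
""").fetchall()
print(orphans)  # item 7 needs its second piece re-executed
```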

17 Lock Tuning Should Proceed along Several Fronts
Eliminate locking when it is unnecessary
Take advantage of transactional context to chop transactions into small pieces
Weaken isolation guarantees when the application allows it
Use special system facilities for long reads
Select the appropriate granularity of locking
Change your data description language (DDL) statements during quiet periods only
Think about partitioning
Circumvent hot spots
Tune the deadlock interval

18 Sacrificing Isolation for Performance
A transaction that holds locks during a screen interaction is an invitation to bottlenecks.
Airline reservation example:
Retrieve the list of seats available
Talk with the customer regarding availability
Secure the seat
Performance would be intolerably slow if this were a single transaction, because each customer would hold a lock on the seats available.

19 Solution
Make the first and third steps transactions; keep the user interaction outside a transactional context.
Problem: a customer may ask for a seat but then find it is unavailable. Possible, but tolerable.

20 Data Description Language Statements Are Considered Harmful
DDL is the language used to access and manipulate the system catalog.
Catalog data must be accessed by every transaction. As a result, the catalog can easily become a hot spot and therefore a bottleneck.
The general recommendation is to avoid updates to the system catalog during heavy system activity, especially if you are using dynamic SQL.

21 Circumvent Hot Spots
A hot spot is a piece of data that is accessed by many transactions and is updated by some.
Three techniques for circumventing hot spots:
Use partitioning to eliminate the hot spot
Access the hot spot as late as possible in the transaction
Use special database management facilities

22 Low Isolation Counter Facility (1)
Consider an application that assigns sequential keys from a global counter, e.g. each new customer must get a new identifier.
The counter can become a bottleneck, since every transaction locks the counter, increments it, and retains the lock until the transaction ends.
Oracle offers a facility (called a SEQUENCE) to release the counter lock immediately after the increment.
This reduces lock contention, but a transaction abort may cause certain counter numbers to be unused.

23 Low Isolation Counter Facility (2) – Effect of Caching
The Oracle counter facility can still be a bottleneck if every update is logged to disk.
So Oracle offers the possibility to cache values of the counter. If 100 values are cached, for instance, then the on-disk copy of the counter is updated only once in every 100 increments.
In an experiment with bulk loads where each inserted record was to get a different counter value, the time to load 500,000 records went from 12.5 minutes to 6.3 minutes when the cache was increased from 20 to 1000.
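The arithmetic behind the caching claim is simple: with a cache of N sequence values, the on-disk counter is updated roughly once per N assignments.

```python
import math

# Disk updates of the counter for the experiment's record count and two
# cache sizes (the 500,000-record figure is from the slide).
records = 500_000
for cache in (20, 1000):
    disk_updates = math.ceil(records / cache)
    print(cache, disk_updates)   # 20 -> 25,000 writes; 1000 -> 500 writes
```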

24 Select Proper Granularity
Use record-level locking if few records are accessed (unlikely to be on the same page).
Use page- or table-level locking for long update transactions: fewer deadlocks and less locking overhead.
In INGRES, it is possible to choose between page and table locks with respect to a given table, within a program or DB startup file.

25 Load Control
Allowing too many processes to access memory concurrently can result in memory thrashing in normal time-sharing systems.
There is an analogous problem in database systems if there are too many transactions and many are blocked due to locks.
Gerhard Weikum and his group at ETH have discovered the following rule of thumb: if more than 23% of the locks are held by blocked transactions at peak conditions, then the number of transactions allowed in the system is too high.
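The rule of thumb could be turned into a monitoring check along these lines (a hypothetical sketch; real systems expose lock tables differently):

```python
# Flag overload when more than 23% of currently held locks belong to
# transactions that are themselves blocked waiting for some lock.
def overloaded(locks_held_by, blocked_txns, threshold=0.23):
    """locks_held_by: dict txn -> number of locks it currently holds.
    blocked_txns: set of transactions currently waiting for a lock."""
    total = sum(locks_held_by.values())
    by_blocked = sum(n for t, n in locks_held_by.items() if t in blocked_txns)
    return total > 0 and by_blocked / total > threshold

print(overloaded({"T1": 5, "T2": 3, "T3": 2}, {"T2"}))   # 3/10 = 30% -> True
print(overloaded({"T1": 8, "T2": 1, "T3": 1}, {"T3"}))   # 1/10 = 10% -> False
```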

26 Recovery Subsystem
Logging and recovery are pure overhead from the point of view of performance, but necessary for applications that update data and require fault tolerance.
Tuning the recovery subsystem entails:
Managing the log
Choosing a data buffering policy
Setting checkpoint and database dump intervals

27 Recovery Concept
Goal of recovery: the effects of committed transactions should be permanent, and transactions should be atomic. That is, even in the face of failures, the effects of committed transactions should be permanent; aborted transactions should leave no trace.
Two kinds of failures: main memory failures and disk failures.
[Diagram: transaction states --- active, commit, abort.] Once a transaction enters commit or abort, it cannot change its mind.

28 What about Software?
The trouble is that most system stoppages these days are due to software. Tandem reports that under 10% of remaining failures are hardware failures (J. Gray, 1990).
Nearly 99% of software bugs are "Heisenbugs" (one sees them once and then never again) (study on IBM/IMS, 1984), because of some unusual interaction between different components.
Hardware-oriented recovery facilities (such as mirrored disks, dual-powered controllers, dual-bus configurations, back-up processors) are useful for Heisenbugs.

29 Logging Principles
Log --- informally, a record of the updates caused by transactions. Must be held on durable media (e.g. disks).
A transaction must write its updates to the log before committing, that is, before the transaction ends. Result: committed updates survive even if main memory fails.
Sometime later, those updates must be written to the database disks. How much later is a tuning knob.

30 Logging Principles (2)
[Diagram: updates move from the buffer (unstable) to the log (stable) just before commit, and from the buffer to the database disks after commit.]

31 Managing the Log
Writes to the log occur sequentially.
Writes to disk occur (at least) 10 times faster when they occur sequentially than when they occur randomly.
Conclusion: a disk that has the log should have no other data.
Better reliability, too: separation makes log failures independent of database disk failures.

32 Buffered Commit
Writes of after images to database disks will be random. They are not really necessary from the point of view of recovery.
Unbuffered commit strategy: write the after images into the database right after committing; recovery from random access memory failures is fast, but every commit pays for random writes.
Conclusion: a good tuning option is to do "buffered commit". Net effect: wait until writing an after image to the database costs no seeks.

33 Buffered Commit – Tuning Considerations
Must specify buffered commit, e.g. the Fast_Commit option in INGRES. It is the default in most systems.
The buffer and log both retain all items that have been written to the log but not yet to the database. This can consume more free pages in the buffer.
Regulate the frequency of writes from buffer to database by the number of WRITE_BEHIND threads (processes) in INGRES, or by the DBWR parameters in Oracle.
Costs: buffer and log space.

34 Group Commits
If the update volume is EXTREMELY high, then performing a write to the log for every transaction may be too expensive, even in the absence of seeks.
One way to reduce the number of writes is to write the updates of several transactions together in one disk write.
This reduces the number of writes at the cost of increasing the response time (since the first transaction in a group will not commit until the last transaction of the group).
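The write-count saving can be seen with a one-line model (group size and transaction count below are made-up illustration numbers):

```python
# Group commit: K transactions share one physical log write, so the number
# of writes drops by roughly a factor of K, while the first transaction in
# each group waits for the rest to arrive before it can commit.
def log_writes(txns, group_size):
    return -(-txns // group_size)   # ceiling division: one write per group

txns = 10_000
print(log_writes(txns, 1))    # one write per transaction
print(log_writes(txns, 8))    # one write per group of 8
```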

35 Batch to Mini-Batch
Consider an update-intensive batch transaction. If concurrent contention is not an issue, then it can be broken up into short transactions known as mini-batch transactions.
Example: a transaction that updates, in sorted order, all accounts that had activity on them in a given day. Break it up into mini-transactions, each of which accesses 10,000 accounts and then updates a global counter. Easy to recover; doesn't overfill the buffer.
Caveat: because this transformation is a form of transaction chopping, you must ensure that you maintain any important isolation guarantees.

36 Mini-Batch Example
Suppose there are 1 million account records and 100 transactions, each of which takes care of 10,000 records.
Transaction 1: Update account records 1 to 10,000; global.count := 10,000
Transaction 2: Update account records 10,001 to 20,000; global.count := 20,000
And so on.
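A runnable sketch of the pattern, with SQLite standing in for the DBMS and smaller numbers than the slide (100,000 accounts instead of 1 million, for speed). Each chunk is one short transaction that also advances a persistent counter, so a crash can resume from the counter:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (id INT PRIMARY KEY, balance INT)")
db.executemany("INSERT INTO account VALUES (?, 0)",
               [(i,) for i in range(1, 100_001)])
db.execute("CREATE TABLE progress (count INT)")   # the global counter
db.execute("INSERT INTO progress VALUES (0)")
db.commit()

CHUNK = 10_000
done = db.execute("SELECT count FROM progress").fetchone()[0]
while done < 100_000:
    db.execute("UPDATE account SET balance = balance + 1 "
               "WHERE id > ? AND id <= ?", (done, done + CHUNK))
    db.execute("UPDATE progress SET count = ?", (done + CHUNK,))
    db.commit()            # one mini-batch transaction commits here
    done += CHUNK

print(db.execute("SELECT count FROM progress").fetchone()[0])
```

After a crash, restarting the loop re-reads `progress` and skips the chunks that already committed.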

37 Operating System Considerations
Scheduling and priorities of processes (threads) of control
Size of virtual memory for database shared buffers

38 Processes (Threads)
Switching control from one process to another is expensive on some systems (about 1,000 instructions).
Run non-preemptively or with a long (say 1 second) timeslice.
Watch priority: the database system should not run below the priority of other applications.
Avoid priority inversion.

39 Priority Inversion
Suppose that transaction T1 has the highest priority, followed by T2 (which is at the same priority as several other transactions), followed by T3.
[Diagram: T3 holds a lock on X; T1 requests X.]
T1 waits for a lock that only T3 can release, but T2 runs instead.
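The inversion can be seen in a toy scheduler trace (an illustration, not OS code; the priority numbers are made up):

```python
# The scheduler always runs the highest-priority *runnable* transaction.
# T1 is blocked on a lock that only low-priority T3 can release, so the
# medium-priority T2 runs instead -- and T3 never gets to release the lock.
PRIORITY = {"T1": 3, "T2": 2, "T3": 1}   # higher number = higher priority
lock_holder = "T3"                        # T3 already holds the lock on X
blocked = {"T1"}                          # T1 has requested X and waits

runnable = [t for t in PRIORITY if t not in blocked]
running = max(runnable, key=PRIORITY.get)
print(running)   # T2 runs, even though T1 has the highest priority
```

Priority inheritance (temporarily boosting T3 to T1's priority) is the standard fix.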

40 Buffer Memory — Definitions
Buffer memory — shared virtual memory (RAM + disk)
Logical access — a database management process read or write call
Physical access — a logical access that is not served by the buffer
Hit ratio — the portion of logical accesses satisfied by the buffer

41 Database Buffer
[Diagram: database processes read and write through the buffer (virtual memory: RAM plus paging disk) in front of the database disks.]
The buffer is in virtual memory, though its greater part should be in random access memory.

42 Buffer Memory — Tuning Principles
If the buffer is too small, then the hit ratio is too small. Some systems offer a facility to see what the hit ratio would be if the buffer were larger, e.g. the X$KCBRBH table in Oracle. Disable it in normal operation to avoid overhead.
If the buffer is too large, then the hit ratio may be high, but virtual memory may be larger than the RAM allocation, resulting in paging.
Recommended strategy: increase the buffer size until the hit ratio flattens out. If paging, then buy memory.
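A toy LRU simulation shows the "flattening" behavior: the hit ratio jumps once the buffer holds the working set and gains nothing beyond that (the access trace below is synthetic):

```python
from collections import OrderedDict

# LRU buffer simulation: compute the hit ratio for a given buffer size.
def hit_ratio(trace, buffer_pages):
    buf, hits = OrderedDict(), 0
    for page in trace:
        if page in buf:
            hits += 1
            buf.move_to_end(page)         # mark as most recently used
        else:
            if len(buf) >= buffer_pages:
                buf.popitem(last=False)   # evict least recently used
            buf[page] = True
    return hits / len(trace)

trace = [p % 50 for p in range(10_000)]   # cyclic working set of 50 pages
for size in (10, 50, 100):
    print(size, hit_ratio(trace, size))
```

With a cyclic trace, LRU is pessimal below the working-set size (every access misses) and near-perfect at or above it; growing the buffer past 50 pages buys nothing, which is exactly when to stop adding buffer.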

43 Scenario 1
Many scans are performed.
Disk utilization is high (long disk access queues).
Processor and network utilization is low.
Execution is too slow, but management refuses to buy disks.

44 Scenario 1: What's Wrong?
Clearly, I/O is the bottleneck. Possible reasons:
The load is intrinsically heavy --- buy a disk with your own money
Data is badly distributed across the disks, entailing unnecessary seeks
Disk accesses fetch too little data
Pages are underutilized

45 Scenario 1: an Approach
Since processor and network utilization is low, we conclude that the system is I/O-bound.
Reorganize files to occupy contiguous portions of disk.
Raise the prefetching level.
Increase page utilization.

46 Scenario 2
A new credit card offers large lines of credit at low interest rates.
The set-up transaction has three steps:
1. Obtain a new customer number from a global counter
2. Ask the customer for certain information, e.g., income, mailing address
3. Install the customer into the customer table
The transaction rate cannot support the large insert traffic.

47 Scenario 2: What's Wrong?
Lock contention is high on the global counter.
In fact, while one customer is entered, no other customer can even be interviewed.

48 Scenario 2: Action
Conduct the interview outside a transactional context.
Obtain the lock on the global counter as late as possible, or use a special increment facility if available.
The lock is then held on the counter only while it is incremented.

49 Scenario 3
The accounting department wants the average salary by department. Arbitrary updates may occur concurrently.
Slow first implementation:
Begin transaction
SELECT dept, avg(salary) as avgsalary
FROM employee
GROUP BY dept
End transaction

50 Scenario 3: What's Wrong?
Lock contention on employee tuples either causes the updates to block or the scan to abort. Check the deadlock rate and lock queues.
Buffer contention causes the sort that is part of the GROUP BY to be too slow. Check the I/O needs of the grouping statement.

51 Scenario 3: Action
Partition in time or space. In descending order of preference:
1. Pose this query when there is little update activity, e.g., at night
2. Execute this query on a slightly out-of-date copy of the data
3. If your database management system has a facility to perform read-only queries without obtaining locks, then use it
4. Use degree 1 or 2 isolation; get an approximate result

52 Scenario 4
Three new pieces of knowledge:
1. The only updates will be updates to individual salaries; no transaction will update more than one record
2. The answer must be consistent
3. The query must execute on up-to-date data
What does this change?

53 Scenario 4: Action
Degree 2 isolation will give the equivalent of degree 3 isolation in this case.
The reason is that each concurrent update transaction accesses only a single record, so the degree 2 grouping query will appear to execute serializably with each update.