Presentation is loading. Please wait.

Presentation is loading. Please wait.

Quantifying Isolation Anomalies Alan Fekete Shirley Goldrei Jorge Perez Asenjo At VLDB’09, Lyon, August 2009.

Similar presentations


Presentation on theme: "Quantifying Isolation Anomalies Alan Fekete Shirley Goldrei Jorge Perez Asenjo At VLDB’09, Lyon, August 2009."— Presentation transcript:

1 Quantifying Isolation Anomalies Alan Fekete Shirley Goldrei Jorge Perez Asenjo At VLDB’09, Lyon, August 2009

2 Overview The Issue The Microbenchmark Predicting the rate of integrity violation Validating the Predictions Finding an Inversion Conclusion

3 Isolation Transactions should be ACID –Atomic –Consistent –Isolated –Durable Isolation achieved by concurrency control

4 Two phase locking Set commit duration read and write locks before accessing items –Also locks on indices (eg next key locking) to prevent phantoms Guarantees serializable execution –Therefore, every integrity constraint will be preserved in the data BUT: performance can suffer a lot

5 Weaker isolation Traditional dbms offer application developer a choice among isolation levels –Eg Read Committed: read locks are held only till the read is completed, not till transaction completion

6 The tradeoff Serializable Weaker isolation Data integrity, poor performance Loss of data integrity, better performance

7 Quantification We know how to capture performance gains in numbers –Eg Txn Per Minute –Standard benchmarks –Analytical models But how to similarly understand the loss of data integrity from weak isolation?

8 A Microbenchmark Goal: explore the design space –How configuration, isolation mechanism etc impact on the amount of data corruption Not: be realistic Not: compare platforms

9 Overview The Issue The Microbenchmark Predicting the rate of integrity violation Validating the Predictions Finding an Inversion Conclusion

10 Style Like performance benchmarks –A particular schema, random data of given size –A mix of particular transaction types –Run concurrent clients, submitting transactions of randomly chosen types with random parameters –Warmup, then measurement interval, then collect data What to measure is different: the number of cases where integrity constraint is violated, in the data state after the run

11 The Data TableA(int id, int valueA, varchar(100) description) TableB(int id, int valueB, varchar(100) description) Integrity: for given id, TableA.valueA+TableB.valueB between 0 and 99 inclusive

12 The transaction types ChangeA (id): –Read TableA.valueA –Long pause –Read TableB.valueB –Long pause –Update TableA.valueA, adding delta If sum seen as 0..49, delta = 50 If sum seen as 50..99, delta = -50 If sum violates integrity, or during warmup, delta = 0 Similar ChangeB(id), ChangeAB(id) –ChangeAB applies delta/2 to TableA.valueA, delta/2 to TableB.valueB

13 Configuration choices Which Isolation level –Especially SI (Snapshot isolation) and RC_MV (multiversion with each select seeing the most recent data that had committed at the time when the select statement was executed) MPL Size of hot-spot with contention on data Mix of transaction types Length of pauses

14 Complexities If integrity violation is seen, make no change –Otherwise, hard to understand situation at low isolation levels, where inconsistent reads can lead to unjustified appearance of violation Runs short enough to limit impact of cases of no change –But long enough to give reasonable confidence interval on measurements –Solution: super-runs!

15 Overview The Issue The Microbenchmark Predicting the rate of integrity violation Validating the Predictions Finding an Inversion Conclusion

16 Simplistic model Probability that a given transaction introduces an anomaly –Chance of bad overlap with another transaction

17 Overview The Issue The Microbenchmark Predicting the rate of integrity violation Validating the Predictions Finding an Inversion Conclusion

18 Violation Rate In our study, the main dependent variable (Y-axis in graphs) is violation rate –The number of rows where integrity condition is false at the end of the run, divided by the number of committed transactions that were performed –Typical values around 1%

19 Violation Rate vs MPL Predictions from our model Measured

20 Violation rate vs Txn Mix

21 Violation Rate vs Pauses

22 Trends Violation rate proportional to (MPL-1) –So number of violations proportional to MPL 2 Violation rate proportional to 1/hotspot At SI, violations only from A vs B At RC_MV, violations from other combinations as well At RC_MV, more violations with longer pause from read(B) to updates

23 Overview The Issue The Microbenchmark Predicting the rate of integrity violation Validating the Predictions Finding an Inversion Conclusion

24 Conventional Wisdom SI is a rather strong isolation level, and RC_MV is rather weak More violations of integrity at RC_MV than at SI –Oracle etc use SI for serializable, and RC_MV for Read Committed But it is known that RC_MV is not strictly weaker than SI –There exist interleavings which are serializable under RC_MV (read latest committed data) but not when executed under SI (read data committed at time txn started)

25 An inversion Configuration found from predictive model

26 Overview The Issue The Microbenchmark Predicting the rate of integrity violation Validating the Predictions Finding an Inversion Conclusion

27 We can understand the extent of integrity loss from weak isolation –Quantitative –Trends with configuration parameters Further work: include phantoms Further work: extend predictive model to other transaction types and data schemas


Download ppt "Quantifying Isolation Anomalies Alan Fekete Shirley Goldrei Jorge Perez Asenjo At VLDB’09, Lyon, August 2009."

Similar presentations


Ads by Google