SIGMOD 99 Efficient Concurrency Control in Multidimensional Access Methods Kaushik Chakrabarti Sharad Mehrotra University of.

Presentation transcript:

SIGMOD 99
Efficient Concurrency Control in Multidimensional Access Methods
Kaushik Chakrabarti, University of Illinois at Urbana-Champaign
Sharad Mehrotra, University of California at Irvine
Presented at ACM SIGMOD Conference, June 1, 1999

Outline of talk
Introduction
Background
Phantom protection in Generalized Search Trees
–Define granules
–Describe lock protocols
Experiments
Conclusion

Introduction
An increasing number of applications deal with multidimensional data
Examples: spatial (CAD, GIS), spatio-temporal (moving objects, weather)
The DBMS should allow applications to:
(1) define their own data types and operations
(2) define multidimensional access methods (AMs) for those data types for efficient query processing
Object-relational (OR) technology solves (1)
Generalized Search Trees (GiSTs) address (2)

Introduction
For successful integration, we need to support concurrent accesses via the GiST
Concurrency control problems:
(1) Preserve consistency of the data structure
(2) Prevent phantom anomalies
(1) has been addressed in Kornacker, Mohan and Hellerstein, SIGMOD97
This paper addresses the problem of phantom protection in GiSTs

Phantom
Definition:
–T1 reads a set of items satisfying some search condition
–T2 creates data items that satisfy T1's search condition and commits
–T1 repeats its scan with the same search condition and gets a different set of items
Serializability $\Rightarrow$ no phantoms
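The anomaly is easy to reproduce when nothing protects the search condition. The sketch below is a toy under stated assumptions: the table is an in-memory list, there is no locking at all, and the names `scan`, `table`, and `in_window` are hypothetical and do not come from the talk.

```python
# Toy illustration of a phantom: with no locking, T2's insert slips in
# between T1's two scans. All names here are hypothetical.

def scan(table, predicate):
    """Return the items in `table` that satisfy `predicate`."""
    return [item for item in table if predicate(item)]

# A table of 2-d points; T1's search condition is a rectangular window.
table = [(1, 1), (2, 5), (8, 8)]

def in_window(p):
    return 0 <= p[0] <= 4 and 0 <= p[1] <= 6

first_read = scan(table, in_window)    # T1 reads
table.append((3, 3))                   # T2 inserts a matching item and commits
second_read = scan(table, in_window)   # T1 repeats the same scan

# Items that appear only in the second scan are the phantoms.
phantoms = [p for p in second_read if p not in first_read]
print(phantoms)  # [(3, 3)]
```

A phantom-protection protocol has to make T2's insert wait until T1 finishes, even though the inserted item did not exist when T1 first scanned and so could not have been locked individually.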

Example

Solution
Predicate locks: costly
Granular locks: efficient

Key Range Locking
ARIES/KVL (Mohan, 1990)

Phantoms in Spatial/Spatio-temporal Databases
Compute the average rainfall over all locations in a 2-d region, where the locations are indexed using a GiST
Get all objects in a given region from a moving objects database, where the objects are indexed using a GiST

Solutions
Adapting key range locking (KRL): too costly
Predicate locking based strategy by Kornacker, Mohan and Hellerstein, SIGMOD97
Our granular locking based approach for phantom protection in R-trees, ICDE98: does not work well when applied to GiSTs (details in paper)

Granular Locking in GiST
Solution involves:
–Define the granules
–Define the lock protocol for the operations
Challenges:
–"nice" granules
–handling overlap among granules
–handling the "loss of lock" problem
–high concurrency and low lock overhead

GiST
Keys can be arbitrary predicates
An AM can be implemented by specifying some extension methods, which dictate the tree operations

Granules in GiST
Leaf granules: one per leaf node
Non-leaf granules: one per non-leaf node
Lock name:
Lock coverage: defined by the Granule Predicate (GP), in terms of the bounding predicate (BP) of each node
–$GP(N) = BP(N)$ if $N$ is the root
–$GP(N) = BP(N) \wedge GP(P)$ otherwise, where $P = \mathrm{parent}(N)$
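The recursive definition of the granule predicate can be sketched for one concrete key type. This sketch assumes R-tree-style keys, where every bounding predicate is an axis-aligned rectangle and the conjunction of two predicates is their geometric intersection. `Rect`, `Node`, `intersect`, and `granule_predicate` are hypothetical names, not part of the GiST interface.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: bounding predicates are rectangles (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]

def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Conjunction of two rectangle predicates; None if unsatisfiable."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin > xmax or ymin > ymax:
        return None
    return (xmin, ymin, xmax, ymax)

@dataclass
class Node:
    bp: Rect                         # bounding predicate BP(N)
    parent: Optional["Node"] = None  # None for the root

def granule_predicate(node: Node) -> Optional[Rect]:
    """GP(N) = BP(N) at the root, BP(N) AND GP(parent(N)) otherwise."""
    if node.parent is None:
        return node.bp
    parent_gp = granule_predicate(node.parent)
    if parent_gp is None:
        return None
    return intersect(node.bp, parent_gp)

root = Node(bp=(0, 0, 10, 10))
child = Node(bp=(5, 5, 12, 12), parent=root)
print(granule_predicate(child))  # (5, 5, 10, 10)
```

In the example the child's bounding rectangle sticks out past the root's, so the lock on the child's granule covers only the part of the space that can be reached through the root.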

Locks

Overlap between granules
Correctness: $p \wedge p' \Rightarrow lset(p) \cap lset(p') \neq \mathrm{NULL}$
Problem does not arise in KRL
Policies:
–Overlap-for-Search & Cover-for-Insert (OSCI)
–Cover-for-Search & Overlap-for-Insert (CSOI)
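The correctness condition says that two conflicting operations must end up requesting locks on at least one common granule. The sketch below reads the OSCI policy name literally and is an interpretation, not the paper's algorithm: a search locks every granule that overlaps its query, and an insert locks a set of granules that covers the object. It assumes rectangular granules and point objects, so a single containing granule is enough to cover an insert. All function names are hypothetical.

```python
# Assumption: granules are rectangles (xmin, ymin, xmax, ymax) and inserted
# objects are points, so one containing granule fully covers an object.

def overlaps(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def contains(rect, point):
    return rect[0] <= point[0] <= rect[2] and rect[1] <= point[1] <= rect[3]

def search_lset(granules, query):
    """Overlap-for-Search: lock every granule that overlaps the query."""
    return {name for name, rect in granules.items() if overlaps(rect, query)}

def insert_lset(granules, point):
    """Cover-for-Insert: lock one granule that contains the point."""
    for name, rect in sorted(granules.items()):
        if contains(rect, point):
            return {name}
    return set()  # no granule covers the point (the "growth" case)

# Two granules that overlap each other, as GiST nodes may.
granules = {"g1": (0, 0, 5, 5), "g2": (4, 4, 10, 10)}
query = (3, 3, 6, 6)
point = (4.5, 4.5)  # satisfies the query, so the two operations conflict

common = search_lset(granules, query) & insert_lset(granules, point)
print(common)  # {'g1'}
```

Because the searcher locks both overlapping granules, the inserter is free to pick either one and the two lock sets still intersect. CSOI swaps the roles, placing the larger lock set on the inserter.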

Loss of lock coverage

Search Protocol
Get a commit duration S lock on the granule corresponding to each index node visited
Correctness:
–$GP(T) \wedge Q$ is satisfiable $\Rightarrow \forall i\; \mathrm{Consistent}(BP(P_i), Q)$, where each $P_i$ is an ancestor of $T$
Note:
–No object locks
–No extra cost except that of acquiring the lock (no extra checks)
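The search protocol piggybacks on the ordinary tree traversal, which the following minimal sketch tries to make concrete. It assumes rectangular bounding predicates and point objects, and it records lock requests in a plain list instead of calling a real lock manager. `Node`, `consistent`, and `search` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Point = Tuple[float, float]

def consistent(bp: Rect, q: Rect) -> bool:
    """Stand-in for the GiST Consistent method: do the rectangles overlap?"""
    return bp[0] <= q[2] and q[0] <= bp[2] and bp[1] <= q[3] and q[1] <= bp[3]

def inside(p: Point, q: Rect) -> bool:
    return q[0] <= p[0] <= q[2] and q[1] <= p[1] <= q[3]

@dataclass
class Node:
    name: str
    bp: Rect
    children: List["Node"] = field(default_factory=list)
    objects: List[Point] = field(default_factory=list)  # leaf nodes only

def search(root: Node, query: Rect, locks: list) -> List[Point]:
    """Traverse the tree, requesting an S lock on every node visited."""
    results = []
    stack = [root]
    while stack:
        node = stack.pop()
        locks.append((node.name, "S", "commit"))  # granule lock, no object locks
        if node.children:
            stack.extend(c for c in node.children if consistent(c.bp, query))
        else:
            results.extend(o for o in node.objects if inside(o, query))
    return results

leaf_a = Node("A", (0, 0, 5, 5), objects=[(1, 1), (4, 4)])
leaf_b = Node("B", (6, 6, 10, 10), objects=[(7, 7)])
root = Node("root", (0, 0, 10, 10), children=[leaf_a, leaf_b])

locks = []
print(search(root, (3, 3, 5, 5), locks))  # [(4, 4)]
print(locks)  # [('root', 'S', 'commit'), ('A', 'S', 'commit')]
```

The only work added to the traversal is the lock request itself: the set of locked granules is exactly the set of nodes the search had to visit anyway.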

Insert Protocol
Correctness:
–full coverage
–prevent phantoms due to loss of lock coverage

Insert Protocol
Case 1: No growth, no split
–commit duration IX lock on g (target granule)
–commit duration X lock on O
Case 2: Growth, no split
–2 locks as before
–short duration IX lock on lowest unchanged node (LU-node)
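Why an inserter's IX lock on the target granule is enough to block a conflicting search follows from the lock compatibility rules. The transcript does not reproduce a compatibility matrix, so the sketch below assumes the standard multigranularity matrix of Gray, Lorie and Putzolu, restricted to the four modes the slides mention (S, IX, SIX, X). The names `MODES` and `compatible` are hypothetical.

```python
# Assumption: standard multigranularity lock compatibility, restricted to
# the modes used in the talk. Among S, IX, SIX and X, the only compatible
# combinations are S with S and IX with IX.
MODES = ("S", "IX", "SIX", "X")
_COMPATIBLE_PAIRS = {("S", "S"), ("IX", "IX")}

def compatible(held: str, requested: str) -> bool:
    """True if `requested` can be granted while `held` is held by another txn."""
    if held not in MODES or requested not in MODES:
        raise ValueError("unknown lock mode")
    return (held, requested) in _COMPATIBLE_PAIRS

# A searcher holds a commit duration S lock on granule g.
# An inserter into g must wait, which is what prevents the phantom.
print(compatible("S", "IX"))   # False

# Two inserters into the same granule do not block each other on g;
# they can still conflict on the X locks they take on individual objects.
print(compatible("IX", "IX"))  # True
```

Under this matrix, concurrent searches (S with S) and concurrent inserts (IX with IX) proceed together on the same granule, while any search and insert on a common granule are serialized.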

Example

Insert Protocol
Case 3: No growth, split
–instant duration SIX on g
–commit duration IX on whichever granule contains O after the split; X on O
–instant duration SIX on each ancestor that splits
Case 4: Growth, split
–lock requirements of Cases 2 and 3
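The four insert cases can be tabulated as a small function from (growth, split) to a list of lock requests. This is a bookkeeping sketch only: the resource labels are placeholders and not real lock names, Case 4 is taken literally as the union of Cases 2 and 3, and the duration of the X lock on O in Case 3 is an assumption carried over from Case 1 because the slide does not state it. All function names are hypothetical.

```python
# Lock requests for inserting object O into target granule g, written as
# (resource, mode, duration) tuples. Resource labels are placeholders.

def case1():
    """No growth, no split."""
    return [("g", "IX", "commit"), ("O", "X", "commit")]

def case2():
    """Growth, no split: Case 1 plus a short IX on the LU-node."""
    return case1() + [("LU-node", "IX", "short")]

def case3(split_ancestors=()):
    """No growth, split."""
    requests = [
        ("g", "SIX", "instant"),
        ("granule containing O after split", "IX", "commit"),
        # Assumption: the slide gives no duration for this X lock;
        # commit duration is carried over from Case 1.
        ("O", "X", "commit"),
    ]
    requests += [(a, "SIX", "instant") for a in split_ancestors]
    return requests

def case4(split_ancestors=()):
    """Growth and split: union of Cases 2 and 3, duplicates removed."""
    merged = []
    for req in case2() + case3(split_ancestors):
        if req not in merged:
            merged.append(req)
    return merged

def insert_lock_requests(growth, split, split_ancestors=()):
    if split:
        return case4(split_ancestors) if growth else case3(split_ancestors)
    return case2() if growth else case1()

for req in insert_lock_requests(growth=True, split=True, split_ancestors=["parent"]):
    print(req)
```

Only the IX lock on the granule that finally holds O and the X lock on O itself last until commit. The SIX locks are instant duration and the LU-node lock is short duration, which keeps the lock overhead of growth and splits low.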

Deletion Protocol
Problem: g does not cover O after deletion $\Rightarrow$ commit duration lock on LU-node
We do:
–logical deletion (IX on target granule, X on object)
–defer physical deletion till the transaction commits
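Logical deletion with deferred physical removal can be sketched as follows. The classes are hypothetical stand-ins under simple assumptions: a leaf is a pair of sets, lock requests are recorded in a list instead of going to a lock manager, and commit is the only point at which the object physically leaves the node.

```python
class Leaf:
    """A leaf node holding objects plus a set of logical deletion marks."""
    def __init__(self, objects):
        self.objects = set(objects)  # physically present objects
        self.deleted = set()         # marked deleted, not yet removed

    def live(self):
        return self.objects - self.deleted

class Transaction:
    def __init__(self):
        self.locks = []    # (resource, mode) requests, recorded only
        self.pending = []  # physical deletions deferred until commit

    def delete(self, leaf, granule, obj):
        # Logical deletion: IX on the target granule, X on the object.
        self.locks.append((granule, "IX"))
        self.locks.append((obj, "X"))
        leaf.deleted.add(obj)
        self.pending.append((leaf, obj))

    def commit(self):
        # Physical deletion happens only now.
        for leaf, obj in self.pending:
            leaf.objects.discard(obj)
            leaf.deleted.discard(obj)
        self.pending.clear()
        self.locks.clear()

leaf = Leaf([(1, 1), (4, 4)])
txn = Transaction()
txn.delete(leaf, "g", (4, 4))
print((4, 4) in leaf.objects, (4, 4) in leaf.live())  # True False
txn.commit()
print((4, 4) in leaf.objects)  # False
```

While the transaction is active the object is still physically in the node, so the node's bounding predicate does not shrink and the granule keeps covering the object.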

Protocol for Other Operations
ReadSingle: S lock on object
UpdateSingle:
–if indexed attributes are not changed, IX on g, X on O
–else, deletion followed by insertion
UpdateScan: same as search for the region, same as UpdateSingle for every object updated

Empirical Evaluation
Data sets:
–2-d spatial data: 62,556 2-d points from the Sequoia 2000 benchmark
–3-d feature data: first 3 Fourier coefficients from 480,471 Fourier vectors

Measurements & Parameters
Performance: throughput (tps)
Concurrency: conflict ratio
Overhead: number of locks, number of predicate checks
Parameters: multiprogramming level (MPL), transaction size, write probability, query size, external think time (fixed at 3 sec), restart delay (fixed at 3 sec)

Implementation

Performance (figure panels: 2-d data, 3-d data)

Performance/Concurrency (figure panels: under various loads, conflict ratio)

Overhead (figure panels: search, insert)

Conclusions
Granular locking (GL) is significantly more efficient than predicate locking (PL)
We expect the performance gap to increase with a better implementation (mainly of the lock manager, LM)
The dimensionality curse is a problem in GL
GL can be integrated with a consistency protocol for a complete solution to concurrency control in multidimensional AMs