A New Method for Concurrency Control in Centralized Database Systems Victor T.S. Shi and William Perrizo Computer Science, North Dakota State University.

Presentation transcript:

A New Method for Concurrency Control in Centralized Database Systems
Victor T.S. Shi and William Perrizo
Computer Science, North Dakota State University, Fargo, ND 58105, USA
(Patents are pending on the ROCC technology described herein)

The Problem
• Base case: a 1X TPS system.
• Scale-up: buying a bigger and faster machine (2X TPS).
• Partitioning: dividing the work between two machines.
• Replication: placing the data at two machines and keeping the data current.
[Figure: server configurations for the cases above, labelled "100 users", "1X TPS server" and "2X TPS server".]
Doubling the number of users increases the total workload by a factor of four, causing an 8-fold increase in deadlocks (2PL) or restarts (OCC).
J. Gray, P. Helland, P. O'Neil, D. Shasha, "The dangers of replication and a solution", ACM SIGMOD, 1996, pp

Our Experimental Platform
[Figure: a switch connecting DB nodes 1, 2, ..., N, which together form a local distributed database system.]
16 PC boxes providing 64 × 16 = 1024 GB of data storage, connected by a gigabit switch (WUGS).

Problems to Solve
• Expected situation: high system throughput and short, predictable transaction response times.
• Two-phase locking (2PL) is the traditional concurrency control method in most commercial products. It can no longer be assumed optimal for a high-performance system, because of its potential thrashing behavior and its problems in distributed environments with uncertain network latency.
[1] R. Agrawal, M. Carey and M. Livny, "Concurrency control performance modeling: alternatives and implications", ACM Transactions on Database Systems, Vol. 12, No. 4, 1987, pp
[4] P. Franaszek, J. Robinson and A. Thomasian, "Concurrency control for high performance environments", ACM Transactions on Database Systems, Vol. 17, No. 2, 1992, pp

Our Solution: Read-commit Order Concurrency Control (ROCC)
• A transaction may send multiple access request messages, each containing one or more access operations.
• The switch "intercepts" the request messages as they flow through it.
• When a new request message arrives, an "element" is generated containing the identifier of the transaction and the data items to be accessed.
• The element is posted to a Read-Commit queue (RC queue) maintained in the switch.
• When a commit request arrives, the switch performs validation based on the RC queue to decide whether or not the transaction can commit.
• Database sites access the requested data items on a first come, first served (FCFS) basis.
• Writes are delayed to avoid cascading aborts.
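The flow above (post an element per request, validate at commit) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' algorithm: the names `Element`, `RCQueue`, `post_read` and `commit` are hypothetical, and the validation rule is a simplified reading of the slides in which a transaction fails when two or more of its own elements conflict with other transactions' elements posted after its first element.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One RC-queue entry (hypothetical layout for this sketch)."""
    tid: int
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(a, b):
    """Two elements conflict if one writes an item the other reads or writes."""
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & a.reads)


class RCQueue:
    def __init__(self):
        self.elements = []  # kept in arrival order

    def post_read(self, tid, items):
        """Post a Read element for an intercepted request message."""
        self.elements.append(Element(tid, reads=frozenset(items)))

    def commit(self, tid, write_items):
        """Post the Commit element (all delayed writes) and validate.

        Assumed rule: fail if two or more of the transaction's elements
        conflict with intervening elements of other transactions.
        """
        self.elements.append(Element(tid, writes=frozenset(write_items)))
        own = [i for i, e in enumerate(self.elements) if e.tid == tid]
        intervening = [e for e in self.elements[own[0]:] if e.tid != tid]
        conflicting = sum(
            1 for i in own
            if any(conflicts(self.elements[i], other) for other in intervening)
        )
        ok = conflicting < 2
        if not ok:
            # Failed validation: drop the transaction's elements (it restarts).
            self.elements = [e for e in self.elements if e.tid != tid]
        return ok
```

For example, a transaction that reads `a` and `b` in two separate requests fails this check if another transaction commits writes to both items in between, while a single conflicting element is tolerated.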

ROCC: Procedures
[Figure: a switch with an SPC connecting clients to DB nodes 1, 2, ..., N of the local distributed database system.]
ROCC consists of a client-side procedure, an Active-Networking-switch-side procedure, and a database-side procedure.

Techniques Used in ROCC
• We use an OCC mechanism, and thus a restart-based approach (no deadlocks).
• We limit the restarts of a transaction by switching to pessimistic "over-declaration" after n restarts (no livelock).
• Bit-vector-oriented hardware validation (bit vector table) relieves the CPU burden.
• Writes are delayed (organized into one Commit element) to reduce the chance of validation failure, since writes have a higher probability of conflicting with other operations.
• If all intervening elements are Read elements, no validation is needed: since writes are delayed, there is no chance of two or more elements conflicting.
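To illustrate why bit vectors make the conflict test cheap, here is a minimal software sketch. It assumes each data item is mapped to one bit position, so a conflict check reduces to a few AND/OR operations. The helper names (`to_bits`, `bits_conflict`, `next_mode`) and the value of `MAX_RESTARTS` are hypothetical; the slides describe a hardware bit vector table, which this does not reproduce.

```python
MAX_RESTARTS = 3  # hypothetical value of "n"; the slides do not give one


def to_bits(items, index):
    """Encode a set of data items as an integer bit vector.

    `index` maps each item name to its bit position (an assumption of
    this sketch).
    """
    bits = 0
    for item in items:
        bits |= 1 << index[item]
    return bits


def bits_conflict(reads_a, writes_a, reads_b, writes_b):
    """Read-write or write-write overlap between two elements."""
    return (writes_a & (reads_b | writes_b)) != 0 or (writes_b & reads_a) != 0


def next_mode(restart_count, max_restarts=MAX_RESTARTS):
    """After n restarts, fall back to pessimistic over-declaration."""
    return "over-declared" if restart_count >= max_restarts else "optimistic"
```

Two Read elements never conflict under this test because both write vectors are zero, which matches the observation that no validation is needed when all intervening elements are Read elements.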

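Read-Commit Queue (RC queue)
The element format: Tid | V | C | R | R/W | Next
• Tid: Transaction ID
• V: Validated
• C: Commit
• R: Restart
• R/W: Read/Write
Element types: Read element, Commit element, Restart element, Validated element.
[Figure: example of an RC queue, drawn as a linked list of elements labelled "T Writes", "T Reads" and "T Reads/Writes", each pointing to the next element and ending in NULL.]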

RC Element Definitions
• Read element: represents a request message that a transaction submits and contains the identifiers of the data items it requests to read. All write requests are delayed until commit, so they appear only in Commit elements.
• Commit element: represents the commit request message of the corresponding transaction. It contains the identifiers of all data items that the transaction requests to write.
• Restart element: generated by the active switch when the validation of a transaction fails. It contains the identifiers of all data items that the failed transaction intends to read and write.
• Validated element: corresponds to a transaction that has been validated, or to a transaction that does not need validation (a static or restarted transaction).
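The element format and the four element types can be pictured as a singly linked list of records. The sketch below is an assumption about layout, not the switch's actual data structure: it treats V, C and R as flag bits, stores the R/W field as two item sets, and uses the hypothetical names `RCElement`, `append` and `kinds`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RCElement:
    tid: int                          # Tid: transaction ID
    validated: bool = False           # V flag
    commit: bool = False              # C flag
    restart: bool = False             # R flag
    reads: frozenset = frozenset()    # R/W field, read part (assumed split)
    writes: frozenset = frozenset()   # R/W field, write part (assumed split)
    next: Optional["RCElement"] = None

    def kind(self):
        """Classify the element by its flags; no flag means a Read element."""
        if self.validated:
            return "validated"
        if self.restart:
            return "restart"
        if self.commit:
            return "commit"
        return "read"


def append(head, element):
    """Append an element at the tail of the queue and return the head."""
    if head is None:
        return element
    node = head
    while node.next is not None:
        node = node.next
    node.next = element
    return head


def kinds(head):
    """List (tid, kind) pairs in queue order."""
    out = []
    node = head
    while node is not None:
        out.append((node.tid, node.kind()))
        node = node.next
    return out
```

A Restart element carries both the read and write sets of the failed transaction, which is why the two sets are kept side by side in one record here.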

Functional Specification of SPC
[Figure: Smart Port Card block diagram showing a Receive Queue (from clients), a Dispatcher, units VP0, VP1, ..., VPn, the RC Queue (RC0, RC1, ..., RCn), and a Transmit Queue (to database servers).]

Smart Port Card: Active Processing Unit
[Figure: Smart Port Card hardware with port 1 and port 2, an experimental FPGA, main memory, cache, CPU (Intel Embedded Module), system FPGA, APIC, and PCI bus.]

The Features of ROCC
• High system throughput (due to its optimistic nature).
• Short, predictable and controllable transaction response time (a desired feature in real-time systems).
• Reduced restarts (a transaction restarts only when two or more elements conflict with intervening elements).
• Reduced validation complexity (no validation is needed for static and pessimistically restarted transactions; only intervening conflicts are checked).
• Fast hardware-level validation (bit-vector-oriented hardware design).
• Deadlock and livelock free.

The Features of ROCC
• We developed ROCC because of the dramatic increase of deadlocks in 2PL and of repeated restarts in OCC (J. Gray, P. Helland, P. O'Neil, D. Shasha, "The dangers of replication and a solution", Proceedings of the ACM SIGMOD conference, 1996, pp ).
• ROCC takes advantage of the physical star or tree topology of our LAN and of active processing concepts to improve performance, although a star or tree topology is not required.
• The SPC cards and the WUGS switch provide a good environment for testing ROCC at a combination of software and hardware levels.
• ROCC Homepage: contains simulation results, which can be verified with the software on that page (the simulator can be downloaded).

The Performance of ROCC
We compared ROCC with 2PL, OCC and WDL (Wait Depth Limit). The simulation results are shown in the next slides. The parameter settings are as follows.
• A client may send multiple access requests to the data server.
• The average intra-transaction think time is 1 second.
• The database size is 1000 pages.
• The transaction size varies: 4-6 pages for low data contention environments and 10-16 pages for high data contention environments.
• Disk I/O takes 35 ms and CPU processing time per page is 10 ms.
• Transaction throughput is defined as the number of transactions completed per second.
• The restart ratio is defined as the average number of times a transaction restarts per commit.
• 2PL was run just long enough to determine the thrashing point (very long simulation times are required).
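The two metrics defined above, and the per-page cost implied by the parameter settings, can be written out as a short sketch. The function names are hypothetical, and `page_service_time_ms` assumes that every page access incurs exactly one disk I/O (no buffer hits) and ignores think time and queueing, so it is only a rough no-contention estimate, not a value taken from the simulation.

```python
DISK_IO_MS = 35   # disk I/O time per page, from the parameter settings
CPU_MS = 10       # CPU processing time per page, from the parameter settings


def throughput(completed_transactions, elapsed_seconds):
    """Transactions completed per second."""
    return completed_transactions / elapsed_seconds


def restart_ratio(total_restarts, total_commits):
    """Average number of restarts per committed transaction."""
    return total_restarts / total_commits


def page_service_time_ms(pages):
    """Assumed no-contention cost: one disk I/O plus CPU time per page."""
    return pages * (DISK_IO_MS + CPU_MS)
```

Under that assumption, a 4-page transaction needs 180 ms of service and a 16-page transaction needs 720 ms, which gives a sense of why larger transactions hold data longer and raise contention.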

ROCC Throughput (transaction size=4-6 pages accessed – low contention)

ROCC Throughput (transaction size=10-16 pages accessed – high contention)

ROCC: Restarts (transaction size=4-6 pages accessed)

ROCC: Restarts (transaction size=10-16 pages accessed)