Transactions, Concluded, and the Future of Data Management Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December.


Transactions, Concluded, and the Future of Data Management Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 4, 2003 Slide content courtesy of Susan Davidson, Raghu Ramakrishnan & Johannes Gehrke

2 Final Administrivia
• Project demos today and tomorrow
• Final exam handed out at the end of today’s class
• Finals plus project reports due by 1PM, 12/18/2003
• Project reports should be ballpark pages
• Remember, quality and clarity of presentation matter!
• Also, email me a brief message detailing:
  ◦ Your contributions to the project
  ◦ Your group members’ contributions and your assessment of “group dynamics”
• Turn in at my office, 576 Levine Hall, or to my assistant, Kathy Venit, in 308 Levine Hall

3 Last Time…
• We were discussing isolation levels
  ◦ How to keep transactions from interfering with one another
  ◦ Or at least, how to minimize this
• Recall the strongest version of isolation was serializability

4 Theory of Serializability
• A schedule of a set of transactions is a linear ordering of their actions
  ◦ e.g., for the simultaneous deposits example: R1(X.bal) R2(X.bal) W1(X.bal) W2(X.bal)
• A serial schedule is one in which all the steps of each transaction occur consecutively
• A serializable schedule is one that is equivalent to some serial schedule (i.e., given any initial state, the final state is the same as one produced by some serial schedule)
• The example above is neither serial nor serializable
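To see why the simultaneous-deposits schedule is not serializable, it helps to run it. The sketch below is only an illustration: the deposit amounts, the starting balance, and the `run` helper are assumptions, not part of the lecture. Each transaction reads the balance into a private variable and writes back that value plus its own deposit.

```python
def run(schedule, initial_balance, deposits):
    """Execute a schedule of (op, txn) steps against a single balance X.bal."""
    balance = initial_balance
    local = {}  # the value each transaction last read
    for op, txn in schedule:
        if op == "R":
            local[txn] = balance
        elif op == "W":
            # The write is based on what this transaction read earlier.
            balance = local[txn] + deposits[txn]
    return balance


# Hypothetical deposit amounts for transactions 1 and 2.
deposits = {1: 100, 2: 50}

interleaved = [("R", 1), ("R", 2), ("W", 1), ("W", 2)]  # schedule from the slide
serial_12 = [("R", 1), ("W", 1), ("R", 2), ("W", 2)]
serial_21 = [("R", 2), ("W", 2), ("R", 1), ("W", 1)]

final_interleaved = run(interleaved, 0, deposits)
final_serial = {run(serial_12, 0, deposits), run(serial_21, 0, deposits)}
```

Under these assumptions both serial orders end with the same balance, while the interleaved schedule loses the first deposit, so its final state matches no serial schedule.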

5 Questions of Concern
• Given a schedule S, is it serializable?
• How can we "restrict" transactions in progress to guarantee that only serializable schedules are produced?

6 Conflicting Actions
• Consider a schedule S in which there are two consecutive actions Ii and Ij of transactions Ti and Tj respectively
• If Ii and Ij refer to different data items, then swapping Ii and Ij does not matter
• If Ii and Ij refer to the same data item Q, then swapping Ii and Ij matters if and only if at least one of the actions is a write
  ◦ In Ri(Q) Wj(Q), Ti reads a different value of Q than in Wj(Q) Ri(Q)
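The conflict rule can be written as a small predicate. This is a minimal sketch; the `Action` tuple and the `conflicts` name are illustrative choices rather than anything defined in the lecture.

```python
from typing import NamedTuple


class Action(NamedTuple):
    kind: str  # "R" for read, "W" for write
    txn: int   # transaction number
    item: str  # data item accessed


def conflicts(a: Action, b: Action) -> bool:
    """Two actions conflict when they come from different transactions,
    touch the same data item, and at least one of them is a write."""
    return a.txn != b.txn and a.item == b.item and "W" in (a.kind, b.kind)


read_read = conflicts(Action("R", 1, "Q"), Action("R", 2, "Q"))
read_write = conflicts(Action("R", 1, "Q"), Action("W", 2, "Q"))
different_items = conflicts(Action("W", 1, "Q"), Action("W", 2, "P"))
```

Only the read/write pair on the same item is reported as a conflict; two reads, or two writes to different items, can be swapped freely.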

7 Testing for Serializability
• Given a schedule S, we can construct a digraph G = (V, E) called a precedence graph
  ◦ V: all transactions in S
  ◦ E: Ti → Tj whenever an action of Ti precedes and conflicts with an action of Tj in S
• Theorem: A schedule S is conflict serializable if and only if its precedence graph contains no cycles
• Note that testing for a cycle in a digraph can be done in time O(|V|^2)
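The test can be sketched in a few lines. The representation of a schedule as `(kind, txn, item)` tuples and the function names are assumptions made for illustration. Cycle detection here uses a depth-first search, which runs in O(|V| + |E|) and therefore stays within the O(|V|^2) bound.

```python
def precedence_graph(schedule):
    """schedule: list of (kind, txn, item). Returns {txn: set of successors}."""
    graph = {txn: set() for _, txn, _ in schedule}
    for i, (kind_i, txn_i, item_i) in enumerate(schedule):
        for kind_j, txn_j, item_j in schedule[i + 1:]:
            # Edge Ti -> Tj when an earlier action of Ti conflicts with a later one of Tj.
            if txn_i != txn_j and item_i == item_j and "W" in (kind_i, kind_j):
                graph[txn_i].add(txn_j)
    return graph


def has_cycle(graph):
    """Depth-first search with three colours; a grey successor means a back edge."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


# The simultaneous-deposits schedule: R1(X) R2(X) W1(X) W2(X)
deposits_schedule = [("R", 1, "X"), ("R", 2, "X"), ("W", 1, "X"), ("W", 2, "X")]
serial_schedule = [("R", 1, "X"), ("W", 1, "X"), ("R", 2, "X"), ("W", 2, "X")]
```

For the deposits schedule the graph contains both T1 → T2 and T2 → T1, so the check fails; for the serial schedule only T1 → T2 appears.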

8 An Example T1 T2 T3 R(X,Y,Z) R(X) W(X) R(Y) W(Y) R(Y) R(X) W(Z) T1 T2 T3 Cyclic: Not serializable.

9 Another Example
Schedule (in time order): R2(X) W2(X) R3(X) W3(X) R1(Y) W1(Y) R2(Y) W2(Y)
Precedence graph: T1 → T2 → T3
Acyclic: serializable

10 Producing the Equivalent Serial Schedule
• If the precedence graph for a schedule is acyclic, then an equivalent serial schedule can be found by a topological sort of the graph
• For the second example, the equivalent serial schedule is:
  ◦ R1(Y)W1(Y) R2(X)W2(X) R2(Y)W2(Y) R3(X)W3(X)
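A topological sort is available in Python's standard library (`graphlib`, Python 3.9+), so the step can be sketched as follows. The edge list T1 → T2, T2 → T3 is an assumption inferred from the serial schedule given for the second example, and `serial_order` is a hypothetical helper name.

```python
from graphlib import CycleError, TopologicalSorter


def serial_order(edges, transactions):
    """Return one serial order consistent with the precedence edges,
    or None if the precedence graph has a cycle."""
    # TopologicalSorter expects a mapping from each node to its predecessors.
    predecessors = {t: set() for t in transactions}
    for before, after in edges:
        predecessors[after].add(before)
    try:
        return list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return None


# Assumed precedence edges for the second example.
order = serial_order([("T1", "T2"), ("T2", "T3")], ["T1", "T2", "T3"])
cyclic = serial_order([("T1", "T2"), ("T2", "T1")], ["T1", "T2"])
```

With those edges the only valid order is T1, T2, T3, which matches the serial schedule on the slide. In general a precedence graph may admit several topological orders, and each one yields an equivalent serial schedule.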

11 Locking and Serializability
• We said that for a serializable schedule, a transaction must hold all locks until it terminates (a condition called strict locking)
• It turns out that this is crucial to guarantee serializability
• Note that the first (bad) example could have been produced if transactions acquired and immediately released locks

12 Well-Formed, Two-Phased Transactions
• A transaction is well-formed if it acquires at least a shared lock on Q before reading Q or an exclusive lock on Q before writing Q, and doesn’t release the lock until the action is performed
  ◦ Locks are also released by the end of the transaction
• A transaction is two-phased if it never acquires a lock after unlocking one
  ◦ i.e., there are two phases: a growing phase in which the transaction acquires locks, and a shrinking phase in which locks are released
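Both definitions can be checked mechanically on a single transaction's sequence of operations. The sketch below assumes a hypothetical encoding in which each step is an `(op, item)` pair with `op` one of `"slock"`, `"xlock"`, `"unlock"`, `"read"`, or `"write"`; none of these names come from the lecture.

```python
def is_well_formed(ops):
    """Every read holds at least a shared lock, every write holds an
    exclusive lock, and no lock is still held when the transaction ends."""
    held = {}  # item -> "S" or "X"
    for op, item in ops:
        if op == "slock":
            held.setdefault(item, "S")  # keep an existing exclusive lock
        elif op == "xlock":
            held[item] = "X"
        elif op == "unlock":
            held.pop(item, None)
        elif op == "read" and item not in held:
            return False
        elif op == "write" and held.get(item) != "X":
            return False
    return not held


def is_two_phase(ops):
    """No lock is acquired after the first unlock."""
    shrinking = False
    for op, _ in ops:
        if op == "unlock":
            shrinking = True
        elif op in ("slock", "xlock") and shrinking:
            return False
    return True


two_phase_txn = [("slock", "A"), ("read", "A"), ("xlock", "B"),
                 ("write", "B"), ("unlock", "A"), ("unlock", "B")]
early_release_txn = [("xlock", "A"), ("write", "A"), ("unlock", "A"),
                     ("xlock", "B"), ("write", "B"), ("unlock", "B")]
```

The second transaction is well-formed but not two-phased: it releases its lock on A before acquiring the lock on B, which is the "acquire and immediately release" pattern.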

13 Two-Phased Locking Theorem
• If all transactions are well-formed and two-phase, then any schedule in which conflicting locks are never granted is serializable
  ◦ i.e., there is a very simple scheduler!
• However, if some transaction is not well-formed or two-phase, then there is some schedule in which conflicting locks are never granted but which fails to be serializable
  ◦ i.e., one bad apple spoils the bunch.
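The "very simple scheduler" only has to refuse conflicting lock requests. One possible shape for it is a lock table like the sketch below. The `LockTable` class and its method names are hypothetical, and the sketch simply reports a refusal instead of blocking, so waiting and deadlock handling are out of scope.

```python
class LockTable:
    """Grants shared ("S") and exclusive ("X") locks, never conflicting ones."""

    def __init__(self):
        self.holders = {}  # item -> {txn: mode}

    def try_lock(self, txn, item, mode):
        current = self.holders.setdefault(item, {})
        others = [m for t, m in current.items() if t != txn]
        if mode == "X" and others:
            return False  # exclusive conflicts with any lock held by another txn
        if mode == "S" and "X" in others:
            return False  # shared conflicts with another txn's exclusive lock
        if current.get(txn) != "X":
            current[txn] = mode  # grant, or upgrade S to X
        return True

    def release_all(self, txn):
        """Release every lock held by txn, as at commit or abort."""
        for current in self.holders.values():
            current.pop(txn, None)


table = LockTable()
grants = [
    table.try_lock("T1", "Q", "S"),  # granted
    table.try_lock("T2", "Q", "S"),  # granted: shared locks are compatible
    table.try_lock("T2", "Q", "X"),  # refused: T1 still holds a shared lock
]
table.release_all("T1")
grants.append(table.try_lock("T2", "Q", "X"))  # granted: upgrade after T1 ends
grants.append(table.try_lock("T1", "Q", "S"))  # refused: T2 holds exclusive
```

Releasing everything in `release_all` at the end of a transaction corresponds to strict locking; the table itself does not force transactions to be two-phase, which is why the theorem needs that assumption about every transaction.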

14 Summary of Transactions
• Transactions are all-or-nothing units of work guaranteed despite concurrency or failures in the system
• Theoretically, the “correct” execution of transactions is serializable (i.e., equivalent to some serial execution)
• Practically, this may adversely affect throughput → isolation levels
• With isolation levels, users can specify the level of “incorrectness” they are willing to tolerate

15 What to Look for Down the Road
• … well, no one really knows the answer to this…
• … But here are some hints, ideas, and hot directions
  ◦ Sensors and streaming data
  ◦ Peer-to-peer meets databases
  ◦ “The Semantic Web”
  ◦ Collaborative data sharing

16 Sensors and Streaming Data
• No databases at all…
• … Instead we have networks of simple sensors
  ◦ Madden, starting at MIT
  ◦ Gehrke, Cornell
  ◦ Widom, Stanford
• Queries are in SQL
• Data is live and “streaming”
• We compute aggregates over “windows”
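A window aggregate over a live stream can be sketched with a bounded queue. This is an illustrative count-based sliding window, not the syntax or semantics of any of the systems named above; `windowed_average` and the sample readings are assumptions.

```python
from collections import deque


def windowed_average(readings, window_size):
    """Yield the average of the most recent `window_size` readings
    each time a new reading arrives."""
    window = deque(maxlen=window_size)  # old readings fall off automatically
    for value in readings:
        window.append(value)
        yield sum(window) / len(window)


# Hypothetical sensor readings arriving one at a time.
averages = list(windowed_average([10, 20, 30, 40], window_size=2))
```

Because the function is a generator, it never needs the whole stream in memory; it keeps only the current window, which is the point of querying "current readings" instead of data on disk.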

17 What’s Interesting Here
• We’re not talking about data on disk – we’re talking about queries over “current readings”
• Sensors are generally “stupid” and may be battery-operated
• A lot of challenges are networking-related: how to aggregate data before it gets sent, etc.
• The next step (e.g., work initiated at Penn): including sensors that capture images – a very different problem!
• This has many more compelling applications – security, monitoring, correlating multiple sensors, rescue operations, military logistics and coordination, etc.
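One way to aggregate data before it gets sent is for each node to forward a small partial state instead of its raw readings. The sketch below assumes an average is wanted and uses a (sum, count) pair as the partial state; the function names and the two-leaf topology are hypothetical.

```python
def partial(readings):
    """Summarise a node's local readings as a (sum, count) pair."""
    return (sum(readings), len(readings))


def merge(a, b):
    """Combine two partial states; a parent node does this for its children."""
    return (a[0] + b[0], a[1] + b[1])


def finalize(state):
    """Turn the merged state into the final average at the root."""
    total, count = state
    return total / count


# Two hypothetical leaf sensors reporting to one parent.
leaf_a = partial([20.0, 22.0])
leaf_b = partial([30.0])
root_state = merge(leaf_a, leaf_b)
average = finalize(root_state)
```

Each node transmits two numbers however many readings it holds, which matters when sensors are battery-operated and radio transmission is the expensive operation.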

18 Peer-to-Peer Computing
• Fundamentally, our model of DBMSs tends to be centralized
  ◦ Even for data integration: there’s a single mediator
  ◦ This has many implications: central administration, central coordination, etc.
• What can be gained from borrowing a page from peer-to-peer systems like Napster, Kazaa, etc.?
  ◦ A better architecture?
  ◦ Solutions to many problems unsolved by distributed DBMSs? Replication, object location, distributed optimization, resiliency to failure, …
  ◦ New types of applications, e.g., in integration?

19 P2P Work
• As a new architecture for storage and querying
  ◦ PIER (Berkeley), P-Grid (EPFL), Medusa (MIT)
• A better way of thinking about translating and exchanging data
  ◦ Piazza (Washington), Orchestra (Penn), Hyperion (Toronto), work at Trento

20 The Semantic Web
• In some ways, a very “pie-in-the-sky” vision
  ◦ But some real and concrete problems might be partly solvable
• Goal is really very similar to data integration, where somehow we have mappings between the schemas
• Currently, most people in the SW community are from the knowledge representation community and use RDF
• Focus: very rich ways of describing schemas – “ontologies” – that blend querying with class definitions
  ◦ “Teachers are people who teach students”; “Tenure-track professors are teachers at universities who can get tenure”; etc.
• Implicit take on the problem: if we create better languages for describing ontologies, it’s easier to mediate between schemas

21 Holes in the Semantic Web
• What issues and concerns came up in the data integration assignment you had?
  ◦ Do you think a richer schema language would help for these?
  ◦ Do you think “better normalization” would help?
• Fundamentally, we need:
  ◦ Languages for not only describing relationships, but transformations between formats (e.g., XML schemas)
  ◦ Automatic or partly automated ways of discovering mappings and correspondences
• These are all database problems, and the solution likely must come from the DB community
• This is part of what P2P systems like Piazza and Hyperion try to address

22 My Take on the Future
• We’ve evolved from a world where data management is about controlling the data
• Instead, data management is about translating and transforming data using declarative languages
  ◦ It should ultimately become much like TCP or SOAP – a set of standard services for “getting stuff” from one point to another, or from one form to another
  ◦ It’s the plumbing that connects different applications using different formats
• Orchestra project at Penn: focuses on how to build a system for supporting collaborative science
  ◦ People publish and map data in different schemas
  ◦ What happens if people start updating it?
  ◦ How do you propagate, manage, trace, reconcile changes?