Advanced Topics: Indexes & Transactions


Instructor: Mohamed Eltabakh
meltabakh@cs.wpi.edu
cs3431

Indexes

Why Indexes
With or without an index, a query must return the same answer. Indexes are needed for efficiency: they give fast access to the data.
Example, assuming the Student table holds 10,000 students:
    SELECT * FROM Student WHERE sNumber = 76544357;
Without an index, the DBMS checks all 10,000 students. With an index on sNumber, it can reach that student directly.

Direct Access vs. Sequential Access
    SELECT * FROM Student WHERE sNumber = 76544357;
Sequential access: without an index, the DBMS checks all 10,000 students.
Direct access: with an index, the DBMS reaches that student directly.
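The difference between the two access paths can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 10,000 student numbers, the shuffled storage order, and the helper names `sequential_lookup` and `index_lookup` are made up for the example, and a real DBMS index is a disk-based structure rather than an in-memory list.

```python
import random

def sequential_lookup(records, key):
    """Scan records in storage order; return (record, number of records checked)."""
    for checked, record in enumerate(records, start=1):
        if record[0] == key:
            return record, checked
    return None, len(records)

def index_lookup(index, key):
    """Binary search a sorted list of (key, position); return (position, probes)."""
    lo, hi, probes = 0, len(index) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if index[mid][0] == key:
            return index[mid][1], probes
        if index[mid][0] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, probes

# Hypothetical data: 10,000 students stored in no particular order.
students = [(n, f"student{n}") for n in range(76540000, 76550000)]
random.Random(0).shuffle(students)

# The index is a separate, sorted structure pointing back into the table.
index = sorted((rec[0], pos) for pos, rec in enumerate(students))

seq_record, checked = sequential_lookup(students, 76544357)
pos, probes = index_lookup(index, 76544357)
idx_record = students[pos]

print("sequential access checked", checked, "records")
print("index lookup needed", probes, "probes")
```

Both paths return the same record; only the amount of work differs. For 10,000 keys a binary search never needs more than 14 probes.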

What is an Index
An index is an auxiliary file that makes it more efficient to search for a record in the data file.
The index is usually specified on one field of the file, although it can be specified on several fields.
The index is stored separately from the base table.
Each table may have multiple indexes. For example, Student can have one index on sNumber and a second index on sName.
[Figure: the Student table with columns sNumber, sName, address, pNum and sample rows for Dave, Greg, and Matt.]

Example: Index on sNumber
[Figure: the Student table (sNumber, sName, address, pNum) with rows stored in the order sNumber = 1, 2, 100, 10, 4, 3, next to an index on sNumber holding the sorted keys 1, 2, 3, 4, 10, 100, each pointing to its record.]
The index file is always sorted.
The index is much smaller than the table.
Any query on sNumber, equality or range, can now be answered efficiently by a binary search on the index.
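As a rough sketch of how a sorted index answers both kinds of query, the following uses Python's `bisect` module on the key values from the slide. The in-memory layout, the function names `equality_query` and `range_query`, and the "..." placeholders for names the slide elides are assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

# Table rows in storage order (unsorted on sNumber), as on the slide.
table = [(1, "Dave"), (2, "Greg"), (100, "Matt"), (10, "..."), (4, "John"), (3, "...")]

# The index: sorted (sNumber, position-in-table) pairs, kept apart from the table.
index = sorted((row[0], pos) for pos, row in enumerate(table))
keys = [k for k, _ in index]

def equality_query(v):
    """Return the row with sNumber == v, or None."""
    i = bisect_left(keys, v)
    if i < len(keys) and keys[i] == v:
        return table[index[i][1]]
    return None

def range_query(low, high):
    """Return all rows with low <= sNumber <= high, in key order."""
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return [table[pos] for _, pos in index[start:end]]

print(keys)
print(equality_query(100))
print(range_query(3, 10))
```

A range query costs two binary searches plus one step per matching entry, because the matching keys sit next to each other in the sorted index.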

Example: Index on sName
[Figure: the same Student table next to an index on sName holding the sorted names Dave, Greg, John, Matt, each pointing to its record.]
Duplicate values have duplicate entries in the index.
Any query on sName, equality or range, can now be answered efficiently by a binary search on the index.

Creating an Index
    Create Index <name> On <tablename>(<colNames>);
Examples on the Student table (sNumber, sName, address, pNum):
    Create Index sNumberIndex On Student(sNumber);
    Create Index sNameIndex On Student(sName);
The DB system knows how to (1) create the index and (2) decide when and how to use it.
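To try the two statements without a database server, one option is Python's built-in `sqlite3` module. This is a sketch assuming SQLite as the DBMS and assuming column types for Student, which the slides do not give; other systems accept the same `CREATE INDEX` form but expose their catalog differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column types are assumptions; the slides only name the columns.
conn.execute(
    "CREATE TABLE Student (sNumber INTEGER, sName TEXT, address TEXT, pNum INTEGER)"
)

# The two statements from the slide.
conn.execute("CREATE INDEX sNumberIndex ON Student(sNumber)")
conn.execute("CREATE INDEX sNameIndex ON Student(sName)")

# SQLite records every index in its catalog table sqlite_master.
index_names = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = 'Student'"
    )
)
print(index_names)
```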

Multiple Predicates
    Create Index addressIndex On Student(address);
    SELECT * FROM Student WHERE address = '320FL' AND sName = 'Dave';
With only addressIndex available, the best the DBMS can do is:
1- Use addressIndex to find the tuples with address = '320FL'.
2- From those tuples, check sName = 'Dave'.
[Figure: the Student table (sNumber, sName, address, pNum) with addresses such as 320FL, 50WA, and 200LA.]

Multi-Column Indexes
If columns X and Y are frequently queried together (with AND) and each column has many duplicates, consider creating a multi-column index on (X, Y).
    Create Index nameAdd On Student(sName, address);
    SELECT * FROM Student WHERE address = '320FL' AND sName = 'Dave';
With the multi-column index, the lookup directly returns only the matching record.
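A quick way to see a multi-column index being used is SQLite's `EXPLAIN QUERY PLAN`. The sketch below assumes SQLite, invents the column types and sample rows (the slide's table is only partly legible), and makes no claim about how other DBMSs report their plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (sNumber INTEGER, sName TEXT, address TEXT, pNum INTEGER)"
)

# Hypothetical rows with duplicates in both sName and address.
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [
        (1, "Dave", "320FL", 1),
        (2, "Greg", "320FL", 1),
        (100, "Matt", "320FL", 2),
        (10, "Dave", "50WA", 2),
        (4, "John", "50WA", 1),
        (3, "Dave", "200LA", 3),
    ],
)

conn.execute("CREATE INDEX nameAdd ON Student(sName, address)")

query = "SELECT * FROM Student WHERE address = '320FL' AND sName = 'Dave'"

# The last column of each EXPLAIN QUERY PLAN row is a human-readable description.
plan = " ".join(str(row[-1]) for row in conn.execute("EXPLAIN QUERY PLAN " + query))
rows = conn.execute(query).fetchall()

print(plan)
print(rows)
```

The order of the predicates in the WHERE clause does not have to match the column order of the index; the planner pairs each equality with its column.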

Using an Index
The DBMS automatically figures out which index to use based on the query.
    Create Index sNumberIndex On Student(sNumber);
    Create Index sNameIndex On Student(sName);
    SELECT * FROM Student WHERE sNumber = 76544357;
This query automatically uses sNumberIndex.
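The automatic choice can be observed by comparing the query plan before and after the indexes exist. This is a sketch assuming SQLite and its `EXPLAIN QUERY PLAN` statement; the helper `plan_for` and the column types are made up for the example, and the exact plan wording differs between SQLite versions.

```python
import sqlite3

def plan_for(conn, sql):
    """Return SQLite's plan description for a query as one string."""
    return " ".join(str(row[-1]) for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (sNumber INTEGER, sName TEXT, address TEXT, pNum INTEGER)"
)

query = "SELECT * FROM Student WHERE sNumber = 76544357"

# No index yet: the only option is to scan the table.
plan_before = plan_for(conn, query)

conn.execute("CREATE INDEX sNumberIndex ON Student(sNumber)")
conn.execute("CREATE INDEX sNameIndex ON Student(sName)")

# Same query text, but the planner now picks the index that matches the predicate.
plan_after = plan_for(conn, query)

print("before:", plan_before)
print("after: ", plan_after)
```

The query itself never names an index; the choice is made by the DBMS.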

How Do Indexes Work?

Types of Indexes
Primary vs. Secondary
Single-Level vs. Multi-Level (Tree Structure)
Clustered vs. Non-Clustered

Primary vs. Secondary Indexes
An index on the primary key of a relation is called a primary index (there is only one). All values in a primary index are unique.
An index on any other column is called a secondary index (there can be many). Values in a secondary index may have duplicates.
Example on Student(SSN, sNumber, sName, address, pNum): the index on SSN is a primary index; the index on sNumber and the index on sName are secondary indexes.

Single-Level Indexes
The index is a one-level sorted list.
Given a value v to query:
1- Perform a binary search in the index to find v (fast).
2- Follow the link to reach the actual record.
[Figure: the Student table next to a one-level index on sNumber holding the sorted keys 1, 2, 3, 4, 10, 100.]

Multi-Level Index
Build an index on top of the index (this can go multiple levels).
When searching for a value v: find the largest entry ≤ v, and follow its pointer.
[Figure: the Student table with two index levels on sNumber (labeled 1st level and 2nd level): one level holds all the sorted keys 1, 2, 3, 4, 10, 100, and the level built on top of it holds the entries 1 and 4.]
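The "largest entry ≤ v" rule can be sketched with two small sorted lists. The grouping of the keys into blocks of three is an assumption chosen so that the upper level comes out as 1 and 4, matching the slide; the function names are hypothetical.

```python
from bisect import bisect_right

def largest_entry_at_most(keys, v):
    """Return the position of the largest key <= v, or None if every key is larger."""
    i = bisect_right(keys, v) - 1
    return i if i >= 0 else None

# Lower level: all keys, split into blocks (block size 3 is an assumption).
lower_level = [[1, 2, 3], [4, 10, 100]]

# Upper level: the first key of each block, pointing to that block.
upper_level = [block[0] for block in lower_level]

def search(v):
    """Return (block number, slot in block) for key v, or None if v is absent."""
    b = largest_entry_at_most(upper_level, v)   # follow the upper-level pointer
    if b is None:
        return None
    block = lower_level[b]
    i = largest_entry_at_most(block, v)         # same rule one level down
    if i is not None and block[i] == v:
        return (b, i)
    return None

print(upper_level)
print(search(10), search(3), search(5))
```

Each extra level shrinks the portion of the level below that has to be searched, which is the idea behind tree-structured indexes.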

Clustered vs. Non-Clustered
Assume there is an index X on column C.
If the records in the table are stored sorted on C, X is a clustered index. Otherwise, X is a non-clustered index.
A primary index is a clustered index.
[Figure: Student records stored in SSN order (11111, 22222, 33333, 44444, 55555, 66666) with sNumber values 1, 2, 100, 10, 4, 3. The index on SSN is a clustered index; the index on sNumber (1, 2, 3, 4, 10, 100) is a non-clustered index.]
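The definition reduces to one check: is the table's storage order already sorted on the indexed column? A minimal sketch using the SSN and sNumber values from the slide (the dictionary layout and the name `is_clustered` are assumptions for illustration):

```python
# Records in storage order, with the SSN and sNumber values shown on the slide.
records = [
    {"ssn": 11111, "sNumber": 1},
    {"ssn": 22222, "sNumber": 2},
    {"ssn": 33333, "sNumber": 100},
    {"ssn": 44444, "sNumber": 10},
    {"ssn": 55555, "sNumber": 4},
    {"ssn": 66666, "sNumber": 3},
]

def is_clustered(records, column):
    """An index on `column` is clustered if the records are stored sorted on it."""
    values = [r[column] for r in records]
    return all(a <= b for a, b in zip(values, values[1:]))

print("index on ssn clustered?    ", is_clustered(records, "ssn"))
print("index on sNumber clustered?", is_clustered(records, "sNumber"))
```

Since records can be stored in only one physical order, a table can have at most one clustered index unless two columns happen to sort identically.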

Index Maintenance
Indexes are used in queries, but they must be maintained when the data change (insert, update, delete). The DBMS handles index maintenance automatically:
Insert new records: the indexed field values are added to the index.
Delete records: their values are deleted from the index.
Update an indexed value: the old value is deleted from the index and the new value is inserted.
Maintaining an index has a cost, but its benefit is usually greater if the index is used a lot.
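The three maintenance rules can be sketched as a small class that keeps a sorted list of (value, record id) entries. This is an illustrative model only: the class name `SortedIndex`, the record ids, and the replacement name "Adam" are made up, and a real DBMS maintains its indexes internally without user code.

```python
from bisect import bisect_left, insort

class SortedIndex:
    """A toy index: a sorted list of (indexed value, record id) entries."""

    def __init__(self):
        self.entries = []

    def insert(self, value, record_id):
        # Insert: add the indexed field value, keeping the list sorted.
        insort(self.entries, (value, record_id))

    def delete(self, value, record_id):
        # Delete: remove the entry for that record if it is present.
        i = bisect_left(self.entries, (value, record_id))
        if i < len(self.entries) and self.entries[i] == (value, record_id):
            del self.entries[i]

    def update(self, old_value, new_value, record_id):
        # Update: delete the old value, then insert the new one.
        self.delete(old_value, record_id)
        self.insert(new_value, record_id)

name_index = SortedIndex()
for record_id, name in enumerate(["Dave", "Greg", "Matt", "John"]):
    name_index.insert(name, record_id)

name_index.update("Greg", "Adam", 1)   # record 1 is renamed
name_index.delete("Matt", 2)           # record 2 is removed

print(name_index.entries)
```

Every write to the table now costs an extra index operation, which is the maintenance cost the slide refers to.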

Summary of Indexes
Indexes are auxiliary structures for efficient searching and querying.
The query answer is the same with or without an index.
What to index depends on which columns are frequently queried (in the Where clause).
Main operations:
    Create Index <name> On <tablename>(<colNames>);
    Drop Index <name>;

Transactions

What is a Transaction
A transaction is a set of operations on a database that are treated as one unit: execute all or none.
Transactions have semantics at the application level, for example:
Reserve two seats on a flight.
Transfer money from account A to account B.
What if two users reserve the same flight seat at the same time? Transactions solve these problems.

Transactions
By default, each SQL statement is a transaction. The default behavior can be changed:
    SQL> Start transaction;
    SQL> Insert …
    SQL> Update …
    SQL> Delete …
    SQL> Select …
    SQL> Commit | Rollback;
All of these statements are now one unit (either all succeed or all fail).
Commit ends the transaction successfully. Rollback cancels the transaction.
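The all-or-none behavior can be tried with the account-transfer example and Python's `sqlite3` module. This is a sketch under assumptions: SQLite as the DBMS (it spells the opening statement `BEGIN`), and a hypothetical Account table whose CHECK constraint forbids negative balances so that an over-large transfer fails halfway through.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so each statement
# is its own transaction unless we open one explicitly with BEGIN.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE Account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO Account VALUES ('A', 100), ('B', 50)")

def transfer(conn, src, dst, amount):
    """Move money between accounts as one unit; return True if committed."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE Account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE Account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.execute("COMMIT")      # end the transaction successfully
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")    # cancel: the credit to dst is undone as well
        return False

ok = transfer(conn, "A", "B", 30)        # succeeds
failed = transfer(conn, "A", "B", 500)   # debit would make A negative

balances = dict(conn.execute("SELECT name, balance FROM Account"))
print(ok, failed, balances)
```

In the failed transfer the credit to B had already been applied when the debit was rejected; the rollback removes it, so no money is created.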

Transaction Properties
Four main properties:
Atomicity: a transaction is one atomic unit.
Consistency: a transaction ensures the DB is consistent.
Isolation: a transaction executes as if no other transaction were executing simultaneously.
Durability: changes made by a transaction must persist.
ACID: Atomicity, Consistency, Isolation, Durability.
ACID properties are enforced by the DBMS.

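Consistency Issue
Many users may update the data at the same time. How do we ensure the result is consistent?
Table T initially holds x = 2, 3, 4, 10, 100. Two updates run at the same time:
    Update T Set x = x * 3;
    Update T Set x = x + 2;
What is the right answer?
A result of x = 12, 15, 14, 32, 302 is wrong, inconsistent data: the first two rows were incremented before being tripled, while the last three were tripled before being incremented.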

Serial Order of Transactions
Given N concurrent transactions T1, T2, …, TN, a serial order is any permutation of these transactions (there are N! of them), for example:
T1, T2, T3, …, TN
T2, T3, T1, …, TN
…
The DBMS ensures that the end result of executing the N transactions concurrently matches the result of one of the serial orders. This is called serializability: it is as if the transactions were executed in serial order.

Serializable Execution
Given N concurrent transactions T1, T2, …, TN, the DBMS executes them concurrently (at the same time), but the final effect matches one of the serial-order executions.
Starting from x = 2, 3, 4, 10, 100, the two updates
    Update T Set x = x * 3;
    Update T Set x = x + 2;
have two acceptable outcomes:
x = 12, 15, 18, 36, 306 (add 2, then multiply by 3)
x = 8, 11, 14, 32, 302 (multiply by 3, then add 2)
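The serializability test on this example can be checked mechanically: enumerate every serial order, compute its result, and see whether an observed final state matches one of them. The sketch below models each update as a Python function applied to every row; the function names are made up, and this brute-force check illustrates the definition rather than how a DBMS enforces it.

```python
from itertools import permutations

def multiply_by_3(x):
    return x * 3          # Update T Set x = x * 3;

def add_2(x):
    return x + 2          # Update T Set x = x + 2;

def serial_results(initial, transactions):
    """Map each serial order (one of the N! permutations) to the table it produces."""
    results = {}
    for order in permutations(transactions):
        values = list(initial)
        for txn in order:
            values = [txn(v) for v in values]
        results[tuple(t.__name__ for t in order)] = values
    return results

def matches_some_serial_order(final, initial, transactions):
    """True if the final table equals the outcome of at least one serial order."""
    return final in serial_results(initial, transactions).values()

initial = [2, 3, 4, 10, 100]
txns = [multiply_by_3, add_2]

results = serial_results(initial, txns)
for order, values in results.items():
    print(order, values)

print(matches_some_serial_order([12, 15, 14, 32, 302], initial, txns))
```

The mixed result from the Consistency Issue slide matches neither serial order, which is why it counts as inconsistent.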

Isolation Levels
Read Uncommitted
Read Committed
Repeatable Read
Serializable
Going down the list, the isolation level gets stronger and avoids more problems.
A callout on the slide marks one of these levels with "That is the default in DBMS".

1- READ UNCOMMITTED
Problems: dirty read (bad), non-repeatable read (bad).
Timeline (time runs downward):
    Session 1: BEGIN TRANSACTION
    Session 2: BEGIN TRANSACTION
    Session 2: select color from cust where id=500;    (returns red)
    Session 1: update cust set color='blue' where id=500;
    Session 2: select color from cust where id=500;    (returns blue)
    Session 1: COMMIT
    Session 2: COMMIT
Session 2 sees blue before Session 1 commits (dirty read), and its two reads of the same row disagree (non-repeatable read).

2- READ COMMITTED
Dirty read: solved. Non-repeatable read: still possible (bad).
Timeline (time runs downward):
    Session 1: BEGIN TRANSACTION
    Session 2: BEGIN TRANSACTION
    Session 2: select color from cust where id=500;    (returns red)
    Session 1: update cust set color='blue' where id=500;
    Session 1: COMMIT
    Session 2: select color from cust where id=500;    (returns blue)
    Session 2: COMMIT
Session 2 only sees committed data, but its two reads of the same row still disagree (non-repeatable read).

2- READ COMMITTED: Phantom (bad)
Timeline (time runs downward):
    Session 1: BEGIN TRANSACTION
    Session 2: BEGIN TRANSACTION
    Session 2: select color from cust where id=500;    (returns red)
    Session 1: delete from cust where id=500;
    Session 1: COMMIT
    Session 2: select color from cust where id=500;    (returns no rows)
    Session 2: COMMIT
The row that Session 2 read earlier disappears within the same transaction (phantom).

NonRepeatable Read Solved Session 2 -------BEGIN TRANSACTION----- select color from cust where id=500; color ------ red select color from cust ----- -----------COMMIT------------ Session 1 -------BEGIN TRANSACTION----- update cust set color='blue' where id=500; -----------COMMIT------------ | V Time

3- REPEATABLE READ: Phantom (For Delete) Solved
Timeline (time runs downward):
    Session 1: BEGIN TRANSACTION
    Session 2: BEGIN TRANSACTION
    Session 2: select color from cust where id=500;    (returns red)
    Session 1: delete from cust where id=500;
    Session 1: COMMIT
    Session 2: select color from cust where id=500;    (still returns red)
    Session 2: COMMIT
The row that Session 2 already read does not disappear during its transaction.

3- REPEATABLE READ: Phantom Insert (bad)
Timeline (time runs downward):
    Session 1: BEGIN TRANSACTION
    Session 2: BEGIN TRANSACTION
    Session 2: select id from cust where color='blue';    (returns no rows)
    Session 1: insert into cust(id, color) values (500, 'blue');
    Session 1: COMMIT
    Session 2: select id from cust where color='blue';    (returns 500)
    Session 2: COMMIT
A new row matching Session 2's predicate appears within the same transaction (phantom insert).

4- SERIALIZABLE: Phantom Solved
Timeline (time runs downward):
    Session 1: BEGIN TRANSACTION
    Session 2: BEGIN TRANSACTION
    Session 2: select id from cust where color='blue';    (returns no rows)
    Session 1: insert into cust(id, color) values (500, 'blue');
    Session 1: COMMIT
    Session 2: select id from cust where color='blue';    (still returns no rows)
    Session 2: COMMIT
Session 2 sees the same result for the same query throughout its transaction.
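The walkthrough above can be condensed into a small lookup table. This sketch encodes only what these slides show, plus the assumption that a weaker level permits every problem a stronger level still permits; the names `POSSIBLE` and `weakest_level_avoiding` are made up, and real DBMS products differ in what each level actually prevents.

```python
# Levels from weakest to strongest, as listed on the Isolation Levels slide.
LEVELS = ["READ UNCOMMITTED", "READ COMMITTED", "REPEATABLE READ", "SERIALIZABLE"]

# Problems still possible at each level, following the slide examples.
# Assumption: READ UNCOMMITTED also permits both phantom cases.
POSSIBLE = {
    "READ UNCOMMITTED": {
        "dirty read", "non-repeatable read", "phantom (delete)", "phantom (insert)",
    },
    "READ COMMITTED": {"non-repeatable read", "phantom (delete)", "phantom (insert)"},
    "REPEATABLE READ": {"phantom (insert)"},
    "SERIALIZABLE": set(),
}

def weakest_level_avoiding(unwanted):
    """Return the weakest isolation level at which none of the unwanted problems occur."""
    for level in LEVELS:
        if not (POSSIBLE[level] & set(unwanted)):
            return level
    return None  # not reached: SERIALIZABLE permits none of the listed problems

print(weakest_level_avoiding(["dirty read"]))
print(weakest_level_avoiding(["non-repeatable read"]))
print(weakest_level_avoiding(["phantom (insert)"]))
```

Choosing the weakest level that rules out the problems an application cares about is the usual trade-off, since stronger levels tend to allow less concurrency.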

Summary of Transactions
A transaction is the unit of work in a DBMS.
It is executed either all or none.
It ensures consistency among many concurrent transactions.
It ensures that data persist once committed (using recovery techniques).
Main ACID properties: Atomicity, Consistency, Isolation, Durability.

END !!!

Friday’s Lecture (Revision + short Quiz)
Final Exam: Dec. 13, 8:15am to 9:30am (75 mins). Closed book, open sheet. Answer in the same exam sheet.
Material included: ERD; SQL (Select, Insert, Update, Delete); Views, Triggers, Assertions; Cursors, Stored Procedures/Functions.
Material excluded: Relational Model & Algebra; Normalization Theory; ODBC/JDBC; Indexes and Transactions.