CS 405G: Introduction to Database Systems 20. Concurrent control, Storage.


10/25/2015 Chen, University of Kentucky

Conflicting operations
Two operations from different transactions on the same data item conflict if at least one of them is a write:
- r(X) and w(X) conflict
- w(X) and r(X) conflict
- w(X) and w(X) conflict
- r(X) and r(X) do not
- r/w(X) and r/w(Y) do not
Order of conflicting operations matters. E.g., if T1.r(A) precedes T2.w(A), then conceptually, T1 should precede T2.
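The conflict rule above can be written as a small predicate. This is a minimal sketch: the `Op` record and the `conflicts` name are illustrative choices, and the requirement that the two operations come from different transactions is an assumption made explicit here.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str   # transaction name, e.g. "T1" (illustrative)
    kind: str  # "r" for read, "w" for write
    item: str  # data item, e.g. "X"


def conflicts(a: Op, b: Op) -> bool:
    """True if a and b are from different transactions, touch the
    same data item, and at least one of them is a write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)
```

For example, `conflicts(Op("T1", "r", "X"), Op("T2", "w", "X"))` is true, while two reads of X, or a read of X and a write of Y, do not conflict.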

Locking
Rules:
- If a transaction wants to read an object, it must first request a shared lock (S mode) on that object
- If a transaction wants to modify an object, it must first request an exclusive lock (X mode) on that object
- Allow one exclusive lock, or multiple shared locks
Compatibility matrix (rows: mode of lock(s) currently held by other transactions; columns: mode of the lock requested; entry: grant the lock?):
         S    X
    S    Y    N
    X    N    N
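The compatibility matrix can be encoded directly as a lookup table. The sketch below is only an illustration; the names `COMPATIBLE` and `can_grant` are hypothetical, and a real lock manager would also keep a queue of waiting requests.

```python
from typing import Iterable

# (mode held by another transaction, mode requested) -> grant the lock?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


def can_grant(requested: str, held_by_others: Iterable[str]) -> bool:
    """Grant only if the requested mode is compatible with every
    lock currently held on the object by other transactions."""
    return all(COMPATIBLE[(held, requested)] for held in held_by_others)
```

With no locks held, any request is granted; a shared request is granted alongside other shared locks; any request involving an exclusive lock on either side is refused.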

Basic locking is not enough
T1: add 1 to both A and B (preserves A=B)
T2: multiply both A and B by 2 (preserves A=B)
Possible schedule under locking:
    T1                           T2
    lock-X(A)
    r(A)       Read 100
    w(A)       Write 100+1
    unlock(A)
                                 lock-X(A)
                                 r(A)       Read 101
                                 w(A)       Write 101*2
                                 unlock(A)
                                 lock-X(B)
                                 r(B)       Read 100
                                 w(B)       Write 100*2
                                 unlock(B)
    lock-X(B)
    r(B)       Read 200
    w(B)       Write 200+1
    unlock(B)
But still not conflict-serializable! The result is A ≠ B.
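One way to see that this schedule is not conflict-serializable is to build a precedence graph (an edge Ti -> Tj for each pair of conflicting operations where Ti's comes first) and look for a cycle. The sketch below assumes the interleaving "T1 on A, then T2 on A and B, then T1 on B" with A = B = 100 initially; the function names are illustrative.

```python
from itertools import combinations


def precedence_edges(schedule):
    """schedule: list of (txn, kind, item) in execution order.
    Returns edges (Ti, Tj) where an operation of Ti conflicts with,
    and precedes, an operation of Tj."""
    edges = set()
    for (t1, k1, x1), (t2, k2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "w" in (k1, k2):
            edges.add((t1, t2))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def visit(u):
        if u in done:
            return False
        if u in visiting:
            return True  # back edge: cycle found
        visiting.add(u)
        found = any(visit(v) for v in graph[u])
        visiting.discard(u)
        done.add(u)
        return found

    return any(visit(u) for u in graph)


# Assumed interleaving of the schedule discussed above.
BAD = [("T1", "r", "A"), ("T1", "w", "A"),
       ("T2", "r", "A"), ("T2", "w", "A"),
       ("T2", "r", "B"), ("T2", "w", "B"),
       ("T1", "r", "B"), ("T1", "w", "B")]

# Replaying the same interleaving on the values.
a = b = 100
a = a + 1   # T1 on A
a = a * 2   # T2 on A
b = b * 2   # T2 on B
b = b + 1   # T1 on B
```

Here the graph has both T1 -> T2 (from A) and T2 -> T1 (from B), so `has_cycle(precedence_edges(BAD))` is true, and the replay ends with A = 202 and B = 201.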

Two-phase locking (2PL)
All lock requests precede all unlock requests
Phase 1: obtain locks, phase 2: release locks
    T1             T2
    lock-X(A)
    r(A)
    w(A)
    lock-X(B)
    unlock(A)
                   lock-X(A)
                   r(A)
                   w(A)
                   lock-X(B)   (cannot obtain the lock on B until T1 unlocks)
    r(B)
    w(B)
    unlock(B)
                   r(B)
                   w(B)
Resulting schedule: T1 r(A) w(A); T2 r(A) w(A); T1 r(B) w(B); T2 r(B) w(B)
2PL guarantees a conflict-serializable schedule
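The two-phase rule is easy to check for a single transaction's sequence of lock and unlock requests. A minimal sketch, with a hypothetical `is_two_phase` helper and actions written as `(action, item)` pairs:

```python
def is_two_phase(actions):
    """actions: one transaction's requests in order, each a pair
    such as ("lock-X", "A") or ("unlock", "A").
    Returns True if no lock request follows an unlock request."""
    seen_unlock = False
    for action, _item in actions:
        if action == "unlock":
            seen_unlock = True
        elif action.startswith("lock") and seen_unlock:
            return False  # acquiring after releasing breaks 2PL
    return True


# T1 releasing A before locking B (basic locking, not two-phase).
T1_BASIC = [("lock-X", "A"), ("unlock", "A"), ("lock-X", "B"), ("unlock", "B")]
# T1 locking B before releasing A (two-phase).
T1_2PL = [("lock-X", "A"), ("lock-X", "B"), ("unlock", "A"), ("unlock", "B")]
```

`is_two_phase(T1_BASIC)` is false and `is_two_phase(T1_2PL)` is true.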

Problem of 2PL
    T1             T2
    r(A)
    w(A)
                   r(A)
                   w(A)
    r(B)
    w(B)
                   r(B)
                   w(B)
    Abort!
- T2 has read uncommitted data written by T1
- If T1 aborts, then T2 must abort as well
- Cascading aborts possible if other transactions have read data written by T2
- Even worse, what if T2 commits before T1? Schedule is not recoverable if the system crashes right after T2 commits
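The recoverability condition hinted at above (a transaction should not commit before the transactions whose data it read) can be checked mechanically. The following is a simplified sketch under stated assumptions: commits are encoded as `(txn, "c", None)`, aborts are not modeled, and `is_recoverable` is an illustrative name.

```python
def is_recoverable(schedule):
    """schedule: list of (txn, kind, item) with kind in {"r", "w", "c"}.
    Returns False if some transaction commits while a transaction
    it has read from is still uncommitted."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # txn -> set of transactions whose writes it read
    committed = set()
    for txn, kind, item in schedule:
        if kind == "w":
            last_writer[item] = txn
        elif kind == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif kind == "c":
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False
            committed.add(txn)
    return True


OPS = [("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A"),
       ("T1", "r", "B"), ("T1", "w", "B"), ("T2", "r", "B"), ("T2", "w", "B")]
T2_COMMITS_FIRST = OPS + [("T2", "c", None), ("T1", "c", None)]
T1_COMMITS_FIRST = OPS + [("T1", "c", None), ("T2", "c", None)]
```

Under this model the schedule where T2 commits first is not recoverable, while the one where T1 commits first is.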

Strict 2PL
- Only release locks at commit/abort time
- A writer will block all other readers until the writer commits or aborts
Used in most commercial DBMS

Deadlock Detection
- Create and maintain a “waits-for” graph
- Periodically check for cycles in graph

Deadlock Detection (Continued)
Example:
    T1: S(A), S(D), S(B)
    T2: X(B), X(C)
    T3: S(D), S(C), X(A)
    T4: X(B)
Waits-for graph over nodes T1, T2, T3, T4: Deadlock!
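A waits-for graph for this example can be derived from the order in which lock requests arrive. The transcript does not preserve the interleaving, so the order in `REQUESTS` below is an assumption (a common textbook ordering), and the helper names are illustrative. The sketch adds an edge from a requester to each holder of an incompatible lock and then searches for a cycle; lock releases are not modeled.

```python
def waits_for_edges(requests):
    """requests: (txn, mode, item) in arrival order, mode "S" or "X".
    Returns edges (waiter, holder). A blocked request is not granted."""
    held = {}     # item -> list of (txn, mode) for granted locks
    edges = set()
    for txn, mode, item in requests:
        blockers = {t for t, m in held.get(item, [])
                    if t != txn and "X" in (m, mode)}
        if blockers:
            edges |= {(txn, b) for b in blockers}
        else:
            held.setdefault(item, []).append((txn, mode))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the waits-for graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())
    visiting, done = set(), set()

    def visit(u):
        if u in done:
            return False
        if u in visiting:
            return True
        visiting.add(u)
        found = any(visit(v) for v in graph[u])
        visiting.discard(u)
        done.add(u)
        return found

    return any(visit(u) for u in graph)


# Assumed arrival order of the requests in the example.
REQUESTS = [("T1", "S", "A"), ("T1", "S", "D"), ("T2", "X", "B"),
            ("T1", "S", "B"), ("T3", "S", "D"), ("T3", "S", "C"),
            ("T2", "X", "C"), ("T4", "X", "B"), ("T3", "X", "A")]
```

With this ordering the edges are T1 -> T2, T2 -> T3, T4 -> T2 and T3 -> T1, so the graph contains the cycle T1 -> T2 -> T3 -> T1.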

Deadlock Prevention
Assign priorities based on timestamps. Say Ti wants a lock that Tj holds. Two policies are possible:
- Wait-die: if Ti has higher priority, Ti waits for Tj; otherwise Ti aborts
- Wound-wait: if Ti has higher priority, Tj aborts; otherwise Ti waits
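Both policies reduce to a comparison of timestamps. The sketch below assumes that a smaller timestamp means an older transaction and therefore higher priority, which is the usual convention but is not stated in the slide; the function names and return strings are illustrative.

```python
def wait_die(ts_requester: int, ts_holder: int) -> str:
    """Ti (requester) wants a lock held by Tj (holder).
    Assumption: smaller timestamp = higher priority."""
    return "requester waits" if ts_requester < ts_holder else "requester aborts"


def wound_wait(ts_requester: int, ts_holder: int) -> str:
    """Same setting; a higher-priority requester 'wounds' the holder."""
    return "holder aborts" if ts_requester < ts_holder else "requester waits"
```

In both policies only the lower-priority transaction is ever aborted, so waits always point in one direction of the priority order and no cycle can form.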

Storage
It’s all about disks!
- That’s why we always draw databases as [figure]
- And why the single most important metric in database processing is the number of disk I/Os performed

The Storage Hierarchy
- Main memory (RAM) for currently used data
- Disk for the main database (secondary storage)
- Tapes for archiving older versions of the data (tertiary storage)
Figure: hierarchy ranging from "Smaller, Faster" at the top to "Bigger, Slower" at the bottom. Source: Operating Systems Concepts, 5th Edition

Jim Gray’s Storage Latency Analogy: How Far Away is the Data?

A typical disk
Figure labels: spindle, spindle rotation, platter, tracks, cylinders, disk arm, arm movement, disk head
“Moving parts” are slow

Disk access time
Sum of:
- Seek time: time for disk heads to move to the correct cylinder
- Rotational delay: time for the desired block to rotate under the disk head
- Transfer time: time to read/write data in the block (= time for disk to rotate over the block)

Random disk access
Seek time + rotational delay + transfer time
- Average seek time
  - Time to skip one half of the cylinders? Not quite; should be time to skip a third of them (why?)
  - “Typical” value: 5 ms
- Average rotational delay
  - Time for a half rotation (a function of RPM)
  - “Typical” value: 4.2 ms (7200 RPM)
- Typical transfer time
  - 0.08 ms per 8 KB block
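The numbers on this slide can be reproduced with a few lines of arithmetic. The sketch below uses illustrative function names. The "one third" figure for the average seek comes out of `avg_seek_fraction` under the assumption that the start and target cylinders are independent and uniformly distributed.

```python
def avg_rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay = time for half a rotation."""
    ms_per_rotation = 60_000.0 / rpm
    return 0.5 * ms_per_rotation


def random_access_ms(seek_ms: float, rpm: float, transfer_ms: float) -> float:
    """Seek time + rotational delay + transfer time."""
    return seek_ms + avg_rotational_delay_ms(rpm) + transfer_ms


def avg_seek_fraction(n_cylinders: int) -> float:
    """Average |i - j| over all ordered pairs of cylinders (i, j),
    expressed as a fraction of the total number of cylinders."""
    n = n_cylinders
    # There are 2 * (n - d) ordered pairs at distance d.
    total = sum(2 * (n - d) * d for d in range(1, n))
    return total / (n * n) / n
```

At 7200 RPM a full rotation takes about 8.33 ms, so the average rotational delay is about 4.17 ms, matching the "typical" 4.2 ms. With a 5 ms seek and 0.08 ms transfer, one random access costs roughly 9.25 ms, and `avg_seek_fraction(1000)` is very close to 1/3.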

Sequential disk access improves performance
Seek time + rotational delay + transfer time
- Seek time: 0 (assuming data is on the same track)
- Rotational delay: 0 (assuming data is in the next block on the track)
Easily an order of magnitude faster than random disk access!
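A rough cost model makes the gap concrete. This sketch reuses the "typical" values of 5 ms seek, 4.2 ms rotational delay and 0.08 ms transfer per block, and assumes (as a simplification) that a sequential read pays the seek and rotational delay once and then only transfer time for every block. The constant and function names are illustrative.

```python
SEEK_MS = 5.0       # assumed typical average seek time
ROTATION_MS = 4.2   # assumed typical average rotational delay
TRANSFER_MS = 0.08  # assumed transfer time per block


def random_read_ms(n_blocks: int) -> float:
    """Every block pays seek + rotational delay + transfer."""
    return n_blocks * (SEEK_MS + ROTATION_MS + TRANSFER_MS)


def sequential_read_ms(n_blocks: int) -> float:
    """One initial seek and rotational delay, then transfer only."""
    return SEEK_MS + ROTATION_MS + n_blocks * TRANSFER_MS
```

For 1000 blocks this model gives about 9280 ms for random access versus about 89 ms for sequential access, roughly a factor of 100.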

Performance tricks
- Disk layout strategy: keep related things (what are they?) close together: same sector/block → same track → same cylinder → adjacent cylinder
- Double buffering: while processing the current block in memory, prefetch the next block from disk (overlap I/O with processing)
- Disk scheduling algorithm
- Track buffer: read/write one entire track at a time
- Parallel I/O: more disk heads working at the same time
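The benefit of double buffering can be estimated with a simple timing model. This is a sketch under idealized assumptions: every block takes the same I/O time and the same processing time, and I/O for the next block fully overlaps with processing of the current one. The function names are illustrative.

```python
def single_buffer_ms(n_blocks: int, io_ms: float, cpu_ms: float) -> float:
    """No overlap: read a block, process it, then read the next."""
    return n_blocks * (io_ms + cpu_ms)


def double_buffer_ms(n_blocks: int, io_ms: float, cpu_ms: float) -> float:
    """Overlap: after the first read, each step costs the slower of
    I/O and processing; the last block still has to be processed."""
    if n_blocks == 0:
        return 0.0
    return io_ms + (n_blocks - 1) * max(io_ms, cpu_ms) + cpu_ms
```

When I/O and processing take about the same time per block, overlapping them nearly halves the total; when one of them dominates, the total approaches that of the slower stage alone.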

Records and Files
A row in a table, when located on disk, is called a record
Two types of record:
- Fixed-length
- Variable-length

- In an abstract sense, a file is a set of “records” on a disk
- In reality, a file is a set of disk pages
- Each record lives on a page
- Physical Record ID (RID): A tuple of

Files
Higher levels of DBMS operate on records, and files of records.
FILE: A collection of pages, containing a collection of records. Record = row in a table
Must support:
- insert/delete/modify record
- fetch a particular record (specified using record id)
- scan all records (possibly with some conditions on the records to be retrieved)
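The required operations can be illustrated with a small in-memory stand-in for a file of records. This is a sketch, not how a DBMS lays out pages on disk: the `HeapFile` class, the fixed number of slots per page, and the use of a (page number, slot number) pair as the record id are all assumptions made for illustration.

```python
class HeapFile:
    """Toy file of records: a list of pages, each a list of slots."""

    def __init__(self, slots_per_page: int = 4):
        self.slots_per_page = slots_per_page
        self.pages = []  # None marks a free slot

    def insert(self, record):
        """Store the record in the first free slot; return its record id."""
        for page_no, page in enumerate(self.pages):
            for slot_no, slot in enumerate(page):
                if slot is None:
                    page[slot_no] = record
                    return (page_no, slot_no)
        # No free slot anywhere: allocate a new page.
        self.pages.append([record] + [None] * (self.slots_per_page - 1))
        return (len(self.pages) - 1, 0)

    def fetch(self, rid):
        page_no, slot_no = rid
        return self.pages[page_no][slot_no]

    def modify(self, rid, record):
        page_no, slot_no = rid
        self.pages[page_no][slot_no] = record

    def delete(self, rid):
        page_no, slot_no = rid
        self.pages[page_no][slot_no] = None

    def scan(self, predicate=lambda record: True):
        """Yield (rid, record) for every stored record matching predicate."""
        for page_no, page in enumerate(self.pages):
            for slot_no, record in enumerate(page):
                if record is not None and predicate(record):
                    yield (page_no, slot_no), record
```

Fetching by record id goes straight to one page, while a scan touches every page; that difference is what the later file organizations try to manage.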

Unordered (Heap) Files
Simplest file structure: contains records in no particular order.
As file grows and shrinks, disk pages are allocated and de-allocated.
To support record level operations, we must:
- keep track of the pages in a file
- keep track of free space on pages
- keep track of the records on a page
There are many alternatives for keeping track of this. We’ll consider 2.

Heap File Implemented as a List
- The header page id and heap file name must be stored someplace: the database “catalog”
- Each page contains 2 “pointers” plus data
Figure: a header page linked to two lists of data pages, one of full pages and one of pages with free space

Heap File Using a Page Directory
- DBMS must remember where the first directory page is located
- The entry for a page can include the number of free bytes on the page
- The directory itself is a collection of pages, shown as a linked list. Much smaller than linked list of all HF pages!
Figure: a directory (starting at the header page) with entries pointing to Data Page 1, Data Page 2, ..., Data Page N
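A page directory that records free bytes per page lets an insert find a suitable page without visiting the data pages themselves. The sketch below keeps the directory as an in-memory list; the `PageDirectory` class, its method names, and the 100-byte page size are illustrative assumptions.

```python
class PageDirectory:
    """Toy directory: entry i holds the free bytes on data page i."""

    def __init__(self, page_size: int = 100):
        self.page_size = page_size
        self.free_bytes = []

    def find_page(self, nbytes: int):
        """Return the first page with at least nbytes free, else None."""
        for page_no, free in enumerate(self.free_bytes):
            if free >= nbytes:
                return page_no
        return None

    def allocate(self, nbytes: int) -> int:
        """Reserve nbytes on some page, adding a new page if needed."""
        if nbytes > self.page_size:
            raise ValueError("record larger than a page")
        page_no = self.find_page(nbytes)
        if page_no is None:
            self.free_bytes.append(self.page_size)
            page_no = len(self.free_bytes) - 1
        self.free_bytes[page_no] -= nbytes
        return page_no
```

Only the directory is consulted when choosing a page, which is the point of keeping a free-byte count in each entry.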

Trade-off
Compared to the linked list, the directory-based approach has:
- Pros: fast access to a particular record
- Cons: extra space

Summary
- Disks provide cheap, non-volatile storage.
- Random access, but cost depends on the location of page on disk; important to arrange data sequentially to minimize seek and rotation delays.