System Catalogue v Stores data that describes each database v meta-data: – conceptual, logical, physical schema – mapping between schemata – info for query.


1 System Catalogue
• Stores data that describes each database
• Meta-data:
  – conceptual, logical, physical schema
  – mapping between schemata
  – info for query optimization, security, authorization, etc.
  – integrity constraints
• Meta-database
  – used by DBAs, designers, users
  – accessed frequently by DBMS modules
  – cf. Data Dictionary
    ◦ more general: also documents the design process and administration info
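
As one concrete illustration of a queryable catalogue, SQLite keeps its schema meta-data in the `sqlite_master` table and exposes column details through `PRAGMA table_info`. The sketch below uses Python's standard `sqlite3` module. The `employee` table and `idx_employee_name` index are hypothetical names chosen for the example, and other DBMSs organize their catalogues differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema, created only so the catalogue has something to describe.
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")

# The catalogue is itself a table: each row describes one schema object.
catalogue = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

# Column-level meta-data: each row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(catalogue)
print(columns)
```

The same catalogue rows are what the DBMS modules themselves (parser, optimizer, authorization checks) would consult when processing a query.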

2 Query Processing and Optimization
• Query processing:
  – scanner: identifies the language components
  – parser: checks the query syntax (grammar)
  – semantic validation
  – query graph (tree): internal representation of the query
• Execution strategy for retrieving data
• Query optimization: choose a suitable execution strategy for processing a query

3 Query Optimization
• Choosing a reasonably efficient strategy
  – navigational language (programmer decides) vs. high-level query language (DBMS decides)
• Access algorithms
  – for relational algebra operations, aggregation, and grouping
  – may be applied to particular storage structures and access paths

4 Query Optimization Techniques
• Heuristic rules:
  – reorder operations in a query tree
  – recursive query decomposition
• Systematic query optimization:
  – cost estimation of each strategy
    ◦ access cost to secondary storage
    ◦ storage cost of intermediate files
    ◦ computation cost: searching, sorting, merging in memory
    ◦ communication cost
  – catalog info used in cost functions:
    ◦ number of records, number of blocks, blocking factor, number of first-level index blocks, selectivity, etc.
• Semantic query optimization
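
To make the role of catalog statistics in cost functions concrete, here is a minimal sketch that compares two access strategies for a selection. It counts only block accesses to secondary storage. The formulas are simplified textbook-style assumptions (a full scan reads every block; a secondary index costs its number of levels plus one block per matching record), and the names `CatalogStats` and `choose_strategy` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class CatalogStats:
    num_records: int      # number of records in the file
    blocking_factor: int  # records per block
    index_levels: int     # levels of the (assumed) secondary index
    selectivity: float    # fraction of records satisfying the condition


def num_blocks(stats):
    return math.ceil(stats.num_records / stats.blocking_factor)


def linear_scan_cost(stats):
    # Assumption: a full scan reads every block once.
    return num_blocks(stats)


def secondary_index_cost(stats):
    # Assumption: traverse the index, then one block access per matching record.
    matching = math.ceil(stats.selectivity * stats.num_records)
    return stats.index_levels + matching


def choose_strategy(stats):
    costs = {
        "linear scan": linear_scan_cost(stats),
        "secondary index": secondary_index_cost(stats),
    }
    return min(costs, key=costs.get), costs


stats = CatalogStats(num_records=8192, blocking_factor=4,
                     index_levels=2, selectivity=2 ** -10)
best, costs = choose_strategy(stats)
print(best, costs)
```

With a highly selective condition the index wins; raising `selectivity` to 0.5 makes the full scan cheaper, which is the kind of trade-off a systematic optimizer evaluates.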

5 Transaction Processing
• Database transaction:
  – a logical unit of database processing (work)
  – an execution of a program that includes database access operations
  – at the data item and disk block access level
• Multiprogramming OS for multiple users
  – interleaved model of concurrent execution
• ACID properties of a transaction (desired)
  – atomicity, consistency, isolation, and durability
• Concurrency control and recovery control

6 Need for Concurrency Control
• Lost update problem
• Temporary update problem
• Incorrect summary (analysis) problem
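
The lost update problem can be reproduced with a hand-written interleaving. The sketch below is a deterministic simulation, not real concurrent code: the item `X`, its starting value 80, and the two updates are hypothetical values chosen for the example.

```python
# Shared database item X (hypothetical starting value).
db = {"X": 80}

# T1 wants to subtract 5 from X; T2 wants to add 4 to X.
t1_local = db["X"]       # T1: read_item(X), sees 80
t2_local = db["X"]       # T2: read_item(X), also sees 80 (T1 has not written yet)
db["X"] = t1_local - 5   # T1: write_item(X), X becomes 75
db["X"] = t2_local + 4   # T2: write_item(X), X becomes 84 and T1's update is lost

interleaved_result = db["X"]
serial_result = 80 - 5 + 4   # what any serial order of T1 and T2 would produce

print(interleaved_result, serial_result)
```

The interleaved run ends with a value that no serial execution of the two transactions could produce, which is why the DBMS must control how operations interleave.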

7 Need for Recovery Control
• System crash
• Transaction or system error
• Local error or execution error
• Concurrency control enforcement
• Disk failure: read-write malfunction
• Physical problems and catastrophes:
  – power failure, fire, etc.

8 ACID Properties
• Atomicity
  – a transaction is either performed in its entirety or not performed at all
• Consistency
  – a correct execution takes the database from one consistent state to another
• Isolation
  – a transaction should not make its updates visible to other transactions until it is committed (can solve the temporary update problem)
• Durability
  – once committed, the changes will not be lost because of a subsequent failure
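
Atomicity can be observed with Python's standard `sqlite3` module, where using the connection as a context manager commits the enclosed statements on success and rolls them back if an exception is raised. This is only a sketch: the `account` table, its `CHECK` constraint, and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one unit; return True if committed."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Fails the CHECK constraint if src would go negative.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "a", "b", 150)  # overdraws 'a', so nothing is applied
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, balances)
```

Although the credit to `b` executed before the failing debit, the rollback removes it, so the database never shows a half-finished transfer.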

9 Schedules of Transactions
• Definition
  – n transactions are executing concurrently in an interleaved fashion
  – the order of execution of operations from the various transactions is called a schedule
  – the operations of Ti in a schedule must appear in the same order in which they occur in Ti
• Conflicts:
  – two operations conflict if they belong to two different transactions, access the same item, and at least one of them is a WRITE operation
• Committed projection of a schedule: C(S)
  – includes only the operations in S that belong to committed transactions

10 Serializability Theory
• Serial:
  – for all T in S, all operations of T are executed consecutively
  – each transaction is independent
• Serializable:
  – a non-serial schedule that is (result) equivalent to some serial schedule of the same n transactions
• Precedence graph (or serialization graph)
  – tests the conflict serializability of a schedule
  – if there is no cycle in the graph, we can create an equivalent serial schedule
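
The precedence-graph test can be sketched in a few lines. The representation is an assumption made for the example: a schedule is a list of `(transaction, action, item)` tuples with action `"r"` or `"w"`, and the function names are hypothetical.

```python
def conflicts(op1, op2):
    """Two operations conflict if they come from different transactions,
    access the same item, and at least one of them is a write."""
    (t1, a1, x1), (t2, a2, x2) = op1, op2
    return t1 != t2 and x1 == x2 and "w" in (a1, a2)


def precedence_graph(schedule):
    """Edge (Ti, Tj) means an operation of Ti precedes and conflicts with one of Tj."""
    edges = set()
    for i, first in enumerate(schedule):
        for second in schedule[i + 1:]:
            if conflicts(first, second):
                edges.add((first[0], second[0]))
    return edges


def has_cycle(edges):
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "visiting":
            return True   # back edge: a cycle exists
        if state.get(node) == "done":
            return False
        state[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        state[node] = "done"
        return False

    return any(visit(node) for node in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


lost_update = [("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"), ("T2", "w", "X")]
serial = [("T1", "r", "X"), ("T1", "w", "X"), ("T2", "r", "X"), ("T2", "w", "X")]
print(is_conflict_serializable(lost_update), is_conflict_serializable(serial))
```

The lost-update interleaving yields edges in both directions between T1 and T2, so its graph has a cycle and no equivalent serial schedule exists.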

11 Concurrency Control & Serializability
• Practically impossible
  – to determine beforehand how the operations of a schedule will be interleaved in order to ensure serializability
• Protocols
  – followed by every individual transaction or enforced by DBMS concurrency control
    ◦ two-phase locking
    ◦ timestamp ordering
    ◦ multiversion
    ◦ optimistic: certification or validation
  – granularity

12 Locking
• Lock: a variable used for synchronizing the access by concurrent transactions to a database item
  – binary locks: locked or unlocked
  – multi-mode locks: shared (read-locked) vs. exclusive (write-locked) locks
• Two-phase locking protocol:
  – all locking operations precede the first unlock operation in the transaction
  – expanding (growing) phase --> shrinking phase
• Problems: deadlock, livelock, starvation
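
The two-phase rule is easy to check mechanically for a single transaction. In this sketch a transaction is assumed to be a list of `(action, item)` pairs with the hypothetical action names `read_lock`, `write_lock`, and `unlock`; it checks only the phase rule and does not model lock conflicts between transactions.

```python
def follows_two_phase_locking(operations):
    """Return True if no lock is acquired after the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True          # the shrinking phase has begun
        elif action in ("read_lock", "write_lock") and shrinking:
            return False              # a lock request after an unlock violates 2PL
    return True


two_phase = [("read_lock", "Y"), ("write_lock", "X"), ("unlock", "Y"), ("unlock", "X")]
not_two_phase = [("read_lock", "Y"), ("unlock", "Y"), ("write_lock", "X"), ("unlock", "X")]
print(follows_two_phase_locking(two_phase), follows_two_phase_locking(not_two_phase))
```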

13 Timestamp Ordering (TO)
• Timestamps:
  – unique identifier created by the DBMS to identify a transaction (transaction start time): ts(T)
  – read_ts(X) and write_ts(X) are kept with each database item X
• Basic TO algorithm
  1) T issues write(X):
     a. if read_ts(X) > ts(T) or write_ts(X) > ts(T), abort T
     b. otherwise, perform the write and set write_ts(X) to ts(T)
  2) T issues read(X):
     a. if write_ts(X) > ts(T), abort T
     b. if write_ts(X) ≤ ts(T), perform the read and set read_ts(X) to the larger of ts(T) and read_ts(X)
• No deadlock, but cascading rollback is possible
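
The two rules of the basic TO algorithm translate almost line for line into code. This is a minimal sketch under stated assumptions: timestamps are positive integers, an item that has never been read or written has timestamp 0, and the class and exception names are hypothetical. It does not restart aborted transactions or handle cascading rollback.

```python
class TransactionAborted(Exception):
    pass


class BasicTimestampOrdering:
    def __init__(self):
        self.values = {}
        self.read_ts = {}    # item -> largest ts of any transaction that read it
        self.write_ts = {}   # item -> ts of the transaction that last wrote it

    def write(self, ts, item, value):
        # Rule 1: a younger transaction has already read or written the item.
        if self.read_ts.get(item, 0) > ts or self.write_ts.get(item, 0) > ts:
            raise TransactionAborted(f"write({item}) rejected for ts={ts}")
        self.values[item] = value
        self.write_ts[item] = ts

    def read(self, ts, item):
        # Rule 2: a younger transaction has already written the item.
        if self.write_ts.get(item, 0) > ts:
            raise TransactionAborted(f"read({item}) rejected for ts={ts}")
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return self.values.get(item)


to = BasicTimestampOrdering()
to.write(1, "X", 10)          # T1 (ts=1) writes X
value = to.read(2, "X")       # T2 (ts=2) reads X, so read_ts(X) becomes 2
try:
    to.write(1, "X", 20)      # T1 writes again, but read_ts(X) > ts(T1)
    aborted = False
except TransactionAborted:
    aborted = True
print(value, aborted)
```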

14 Optimistic Concurrency Control
• Unlike locking or TO, no checking is done during transaction execution
  – updates are applied to local copies; at the end of transaction execution, a validation phase checks whether any of the transaction's updates violate serializability
  – read phase --> validation phase --> write phase
• Assumes little interference
• Timestamps on write_set and read_set
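
The read, validation, and write phases can be sketched as follows. This is one simplified validation scheme chosen for illustration (a transaction fails if any transaction that committed after it started wrote an item it read), with a commit counter standing in for timestamps. The class `OptimisticDB` and its methods are hypothetical.

```python
class OptimisticDB:
    def __init__(self, data):
        self.data = dict(data)
        self.commit_counter = 0
        self.committed = []  # list of (commit_number, write_set)

    def begin(self):
        return {"start": self.commit_counter, "reads": set(), "writes": {}}

    def read(self, txn, item):
        # Read phase: record the read set; prefer the transaction's own local copy.
        txn["reads"].add(item)
        return txn["writes"].get(item, self.data[item])

    def write(self, txn, item, value):
        # Read phase: updates go to a local copy only.
        txn["writes"][item] = value

    def commit(self, txn):
        # Validation phase: did anyone who committed after we started
        # write something we read?
        for number, write_set in self.committed:
            if number > txn["start"] and write_set & txn["reads"]:
                return False
        # Write phase: apply the local copies to the database.
        self.data.update(txn["writes"])
        self.commit_counter += 1
        self.committed.append((self.commit_counter, set(txn["writes"])))
        return True


db = OptimisticDB({"X": 80})
t1, t2 = db.begin(), db.begin()
db.write(t1, "X", db.read(t1, "X") - 5)
db.write(t2, "X", db.read(t2, "X") + 4)
first, second = db.commit(t1), db.commit(t2)
print(first, second, db.data)
```

Here the second transaction fails validation instead of silently overwriting the first one's update; a real system would then restart it.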

15 Transaction States & Operations
• For recovery purposes, transaction states need to be recorded in a system log
• Transaction states
  – BEGIN_TRANSACTION
  – READ or WRITE
  – END_TRANSACTION
  – COMMIT_TRANSACTION
  – ROLLBACK (or ABORT) vs. UNDO
    ◦ one transaction vs. one operation
  – REDO: redo certain operations to make sure

16 Recovery Techniques
• Deferred update (database updated after commit)
  – NO-UNDO/REDO algorithm
• Immediate update (database may be updated before commit)
  – UNDO/REDO or UNDO/NO-REDO algorithm
• In-place updating vs. shadowing
  – write-ahead logging (WAL) for in-place updating
  – before image (BFIM) + after image (AFIM) for shadow paging
• O.S.: buffering and caching -> DBMS cache
• Two-phase commit protocol for multidatabase systems
  – phase 1: prepare to commit, ready to commit
  – phase 2: all O.K., “commit”
• Database backup: periodical vs. incremental
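
An UNDO/REDO pass over a log of before and after images can be sketched as below. The log record layout, the `recover` function, and the crash scenario are assumptions made for the example; checkpoints and concurrent writers on the same item are ignored.

```python
def recover(disk, log):
    """Restore disk to a state reflecting exactly the committed transactions.

    Assumed log records: ("start", T), ("write", T, item, BFIM, AFIM), ("commit", T).
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # UNDO: scan backward, restoring before images of uncommitted writes.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _txn, item, bfim, _afim = rec
            disk[item] = bfim

    # REDO: scan forward, reapplying after images of committed writes.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _txn, item, _bfim, afim = rec
            disk[item] = afim

    return disk


log = [
    ("start", "T1"),
    ("write", "T1", "X", 80, 75),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "Y", 10, 5),   # T2 never committed
]
# Hypothetical state at crash time: T1's write never reached disk, T2's did.
disk = {"X": 80, "Y": 5}
recovered = recover(disk, log)
print(recovered)
```

Under deferred update the UNDO loop would have nothing to do, and under an UNDO/NO-REDO scheme the REDO loop would be unnecessary, matching the algorithm names on the slide.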

17 Commit point vs. Checkpoint
• Commit point:
  – all operations in a transaction have been successfully executed and recorded in the system log
  – force-write the log file (to disk)
• Checkpoints
  – a checkpoint record is written into the log periodically, at the point when the system writes out to the database on disk the effects of all WRITE operations of committed transactions
  – the recovery manager decides at what intervals to take a checkpoint, measured in minutes or in number of committed transactions
  – a checkpoint record can contain information such as the list of active transaction IDs, the location of active transactions, etc.

18 Recoverability
• A schedule S is said to be recoverable if no transaction T in S commits until all transactions T' that have written an item that T reads have committed
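
The definition can be checked directly against a schedule. In this sketch a schedule is assumed to be a list of `(transaction, action, item)` tuples with actions `"r"`, `"w"`, and `"c"` (commit, with item `None`). "T reads from T'" is taken to mean that T' was the most recent writer of the item when T read it, and aborts are not modeled. The function name is hypothetical.

```python
def is_recoverable(schedule):
    last_writer = {}     # item -> transaction that wrote it most recently
    reads_from = {}      # T -> set of transactions T' whose writes T has read
    committed = set()

    for txn, action, item in schedule:
        if action == "w":
            last_writer[item] = txn
        elif action == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif action == "c":
            # T may commit only after every T' it read from has committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True


bad = [("T1", "w", "X"), ("T2", "r", "X"), ("T2", "c", None), ("T1", "c", None)]
good = [("T1", "w", "X"), ("T2", "r", "X"), ("T1", "c", None), ("T2", "c", None)]
print(is_recoverable(bad), is_recoverable(good))
```

In the first schedule T2 commits a value it read from the still-uncommitted T1; if T1 later failed, T2's committed result could not be undone.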

