Published by Jordan Ray. Modified over 8 years ago.
1
6340 DBMS Components
2
DBMS
– OS, application, middleware
– Components: storage, query optimizer, recovery manager, transaction processor, security
3
DBMS Storage
– Linear storage/order
– Storage hierarchy, random/sequential access, disk, block bfr, seek algorithms, buffer, ordered (bins)/non-ordered files (pile), sorting (internal, external)
– Column store, RAID
– Records: fixed/variable, unspanned/spanned
4
DBMS indexing
– PI, SI
– Single-level, multilevel
– Indexing: hashing (regular, dynamic, overflow), clustered, B-tree, bitmap, R-tree
5
Query Optimizer
– Query plan
– Optimal plan
– Query trees
– Order of operators
6
Query Optimization
– Heuristic: transformation rules
– Cost-based
7
Physical operators
– Primitive: block seek/read/write
– Table scan; table search (linear, binary)
– Index search; index scan
– Sort
– Hashing
8
Cost-based optimization
– Cost models: I/Os, memory access, CPU, channel transfer; temporary tables
– Intermediate table sizes; uniform pdf, skewed pdf
9
Estimates
– Domain cardinality: |pi_A(T)|
– r(T): |T|
– R(T): record size
– bfr
– Statistics
– pdfs
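A minimal sketch of how these estimates combine, assuming fixed-length unspanned records and a uniform distribution of values. The function names, the 4096-byte block size, and the sample numbers are illustrative assumptions, not taken from the slides.

```python
import math

def blocking_factor(block_size: int, record_size: int) -> int:
    """bfr = floor(B / R): whole records per block (unspanned, fixed-length)."""
    return block_size // record_size

def blocks_needed(num_records: int, bfr: int) -> int:
    """Number of blocks b = ceil(r / bfr) occupied by r records."""
    return math.ceil(num_records / bfr)

def rows_matching_equality(num_records: int, domain_cardinality: int) -> float:
    """Expected rows for A = constant under a uniform pdf: r / |pi_A(T)|."""
    return num_records / domain_cardinality

# Hypothetical table: r(T) = 10,000 records, R(T) = 100 bytes, 50 distinct values of A.
bfr = blocking_factor(4096, 100)
b = blocks_needed(10_000, bfr)
est = rows_matching_equality(10_000, 50)
print(bfr, b, est)
```

With a skewed pdf the last estimate no longer holds, which is why optimizers keep statistics such as histograms.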
10
One pass vs Two pass
– One pass: load smaller table in memory
– Two pass: block sort in memory, sort on disk, hashing
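The two-pass idea for sorting can be sketched as follows. This is a simplified in-memory simulation: Python lists stand in for disk runs, and `memory_capacity` (a hypothetical parameter) counts records rather than blocks.

```python
import heapq

def two_pass_sort(records, memory_capacity):
    """Pass 1: sort memory-sized runs. Pass 2: merge all runs in one sweep."""
    # Pass 1: each chunk fits in memory, is sorted there, and is written out as a run.
    runs = [sorted(records[i:i + memory_capacity])
            for i in range(0, len(records), memory_capacity)]
    # A single merge pass needs one input slot per run plus one output slot.
    if len(runs) > memory_capacity - 1:
        raise ValueError("too many runs for a single merge pass")
    # Pass 2: k-way merge of the sorted runs.
    return list(heapq.merge(*runs))

print(two_pass_sort([5, 3, 8, 1, 9, 2, 7], memory_capacity=4))
```

If the input is too large for one merge pass, the sketch raises an error instead of adding further passes.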
11
Join reordering
– |Ti|
– R=Ti*Tj first, then R*Tk
– Incorrectly formed theta-joins; cross products
– Left, right outer joins are not commutative
12
Join algorithms
– Nested-loop: memory, disk
– Single-loop=index-based
– Merge-sort: cheap sort or if sorted; block pairs
– Hashing: 2 phases, memory, M partitioning, uniform/skewed partitions
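As an illustration, here is a sketch of a nested-loop join next to a two-phase partition hash join. Rows are plain tuples and join keys are column positions; the table contents and the default of `m=4` partitions are made-up assumptions.

```python
from collections import defaultdict

def nested_loop_join(outer, inner, key_outer, key_inner):
    """Compare every outer row with every inner row."""
    return [(o, i) for o in outer for i in inner if o[key_outer] == i[key_inner]]

def partition_hash_join(left, right, key_left, key_right, m=4):
    """Phase 1: split both inputs into m partitions. Phase 2: join partition pairs."""
    left_parts, right_parts = defaultdict(list), defaultdict(list)
    for row in left:
        left_parts[hash(row[key_left]) % m].append(row)
    for row in right:
        right_parts[hash(row[key_right]) % m].append(row)

    result = []
    for p, left_rows in left_parts.items():
        # Build an in-memory hash table on one partition, probe with the matching one.
        table = defaultdict(list)
        for l in left_rows:
            table[l[key_left]].append(l)
        for r in right_parts.get(p, []):
            for l in table.get(r[key_right], []):
                result.append((l, r))
    return result

emp = [(1, "Ann", 10), (2, "Bob", 20), (3, "Cy", 10)]
dept = [(10, "R&D"), (20, "Ops"), (30, "HR")]
print(sorted(partition_hash_join(emp, dept, 2, 0)))
```

Rows with equal keys always land in the same partition, so only matching partition pairs need to be joined. Skewed key distributions make some partitions much larger than others.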
13
Relational operators implementation
– Combining joins with operators
– Project: duplicate elimination
– Set ops: sort-merge, hashing
– Aggregations:
  – Entire table
  – Partition: sorting, hashing
– Outer joins: relational operators or extend join algorithms
14
Pushing relational ops
– Push select: as early as possible, depending on indexing; nested selects = AND; through union/intersection
– Push project: with required attributes; not convenient when there are indexes
– Push aggregation through join: depends on query semantics
15
Concurrency control
– Interleaved vs parallel processing
– Ops: r(X), w(X)
– Lost update; dirty read
– Atomic statement
– Binary, shared locks
– Well-formed transactions
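The lost-update problem can be reproduced with a small simulation of r(X) and w(X) operations. The schedule encoding `(transaction, op, item, delta)` and the account-style numbers are assumptions made for illustration.

```python
def execute(schedule, initial):
    """Run r/w operations in order; each transaction writes from its own stale copy."""
    db = dict(initial)
    local = {}  # (txn, item) -> value the transaction read earlier
    for txn, op, item, delta in schedule:
        if op == "r":
            local[(txn, item)] = db[item]
        elif op == "w":
            db[item] = local[(txn, item)] + delta
    return db

# T1 subtracts 10 from X, T2 adds 20 to X.
serial = [("T1", "r", "X", 0), ("T1", "w", "X", -10),
          ("T2", "r", "X", 0), ("T2", "w", "X", 20)]
interleaved = [("T1", "r", "X", 0), ("T2", "r", "X", 0),
               ("T1", "w", "X", -10), ("T2", "w", "X", 20)]

print(execute(serial, {"X": 100}))       # both updates applied
print(execute(interleaved, {"X": 100}))  # T1's update is overwritten
```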
16
Transaction processor
– Serializability: serial S, conflict, precedence graph P
– SQL: BT implicit; ET explicit
– Locking vs timestamping
– Deadlock, starvation
– Granularity: row, column, block, table
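A sketch of the precedence-graph test for conflict serializability. The schedule is assumed to be a list of `(transaction, op, item)` tuples with `op` equal to `"r"` or `"w"`; that encoding is an assumption of this example.

```python
def precedence_graph(schedule):
    """Edge Ti -> Tj when an op of Ti conflicts with a later op of Tj on the same item."""
    edges = set()
    for i, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[i + 1:]:
            if ti != tj and x_i == x_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def is_conflict_serializable(schedule):
    """The schedule is conflict serializable iff the precedence graph is acyclic."""
    edges = precedence_graph(schedule)
    nodes = {t for t, _, _ in schedule}
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    # Kahn's algorithm: repeatedly remove nodes that have no incoming edges.
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for u, v in edges:
            if u == n:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return removed == len(nodes)

lost_update = [("T1", "r", "X"), ("T2", "r", "X"), ("T1", "w", "X"), ("T2", "w", "X")]
print(precedence_graph(lost_update), is_conflict_serializable(lost_update))
```

When the graph is acyclic, any topological order of its nodes gives an equivalent serial schedule.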
17
Transaction
– States: active, partially committed, committed, failed, terminated
– Final operation: commit, rollback
– ACID properties
– Isolation levels: read uncommitted, read committed, repeatable read, serializable
– Schedules:
  – Serial, serializable; total ordering vs correctness
  – Result, conflict (<), conflict serializable equivalent (< for some S’)
18
2PL
– Shared locks; upgradeable locks
– 2 phases: no overlap
– Conservative (lock all), basic, strict (unlock all)
– Deadlock: wait-die, wound-wait
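The two deadlock-prevention rules differ only in who gives way. A minimal sketch, assuming smaller timestamps mean older transactions; the function names and returned strings are illustrative.

```python
def wait_die(ts_requester: int, ts_holder: int) -> str:
    """Older requester waits; younger requester dies (aborts and restarts later)."""
    return "requester waits" if ts_requester < ts_holder else "requester aborts"

def wound_wait(ts_requester: int, ts_holder: int) -> str:
    """Older requester wounds (aborts) the holder; younger requester waits."""
    return "holder aborts" if ts_requester < ts_holder else "requester waits"

# T(5) is older than T(9).
print(wait_die(5, 9), "|", wait_die(9, 5))
print(wound_wait(5, 9), "|", wound_wait(9, 5))
```

In both schemes only the younger transaction is ever aborted. An aborted transaction is usually restarted with its original timestamp so that it eventually becomes the oldest and avoids starvation.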
19
Timestamping
– Two TS: read_TS(X), write_TS(X)
– Starvation possible
– Basic: restart when *_TS(X)>TS(T); else execute T and update *_TS(X)
– Strict: < & delay T until T’ ends
– Thomas: T proceeds without writing when TS(T)<write_TS(X)
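A sketch of the basic timestamp-ordering checks with an optional Thomas write rule. The `Item` class, the return strings, and the `thomas` flag are assumptions of this example. Strict timestamp ordering, which delays T until T’ ends, is not modeled.

```python
class Item:
    """A data item X with its read_TS(X) and write_TS(X)."""
    def __init__(self):
        self.read_ts = 0
        self.write_ts = 0
        self.value = None

def read(item: Item, ts: int) -> str:
    # A younger transaction already wrote X: T would read a value from its future.
    if item.write_ts > ts:
        return "restart"
    item.read_ts = max(item.read_ts, ts)
    return "ok"

def write(item: Item, ts: int, value, thomas: bool = False) -> str:
    # A younger transaction already read X: T's write would invalidate that read.
    if item.read_ts > ts:
        return "restart"
    if item.write_ts > ts:
        # Thomas write rule: the write is obsolete, so skip it and let T proceed.
        return "skip" if thomas else "restart"
    item.value = value
    item.write_ts = ts
    return "ok"

x = Item()
print(write(x, 5, "a"), read(x, 3), write(x, 3, "b"), write(x, 3, "b", thomas=True))
```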
20
Further concurrency control
– Multi-version
– Optimistic
21
Recovery manager
– Reasons:
  – Hardware failures: disk, memory, power
  – Software: unhandled errors, exceptions, transaction abort
  – Physical events
– Checkpoint; timeline
– Cascading rollbacks
– Redo/undo; idempotent
– Log: WAL
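A simplified sketch of undo/redo recovery from a write-ahead log. The log record layout `("write", txn, item, old, new)` and `("commit", txn)` is an assumption. The schedule is assumed to be strict, and checkpoints are ignored.

```python
def recover(db, log):
    """Undo writes of uncommitted transactions, then redo writes of committed ones."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    state = dict(db)
    # Undo: scan the log backwards and restore old values.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _txn, item, old, _new = rec
            state[item] = old
    # Redo: scan the log forwards and reapply new values.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _txn, item, _old, new = rec
            state[item] = new
    return state

log = [("write", "T1", "X", 1, 5), ("write", "T2", "Y", 2, 99), ("commit", "T1")]
disk_after_crash = {"X": 1, "Y": 99}  # T1's write not flushed, T2's write flushed
once = recover(disk_after_crash, log)
twice = recover(once, log)
print(once, twice)
```

Running recovery a second time leaves the state unchanged. That is the idempotence property: it makes a crash during recovery itself harmless.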
22
Recovery Manager Schedule Classification
– Recoverable: T commits only after every T’ whose written items T has read has committed
– Cascadeless: read only from committed transactions
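Both definitions can be checked mechanically. In this sketch a schedule is a list of `(transaction, op, item)` tuples with `op` in `"r"`, `"w"`, `"c"` (commit, with item `None`). The encoding is an assumption, and aborts are not modeled.

```python
def classify(schedule):
    """Return (recoverable, cascadeless) for a schedule without aborts."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # reader -> set of transactions it read from
    committed = set()
    recoverable = cascadeless = True
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
                if writer not in committed:
                    cascadeless = False  # read of uncommitted data
        elif op == "c":
            # Recoverable: everyone this transaction read from has already committed.
            if any(w not in committed for w in reads_from.get(txn, ())):
                recoverable = False
            committed.add(txn)
    return recoverable, cascadeless

s1 = [("T1", "w", "X"), ("T2", "r", "X"), ("T2", "c", None), ("T1", "c", None)]
s2 = [("T1", "w", "X"), ("T2", "r", "X"), ("T1", "c", None), ("T2", "c", None)]
s3 = [("T1", "w", "X"), ("T1", "c", None), ("T2", "r", "X"), ("T2", "c", None)]
print(classify(s1), classify(s2), classify(s3))
```

Every cascadeless schedule is also recoverable, but not the other way around. Here `s2` is recoverable yet still exposed to cascading rollback if T1 were to abort.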