Download presentation
Presentation is loading. Please wait.
1
...Looking back Why use a DBMS? How to design a database? How to query a database? How does a DBMS work?
2
Why do we need to use a DBMS? DBMS vs File Systems DBMS offers a lot of functionality that makes Apps easier to develop –data integrity –query processing algorithms optimization indexes data storage –transactions concurrency atomicity
3
Database Design Conceptual - Logical - Physical models Conceptual: Entity Relationship diagram –great tool for putting design on paper –types of attributes, relationships –designing an E-R diagram for an application –Translating it into Relational Schema
4
Logical Design Relational Databases –efficient, precise way to convey data to the DBMS Normalization –good database design eliminates redundancy –uses decomposition lossless-join dependency preserving Boyce-Codd Normal Form –the “purest” normal form - no redundancy at all 3NF –allows some non-key dependencies PRACTICE: balance decomposition with query joins.
5
Integrity Constraints Domain constraints Key declarations Referential integrity –Emp.DeptName ----> Dept.DeptName Assertions –say that something can not happen Triggers –what to do if something happens
6
Functional Dependencies Relationships between attributes in a relation A -> B means that A determines B –if t.A = s.A, then t.B = s.B (t,s are tuples) Closure of dependencies –all dependencies derivable from the given set –use Armstrongs axioms Closure of attribute sets –simple transitive closure algorithm Canonical cover –eliminate redundant attributes from the dependency Practice: look for dependencies and their effect on updates
7
Query Languages SQL –Intergalactic language in databases –Select.. From.. Where.. Group By.. Having.. Order By.. Relational Algebra –Concise way to express queries –Used inside a DBMS for translation, optimization Practice: know your SQL, it helps.
8
Data Storage Databases mainly rely on Disks and Main Memory Buffer manager used for keeping data in memory Representation of attributes, tuples –fixed-length representation –heap representation (pointers pointing to the data) Tables –Heap (unordered set of tuples) –Sorted on some attribute –Indexed
9
Index Structures ISAM –static multi-level tree structure –gets bad as updates happen B+Trees –dynamic, balanced tree –update algorithm –helps in range, equality selections, joins Hash –use a hash function to determine where a tuple stays –helps only in equality predicate selections and join queries
10
Query Processing Translating SQL query into disk/memory accesses. Key component: Query Optimizer –selecting the best access plan for the query Multiple choices for execution –selection: heap/index –join: join algorithms (nested loops, merge join), heap/index Optimizer uses cost estimates to pick the cheapest plan –what is the best plan for select * from R, S where R.a = S.a?
11
Transactions Atomicity Consistency Isolation Durability Extremely important and complex.
12
Isolation How do we schedule concurrent transactions Serial is good, but slow Serializable is good and fast Check for serializability of a schedule –look for cycles in the precedence graph
13
Durability/Atomicity Shadow database scheme works, but too costly –make a copy of the database before updates Practice: use Commit and Recovery algorithms –keep a Log of what updates we did to the data –Push the Log and the Commit flag to the disk –On crash and recovery, reconstruct the data from the Log
14
Looking Forward... OLAP / Data Warehousing / Decision Support –fast query processing –new operators (data cube etc.) Data Mining Geographical and Multimedia Databases World Wide Web –XML
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.