CS405G: Introduction to Database Systems


1 CS405G: Introduction to Database Systems

2 Announcement
Today: Review
Friday: Go over homework; Course evaluation

3 Materials Review for final
Book
Slides (should all be on the course website)
Homework
Quizzes
Mid-Term

4 Database Design
Jinze Liu @ University of Kentucky, 11/16/2018

5 E-R model
Entities
Attributes
Relationships


7 Database Design

8 From E-R Diagram to Relations
Schemas
Converting E-R diagram to relations
Keys
  Super keys
  Candidate keys
  Primary keys
Relational integrity constraints

9 Key Constraints
Superkey (uniqueness constraint): a set of attributes on which no two distinct tuples can have the same values
  Every relation has at least one superkey: the set of all attributes
Key: a minimal superkey
  Uniqueness constraint: it is a superkey
  Minimality constraint: no attribute can be removed while still satisfying the uniqueness constraint
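
The two definitions on this slide can be tested against a single relation instance. The following is a minimal Python sketch, assuming a relation stored as a list of dicts; the `students` data and the helper names `is_superkey` and `candidate_keys` are illustrative and not part of the course material. An instance can only refute a key: a real key constraint must hold on all valid instances, so a set reported here is merely consistent with being a key.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two distinct tuples in rows agree on all of attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal superkeys of this instance, smallest first."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            # A superset of a key already found is a superkey but not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical instance; rows are assumed distinct, as in a relation.
students = [
    {"sid": 1, "name": "Ann", "dept": "CS"},
    {"sid": 2, "name": "Bob", "dept": "CS"},
    {"sid": 3, "name": "Ann", "dept": "EE"},
]
ATTRS = ("sid", "name", "dept")
```

On this instance `is_superkey(students, ATTRS)` holds, as it does for the full attribute set of any relation, while `("name",)` fails because two tuples share the name "Ann".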

10 Relational Integrity Constraints
Constraints are conditions that must hold on all valid relation instances. There are four main types of constraints:
Domain constraints
  The value of an attribute must come from its domain
Key constraints
Entity integrity constraints
Referential integrity constraints

11 Database Normalization
Functional Dependency
Functional Closure
Keys Redefined
  Based on functional dependency
DB Normal Forms
  1st, 2nd, 3rd, BCNF

12 Three Types of non-key FD
[Figure: three diagrams of an FD X -> A drawn relative to a key, captioned "Partial dependency", "Transitive dependency I", and "Transitive dependency II"]
11/16/2018 Luke Huan Univ. of Kansas

13 3NF
R is in Third Normal Form (3NF) if for every non-trivial FD X -> A (where A is a single attribute), either
  X is a superkey of R, or
  A is a member of at least one key of R
Intuitively, BCNF decomposition on X -> A would "break" the key containing A
[Figure: partial dependency, transitive dependency I, and transitive dependency II, labeled 2NF, 3NF, and BCNF]
2NF ⊇ 3NF ⊇ BCNF
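
The 3NF condition can be checked directly from a set of FDs. The sketch below assumes FDs are given as pairs of attribute sets; the helper names (`fd`, `closure`, `keys`, `violations_3nf`) and both example schemas are illustrative assumptions, not taken from the slides. It finds keys by brute force over attribute subsets, so it is only suitable for small schemas, and it tests the FDs it is given rather than every implied FD.

```python
from itertools import combinations

def fd(lhs, rhs):
    """Build an FD from space-separated attribute names, e.g. fd("a b", "c")."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))

def closure(attrs, fds):
    """Attribute closure of attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    found = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(k <= candidate for k in found):
                continue  # contains a smaller key, so not minimal
            if closure(candidate, fds) == set(schema):
                found.append(candidate)
    return found

def violations_3nf(schema, fds):
    """Given FDs X -> A where X is not a superkey and A is in no key."""
    prime = set().union(*keys(schema, fds))  # attributes in at least one key
    bad = []
    for lhs, rhs in fds:
        x_is_superkey = closure(lhs, fds) == set(schema)
        for a in sorted(rhs - lhs):  # rhs - lhs skips the trivial part
            if not x_is_superkey and a not in prime:
                bad.append((set(lhs), a))
    return bad

# Hypothetical schemas for illustration.
address = {"street", "city", "zip"}
address_fds = [fd("street city", "zip"), fd("zip", "city")]

student = {"sid", "name", "dept", "building"}
student_fds = [fd("sid", "name dept"), fd("dept", "building")]
```

Here `address` has no 3NF violation even though `zip -> city` has a non-superkey left side, because `city` belongs to the key `{street, city}`. In `student`, `dept -> building` is reported: `dept` is not a superkey and `building` is in no key.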

14 Database Query

15 Relational Algebra and SQL
SQL query
  SFW
  Group by …, Having
  Subqueries
Relationship between R.A. and SQL

16 Relational algebra
A language for querying relational databases based on operators
Core set of operators: selection, projection, cross product, union, difference, and renaming
Additional, derived operators: join, natural join, intersection, etc.
Compose operators to make complex queries
We are gonna cover this in one day! Possible because of the minimalist approach.

17 Summary of core operators
Selection: σp R
Projection: πL R
Cross product: R X S
Union: R U S
Difference: R - S
Renaming: ρ S(A1, A2, …) R
  Does not really add "processing" power

18 Summary of derived operators
Join: R ⋈p S
Natural join: R ⋈ S
Intersection: R ∩ S
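
Calling these operators "derived" means each can be written using only the core ones. The following Python sketch illustrates that under simplifying assumptions: a relation is modeled as a pair `(attribute names, set of tuples)`, attribute names of the two inputs to a cross product are assumed disjoint (renaming exists to arrange this), and all function and table names are hypothetical.

```python
def select(rel, pred):
    """Selection: keep the rows for which pred(row-as-dict) is true."""
    attrs, rows = rel
    return attrs, {r for r in rows if pred(dict(zip(attrs, r)))}

def project(rel, keep):
    """Projection onto the attributes in keep; the set removes duplicates."""
    attrs, rows = rel
    idx = [attrs.index(a) for a in keep]
    return tuple(keep), {tuple(r[i] for i in idx) for r in rows}

def cross(r, s):
    """Cross product: every row of r concatenated with every row of s."""
    return r[0] + s[0], {x + y for x in r[1] for y in s[1]}

def union(r, s):
    return r[0], r[1] | s[1]

def difference(r, s):
    return r[0], r[1] - s[1]

def rename(rel, new_attrs):
    """Renaming changes attribute names only, never the rows."""
    return tuple(new_attrs), rel[1]

# Derived operators, written with core operators only.
def join(r, s, pred):
    return select(cross(r, s), pred)        # selection over a cross product

def intersect(r, s):
    return difference(r, difference(r, s))  # R - (R - S)

# Hypothetical data.
student = (("sid", "name"), {(1, "Ann"), (2, "Bob")})
enroll = (("esid", "course"), {(1, "CS405"), (2, "CS315"), (1, "CS315")})

joined = join(student, enroll, lambda t: t["sid"] == t["esid"])
in_cs405 = project(select(joined, lambda t: t["course"] == "CS405"), ["name"])
```

Natural join is omitted from the sketch; it would additionally equate the common attributes and project away the duplicate columns.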

19 Classification of relational operators
Monotone:
  Selection: σp R
  Projection: πL R
  Cross product: R X S
  Join: R ⋈p S
  Natural join: R ⋈ S
  Union: R U S
  Intersection: R ∩ S
Monotone w.r.t. R; non-monotone w.r.t. S:
  Difference: R - S

20 Update Operations on Relations
INSERT a tuple.
DELETE a tuple.
MODIFY a tuple.
Constraints should not be violated in updates
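
To see a DBMS refuse updates that would violate constraints, here is a small sketch using Python's standard `sqlite3` module with an in-memory database. The `dept` and `student` tables and the `try_update` helper are made up for illustration. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE dept (dname TEXT PRIMARY KEY)")
conn.execute("""CREATE TABLE student (
    sid   INTEGER PRIMARY KEY,
    dname TEXT NOT NULL REFERENCES dept(dname))""")
conn.execute("INSERT INTO dept VALUES ('CS')")
conn.execute("INSERT INTO student VALUES (1, 'CS')")
conn.commit()  # make the starting state permanent before experimenting

def try_update(sql):
    """Run one update; report whether the DBMS accepted or rejected it."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_update("INSERT INTO student VALUES (1, 'CS')"),      # key constraint: sid 1 exists
    try_update("INSERT INTO student VALUES (2, 'EE')"),      # referential integrity: no dept 'EE'
    try_update("DELETE FROM dept WHERE dname = 'CS'"),       # referential integrity: still referenced
    try_update("UPDATE student SET sid = 5 WHERE sid = 1"),  # MODIFY that violates nothing
]
```

Only the last statement changes the database; the three rejected updates leave both tables as they were.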

21 Basic queries: SFW statement
SELECT A1, A2, …, An FROM R1, R2, …, Rm WHERE condition;
Also called an SPJ (select-project-join) query
(Almost) equivalent to relational algebra query π A1, A2, …, An (σ condition (R1 X R2 X … X Rm))

22 Semantics of SFW
SELECT E1, E2, …, En FROM R1, R2, …, Rm WHERE condition;
For each t1 in R1:
  For each t2 in R2: … …
    For each tm in Rm:
      If condition is true over t1, t2, …, tm:
        Compute and output E1, E2, …, En as a row
t1, t2, …, tm are often called tuple variables
Not 100% correct, we will see
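
The nested-loop reading of SFW can be run directly. This is a sketch of those semantics only, not how a real DBMS evaluates queries; relations are assumed to be lists of dict rows, and the name `sfw` and the sample tables are hypothetical.

```python
from itertools import product

def sfw(select_exprs, from_tables, where):
    """Evaluate SELECT E1..En FROM R1..Rm WHERE condition by brute force.

    select_exprs: one function per output column, each taking t1..tm
    from_tables:  list of relations, each a list of dict rows
    where:        function taking the tuple variables t1..tm, returning bool
    """
    output = []
    # product(...) enumerates every combination (t1, ..., tm),
    # which is exactly the m nested "for each" loops.
    for ts in product(*from_tables):
        if where(*ts):
            output.append(tuple(e(*ts) for e in select_exprs))
    return output

# Hypothetical data.
student = [{"sid": 1, "name": "Ann"}, {"sid": 2, "name": "Bob"}]
enroll = [
    {"sid": 1, "course": "CS405"},
    {"sid": 2, "course": "CS315"},
    {"sid": 1, "course": "CS315"},
]

# SELECT s.name, e.course FROM student s, enroll e WHERE s.sid = e.sid;
rows = sfw(
    [lambda s, e: s["name"], lambda s, e: e["course"]],
    [student, enroll],
    lambda s, e: s["sid"] == e["sid"],
)
```

The output is a list rather than a set, so duplicate rows are kept, as SQL does when DISTINCT is not specified.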

23 Operational semantics of GROUP BY
SELECT … FROM … WHERE … GROUP BY …;
Compute FROM
Compute WHERE
Compute GROUP BY: group rows according to the values of GROUP BY columns
Compute SELECT for each group
  For aggregation functions with DISTINCT inputs, first eliminate duplicates within the group
Number of groups = number of rows in the final output
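
The four steps can be traced in a few lines of Python. The sketch below handles a single aggregate, COUNT, with an optional DISTINCT; the function name `group_by_count` and the sample rows are assumptions made for illustration.

```python
def group_by_count(rows, where, group_cols, count_col, distinct=False):
    """SELECT group_cols, COUNT([DISTINCT] count_col)
       FROM rows WHERE where GROUP BY group_cols"""
    # FROM is the input list; WHERE filters individual rows.
    kept = [r for r in rows if where(r)]

    # GROUP BY: bucket rows by their values on the grouping columns.
    groups = {}
    for r in kept:
        key = tuple(r[c] for c in group_cols)
        groups.setdefault(key, []).append(r)

    # SELECT: produce exactly one output row per group.
    out = []
    for key, members in groups.items():
        # COUNT(col) skips NULLs, modeled here as None.
        values = [m[count_col] for m in members if m[count_col] is not None]
        if distinct:
            values = set(values)  # eliminate duplicates within the group first
        out.append(key + (len(values),))
    return out

# Hypothetical data.
enroll = [
    {"dept": "CS", "sid": 1, "grade": "A"},
    {"dept": "CS", "sid": 1, "grade": "B"},
    {"dept": "CS", "sid": 2, "grade": "A"},
    {"dept": "EE", "sid": 3, "grade": "C"},
]

all_rows = group_by_count(enroll, lambda r: True, ["dept"], "sid")
distinct_rows = group_by_count(enroll, lambda r: True, ["dept"], "sid", distinct=True)
no_c = group_by_count(enroll, lambda r: r["grade"] != "C", ["dept"], "sid")
```

In `no_c` the WHERE step removes the only EE row before grouping, so no EE group is formed and the result has a single row.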

24 Database Design

25 A DBMS Overview

26 Physical data organization
Storage hierarchy (Lexington vs. Pluto) -> count I/O's
Disk geometry: three components of access cost; random vs. sequential I/O
Data layout
  Record layout (handling variable-length fields, NULL's)
  Block layout (NSM, PAX) -> inter-/intra-record locality
Access paths
  Primary versus secondary indexes
  Tree-based indexes: ISAM, B+-tree
  -> Again, reintroduce redundancy to improve performance
  -> Fundamental trade-off: query versus update cost

