Presentation is loading. Please wait.

Presentation is loading. Please wait.

BACKGROUND OVERVIEW OF RELATIONAL DBS. BASICS Highly structured Schema based - we can leverage this to address volume Semantics SQL App Middleware Users.

Similar presentations


Presentation on theme: "BACKGROUND OVERVIEW OF RELATIONAL DBS. BASICS Highly structured Schema based - we can leverage this to address volume Semantics SQL App Middleware Users."— Presentation transcript:

1 BACKGROUND OVERVIEW OF RELATIONAL DBS

2 BASICS Highly structured Schema based - we can leverage this to address volume Semantics SQL App Middleware Users Structure Tables/relations, rows/tuples, columns/attributes User defined data types PKs and FKs Null or not null Triggers as a catch-all integrity constraint Normalization for formal table minimization Uses Bank checks Insurance claims Credit card payments 2

3 NOSQL Cluster based broad distribution Semi structured More flexible access of data Hierarchical Similar structure 3

4 RELATIONAL DBS: FORMALLY UNDERSTOOD Set theoretic Originally defined with an algebra, with Selection, Projection, Join, and Union/Difference/Intersection Declarative calculus that is based on the algebra and supports large grained queries Clean implementation spec Unambiguous optimization - with its own algebra of query parse tree transformations 4

5 SEMANTICS ARE IN QUERIES Relational algebra compliant Queries written in declarative calculus Set-oriented But at least Programmers tend to follow PK/FK pairs, and infer semantics from attribute names and associations in tuples Query results are legal tables (Views) 5

6 ALSO WE GET (GOOD AND BAD) Fixed size tuples for easy row-optimization 2P transactions Table, Row distribution Two language based, with lowest common denominator semantics Security Checkpointing Powerful query optimizers 6

7 OBJECT-RELATIONAL DBS This runs somewhat counter to NoSQL trends - we make the data types even more complex We make domains out of type constructors Object IDs A row can be a tuple - or an object, with an object ID and a tuple, making all relational DBs also O-R 7

8 OBJECT-ORIENTED DBS No tuple rows Blend SQL and the app language This avoids lowest common denominator semantics These bombed, as relational DBs were not O-O And they are tough to optimize 8

9 THE RELATIONAL ALGEBRA AND CALCULUS: THE HEART OF RELATIONAL DBS … SQL

10 Selection and projection are unary ops Join is binary Selection is based on a formula and returns a table that contains all tuples from a given table where the formula is valid Projection returns a table consisting of a subset of attributes from a given table, with dupes removed Join creates tuples with attributes from two given tables, where a specific attribute in one matches a specific attribute in another (often a PK, FK pair) 10

11 ALGEBRAIC CLOSURE Any relational algebra operation returns a legal derived table The set operators are also part of the algebra From a formal perspective, the join operator is not a minimal operator, and is therefore represented as a cross product followed by a selection (where the PK equals the FK) Note that joins are symmetric 11

12 JOINS CAN BE GENERALIZED Complex join conditions Non-equi joins A “natural” join is based on matching all attributes with equal names in both tables “Outer” join creates null-packed tuples when tuples on the left do not match any on the right; there is also a right outer join 12

13 THE CALCULUS It is a tuple calculus, not a domain calculus SQL is equivalent Select From Where The part after the Where is declarative A tuple calculus (SQL) Notice that the variables are indeed tuples Note that set operators often act on tables that are being created in the query 13

14 SQL CREATE PROCEDURE test() BEGIN DECLARE sql_error TINYINT DEFAULT FALSE; DECLARE CONTINUE HANDLER FOR SQLEXCEPTION SET sql_error = TRUE; START TRANSACTION; INSERT INTO invoices VALUES (115, 34, 'ZXA-080', '2012-01-18', 14092.59, 0, 0, 3, '2012-04-18', NULL); INSERT INTO invoice_line_items VALUES (115, 1, 160, 4447.23, 'HW upgrade'); INSERT INTO invoice_line_items VALUES (115, 2, 167, 9645.36, 'OS upgrade'); IF sql_error = FALSE THEN COMMIT; SELECT 'The transaction was committed.'; ELSE ROLLBACK; SELECT 'The transaction was rolled back.'; END IF; END//

15 MORE IN operator is “element of” EXISTS Nesting FOR ALL FOR SOME Putting computations in the SELECT clause COUNT, SUM, AVG, MAX, MIN operators 15

16 STORED PROGRAMS Stored procedures (can be called by an application) Stored functions (can be called by an SQL program) Triggers (tied to an operation like INSERT) Events (tied to a clock)

17 VARIABLES DECLARE statement SET statement DEFAULT statement INTO (from a SELECT clause)

18 EXAMPLE… CREATE PROCEDURE test() BEGIN DECLARE max_invoice_total DECIMAL(9,2); DECLARE min_invoice_total DECIMAL(9,2); DECLARE percent_difference DECIMAL(9,4); DECLARE count_invoice_id INT; DECLARE vendor_id_var INT; SET vendor_id_var = 95; SELECT MAX(invoice_total), MIN(invoice_total), COUNT(invoice_id) INTO max_invoice_total, min_invoice_total, count_invoice_id FROM invoices WHERE vendor_id = vendor_id_var;

19 EXAMPLE, CONTINUED SET percent_difference = (max_invoice_total - min_invoice_total) / min_invoice_total * 100; SELECT CONCAT('$', max_invoice_total) AS 'Maximum invoice', CONCAT('$', min_invoice_total) AS 'Minimum invoice', CONCAT('%', ROUND(percent_difference, 2)) AS 'Percent difference', count_invoice_id AS 'Number of invoices'; END//

20 DOMAIN TYPES – CHAPTER 8 Character Integers Reals Date Time Large object, BLOB and CLOB 2D vector spatial types Enumerated

21 THE ACID PROPERTIES, NORMALIZATION, AND DATABASE DESIGN

22 ACID TRANSACTIONS Atomic: Either all of a transaction or None of it affects the database Consistent: When a transaction ends, the database obeys all constraints Isolated: Two running transactions cannot pass values to each other, via the database or other data store Durable: Once a transaction has “committed”, its updates are permanent

23 ATOMICITY Use a local log to store a transaction’s partial result If a transaction does something illegal, toss out the log

24 CONSISTENT Check constraints in phase 1 Some are immediate, like domains Others don’t have to be true until the commit point, like FKs

25 ISOLATED Transactions commit in a linear order Serializability is enforced Results become available only after atomic commit point

26 DURABLE Database has one state and it is in nonvolatile storage Keep checkpoints and transaction logs

27 DEADLOCK Loops of transactions wait on each other Detection: use time-outs Prevention: use “waits for” graph

28 STORED PROGRAMS Stored procedures (can be called by an application) Stored functions (can be called by an SQL program) Triggers (tied to an operation like INSERT) Events (tied to a clock)

29 VARIABLES DECLARE statement SET statement DEFAULT statement INTO (from a SELECT clause)

30 ANOTHER VIEW OF TRANSACTIONS Prevents Lost updates from one of two transactions Dirty reads when a transaction reads an uncommitted value Nonrepeatable reads in one transaction because the value gets updated in between Phantom reads are when a subset of updated rows are simultaneously updated by another transaction

31 CONTINUED… Options Serializable isolates transactions completely and is the highest level of protection Read uncommitted lets our four problems occur – no locks Read committed prevents dirty reads Repeatable read is the default and it means that a transaction will always read a given value the same because the values are locked

32 DEADLOCK Detect by closing transactions that have been open a long time Use the lowest acceptable locking level Try to do heavy update transactions when database can be completely reserved

33 STORED PROGRAMS Stored procedures (can be called by an application) Stored functions (can be called by an SQL program) Triggers (tied to an operation like INSERT) Events (tied to a clock)

34 THE DB DESIGN PROCESS Start with an entity model Map to tables Create PKs and FKs Create other constraints Normalize tables

35 OUR FOCUS: NORMALIZATION Goals Minimize redundant data Minimize “update anomalies”

36 FUNCTIONAL AND MULTIVALUED DEPENDENCIES FD  We say that ai FD-> aj Or “ai functionally determines aj” MVD-> We say that ai MVD-> aj Or “ai multivalued determines aj” Note: the right side of an FD or an MVD can be a set of attributes

37 FIRST 3 NORMAL FORMS First (1NF)The value stored at the intersection of each row and column must be a scalar value, and a table must not contain any repeating columns. Second (2NF)Every non-key column must depend on the entire primary key. Third (3NF)Every non-key column must depend only on the primary key.

38 NF3 FIXED AND NF4 Boyce-Codd (BCNF)A non-key column can’t be dependent on another non-key column. Fourth (4NF)A table must not have more than one multivalued dependency, where the primary key has a one-to-many relationship to non-key columns.

39 EXAMPLE: 1NF

40 EXAMPLE: 2NF

41 EXAMPLE: 2NF, CONTINUED

42 3NF: REMOVE TRANSITIVE DEPENDENCIES Customer ID Address ZIP 18 112 First 80304 17 123 Ash 80303 16 123 Ash 80303

43 3NF, CONTINUED Break into two tables: Customer ID Address Address Zip

44 4NF: SEPARATE PAIRS OF MVDS Mothers_Phone Fathers_Phone Child_Name Break into: Mothers_Phone Child_Name 3030000000 Sue 3031111111 Sue And Fathers_Phone Child_Name 3032222222 Sue 3033333333 Sue Note: both fields needed for PK

45 TRADEOFFS “Decomposition” makes it harder to misunderstand the database schema But Decomposition create narrow tables that might not correspond to forms in the real world And Decomposition leads to extra joins One solution is to pre-join data


Download ppt "BACKGROUND OVERVIEW OF RELATIONAL DBS. BASICS Highly structured Schema based - we can leverage this to address volume Semantics SQL App Middleware Users."

Similar presentations


Ads by Google