INFS 6220 Systems Analysis & Design Transactional DBs vs. Data Warehouses.


1 INFS 6220 Systems Analysis & Design Transactional DBs vs. Data Warehouses

2 Relational Databases (RDBMS)
- Collection of linked tables
- Tables linked by Primary Key / Foreign Key relationships (Referential Integrity)
- Primary Key: column whose values make each record unique
- Foreign Key: value in a column that links to the Primary Key in another table
- SQL: Structured Query Language (language to access data in relational tables)

3 Relational DB Example

CUSTOMER TABLE (Cust # is the Primary Key)
Cust #   Cust Name
100      Bob
101      Sue
102      Juan

ORDER TABLE (Cust # is the Foreign Key)
Order #   Prod #   Qty   Cust #
1         QR22     1     100
2         QR22     25    100
3         SB56     3     102
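The example above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the snake_case table and column names are assumptions, and the quantities follow one reading of the flattened slide data. It shows a join across the key relationship and a rejected insert that would break referential integrity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY,
        prod_no  TEXT NOT NULL,
        qty      INTEGER NOT NULL,
        cust_no  INTEGER NOT NULL REFERENCES customer(cust_no)  -- foreign key
    )
    """
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(100, "Bob"), (101, "Sue"), (102, "Juan")],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, "QR22", 1, 100), (2, "QR22", 25, 100), (3, "SB56", 3, 102)],
)

# Follow the foreign key from each order to its customer.
rows = conn.execute(
    """
    SELECT o.order_no, c.cust_name, o.qty
    FROM orders AS o
    JOIN customer AS c ON c.cust_no = o.cust_no
    ORDER BY o.order_no
    """
).fetchall()

# An order for a customer that does not exist violates referential integrity.
try:
    conn.execute("INSERT INTO orders VALUES (4, 'QR22', 1, 999)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Here `rows` pairs every order with its customer's name, and `fk_rejected` records that the database refused the orphaned order.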

4 Database Structure & Design
2 Approaches:
1. Optimize for Data Capture (i.e., capturing transactions)
2. Optimize for Data Access (i.e., queries & reporting)
These two approaches conflict.

5 Approach #1: Optimize for Data Capture
To optimize for data storage, you must:
- Eliminate redundancy of data (or else wasted space & processing occurs)
- Ensure data integrity (or else data anomalies occur)
- Ensure that changes in data (modifications, deletions, etc.) only have to happen in one place
Normalization: the process by which a database is optimized for data storage
- All data "redundancy" is removed from the database
- Has multiple forms (0NF, 1NF, 2NF, 3NF, et al.)

6 Moving from 0NF to 1NF
Rule: Make a separate table for each set of related attributes, and give each table a primary key of unique values.

0NF: CUSTOMER TABLE
Cust #          CustName
100, 100, 101   Bob, Sue, Juan

1NF: CUSTOMER TABLE (Primary Key created with unique values)
Cust #   Cust Name
100      Bob
101      Sue
102      Juan

7 Moving from 1NF to 2NF
Rule: Eliminate any repeating values caused by a dependency on a "keyed" column (i.e., either Primary or Foreign).

1NF: TABLE X
Cust #   Cust Name   Order #
100      Bob         1
100      Bob         2
101      Sue         3
Callout on the repeated "100 Bob": dependency on the Primary Key

2NF: CUSTOMER TABLE
Cust #   Cust Name
100      Bob
101      Sue

2NF: ORDER TABLE
Order #   Cust #
1         100
2         100
3         101
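A minimal sketch of the split from TABLE X into the two 2NF tables, assuming Python's built-in sqlite3 module and illustrative snake_case names that are not from the slides. It also shows the payoff: once the name lives in one place, a name change touches a single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# TABLE X from the slide: the customer name repeats on every order row.
conn.execute("CREATE TABLE table_x (cust_no INTEGER, cust_name TEXT, order_no INTEGER)")
conn.executemany(
    "INSERT INTO table_x VALUES (?, ?, ?)",
    [(100, "Bob", 1), (100, "Bob", 2), (101, "Sue", 3)],
)

# Move the attributes that depend on the customer key into their own table.
conn.execute("CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT)")
conn.execute("INSERT INTO customer SELECT DISTINCT cust_no, cust_name FROM table_x")

# The order table keeps only the foreign key back to the customer.
conn.execute(
    "CREATE TABLE orders ("
    "order_no INTEGER PRIMARY KEY, "
    "cust_no INTEGER REFERENCES customer(cust_no))"
)
conn.execute("INSERT INTO orders SELECT order_no, cust_no FROM table_x")

customers = conn.execute(
    "SELECT cust_no, cust_name FROM customer ORDER BY cust_no"
).fetchall()
orders = conn.execute(
    "SELECT order_no, cust_no FROM orders ORDER BY order_no"
).fetchall()

# In TABLE X, customer 100 appears on two rows; after the split, a rename
# updates exactly one row.
rows_for_100_in_x = conn.execute(
    "SELECT COUNT(*) FROM table_x WHERE cust_no = 100"
).fetchone()[0]
rows_changed = conn.execute(
    "UPDATE customer SET cust_name = 'Robert' WHERE cust_no = 100"
).rowcount
```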

8 Moving from 2NF to 3NF
Rule: Eliminate any repeating values caused by a dependency on a "non-keyed" column (i.e., dependency on ANY column).

2NF: TABLE X
Cust #   City   Order #   ShipTime
100      PGH    1         2 days
101      PGH    2         2 days
102      LA     3         5 days
Callout on the repeated "PGH 2 days": dependency between 2 non-key columns

3NF: SHIP TIME TABLE
City #   City   ShipTime
10       PGH    2 days
20       LA     5 days

3NF: CUSTOMER TABLE
Cust #   City #
100      10
101      10
102      20
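The 2NF to 3NF step above can be sketched in plain Python. The variable names are illustrative, and the rule for numbering cities (10, 20, ...) is an assumption chosen only to reproduce the City # values on the slide. The sketch builds just the two tables shown and checks that ShipTime really depends on City before splitting.

```python
# TABLE X rows: (cust_no, city, order_no, ship_time)
table_x = [
    (100, "PGH", 1, "2 days"),
    (101, "PGH", 2, "2 days"),
    (102, "LA", 3, "5 days"),
]

# ShipTime depends on City (a non-key column), so it moves to its own table.
# Each distinct city gets a surrogate key: 10, 20, ... (assumed numbering).
city_lookup = {}  # city -> (city_no, ship_time)
for _cust_no, city, _order_no, ship_time in table_x:
    if city not in city_lookup:
        city_lookup[city] = (10 * (len(city_lookup) + 1), ship_time)
    elif city_lookup[city][1] != ship_time:
        # The split is only valid if each city has exactly one ship time.
        raise ValueError(f"ship time is not determined by city for {city!r}")

ship_time_table = [
    (city_no, city, ship_time) for city, (city_no, ship_time) in city_lookup.items()
]
customer_table = [
    (cust_no, city_lookup[city][0]) for cust_no, city, _order_no, _ship in table_x
]

# Joining the two tables back together recovers each customer's ship time.
ship_by_city_no = {city_no: ship for city_no, _city, ship in ship_time_table}
ship_time_per_customer = {
    cust_no: ship_by_city_no[city_no] for cust_no, city_no in customer_table
}
```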

9 Normalized DB Example

10 Approach #2: Optimize for Data Access (in a separate, read-only Data Warehouse)
To optimize for data access, you must:
- Allow data redundancy
- Reduce the number of table joins (links among tables)
Denormalizing: adding redundancy & reducing joins in a database
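As a rough illustration of that trade-off, the sketch below copies the customer name into a wide, read-only order table so a report needs no join. It assumes Python's built-in sqlite3 module; the table name `orders_wide`, the column names, and the sample quantities are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY, prod_no TEXT, qty INTEGER, cust_no INTEGER
    );
    INSERT INTO customer VALUES (100, 'Bob'), (101, 'Sue'), (102, 'Juan');
    INSERT INTO orders VALUES
        (1, 'QR22', 1, 100), (2, 'QR22', 25, 100), (3, 'SB56', 3, 102);

    -- Denormalized copy: the customer name is stored redundantly on each order.
    CREATE TABLE orders_wide AS
        SELECT o.order_no, o.prod_no, o.qty, o.cust_no, c.cust_name
        FROM orders AS o
        JOIN customer AS c ON c.cust_no = o.cust_no;
    """
)

# Normalized tables: the report needs a join.
with_join = conn.execute(
    """
    SELECT c.cust_name, SUM(o.qty)
    FROM orders AS o
    JOIN customer AS c ON c.cust_no = o.cust_no
    GROUP BY c.cust_name
    ORDER BY c.cust_name
    """
).fetchall()

# Denormalized table: the same report reads a single table.
without_join = conn.execute(
    "SELECT cust_name, SUM(qty) FROM orders_wide GROUP BY cust_name ORDER BY cust_name"
).fetchall()
```

Both queries return the same totals; the second simply avoids the join at the cost of storing each customer name more than once.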

11 Denormalizing: Most Common Approach
Star Schema (Clustering)
- Fact (core or transaction) Tables in the middle of the star
- Dimensional (structural or "lookup") Tables around the "points" of the star

ORDER TABLE
Order #   Date       Cust #   Prod #   Rep #
1         06/15/XX   100      QR22     1000
2         07/19/XX   100      QR22     1000
3         08/30/XX   101      SR56     2000

CUSTOMER TABLE
Cust #   CustName
100      Bob
101      Sue
102      Juan

PRODUCT TABLE
Prod #   ProdName
QR22     Rake
SR56     Spade
TW43     Mulch

REP TABLE
Rep #   RepName
1000    Lee
2000    James
3000    Natasha

DATE/TIME
Date       Quarter
06/29/XX   2
06/30/XX   2
07/01/XX   3
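A small sketch of querying this star, assuming Python's built-in sqlite3 module and illustrative snake_case names. Every join goes straight from the order table in the middle to one lookup table on a point of the star. The DATE/TIME table is left out here because its sample dates do not match the order dates on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension ("lookup") tables on the points of the star.
    CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE product  (prod_no TEXT PRIMARY KEY, prod_name TEXT);
    CREATE TABLE rep      (rep_no INTEGER PRIMARY KEY, rep_name TEXT);

    -- Fact table in the middle: one row per order, one key per dimension.
    CREATE TABLE orders (
        order_no   INTEGER PRIMARY KEY,
        order_date TEXT,
        cust_no    INTEGER REFERENCES customer(cust_no),
        prod_no    TEXT    REFERENCES product(prod_no),
        rep_no     INTEGER REFERENCES rep(rep_no)
    );

    INSERT INTO customer VALUES (100, 'Bob'), (101, 'Sue'), (102, 'Juan');
    INSERT INTO product  VALUES ('QR22', 'Rake'), ('SR56', 'Spade'), ('TW43', 'Mulch');
    INSERT INTO rep      VALUES (1000, 'Lee'), (2000, 'James'), (3000, 'Natasha');
    INSERT INTO orders VALUES
        (1, '06/15/XX', 100, 'QR22', 1000),
        (2, '07/19/XX', 100, 'QR22', 1000),
        (3, '08/30/XX', 101, 'SR56', 2000);
    """
)

# Each dimension is exactly one join away from the fact table.
report = conn.execute(
    """
    SELECT o.order_no, o.order_date, c.cust_name, p.prod_name, r.rep_name
    FROM orders AS o
    JOIN customer AS c ON c.cust_no = o.cust_no
    JOIN product  AS p ON p.prod_no = o.prod_no
    JOIN rep      AS r ON r.rep_no  = o.rep_no
    ORDER BY o.order_no
    """
).fetchall()

# A typical reporting query: order count per sales rep.
orders_per_rep = conn.execute(
    """
    SELECT r.rep_name, COUNT(*)
    FROM orders AS o
    JOIN rep AS r ON r.rep_no = o.rep_no
    GROUP BY r.rep_name
    ORDER BY r.rep_name
    """
).fetchall()
```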

12 Denormalizing (continued)
Stars are linked via common (i.e., Conformed) Dimensions to form the Data Warehouse.

INVENTORY TABLE
Prod #   ProdName   Stock Date   Units
QR22     Rake       03/23/XX     150
TW43     Mulch      04/15/XX     1452
SR56     Spade      05/01/XX     997

ORDER TABLE
Order #   Date       Cust #   Prod #   Rep #
1         06/15/XX   100      QR22     1000
2         07/19/XX   100      QR22     1000
3         08/30/XX   101      SR56     2000

CUSTOMER TABLE
Cust #   CustName
100      Bob
101      Sue
102      Juan

PRODUCT TABLE
Prod #   ProdName
QR22     Rake
SR56     Spade
TW43     Mulch

REP TABLE
Rep #   RepName
1000    Lee
2000    James
3000    Natasha

DATE/TIME
Date       Quarter
06/29/XX   2
06/30/XX   2
07/01/XX   3
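One way to picture a conformed dimension is a query that summarizes two fact tables separately and lines the results up on the dimension they share. The sketch below assumes Python's built-in sqlite3 module, treats the product table as the shared dimension, and uses illustrative snake_case names; none of these choices are stated on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Shared (conformed) dimension used by both stars.
    CREATE TABLE product (prod_no TEXT PRIMARY KEY, prod_name TEXT);

    -- Two fact tables, each the middle of its own star.
    CREATE TABLE orders (
        order_no INTEGER PRIMARY KEY, order_date TEXT,
        cust_no INTEGER, prod_no TEXT REFERENCES product(prod_no), rep_no INTEGER
    );
    CREATE TABLE inventory (
        prod_no TEXT REFERENCES product(prod_no),
        prod_name TEXT, stock_date TEXT, units INTEGER
    );

    INSERT INTO product VALUES ('QR22', 'Rake'), ('SR56', 'Spade'), ('TW43', 'Mulch');
    INSERT INTO orders VALUES
        (1, '06/15/XX', 100, 'QR22', 1000),
        (2, '07/19/XX', 100, 'QR22', 1000),
        (3, '08/30/XX', 101, 'SR56', 2000);
    INSERT INTO inventory VALUES
        ('QR22', 'Rake',  '03/23/XX', 150),
        ('TW43', 'Mulch', '04/15/XX', 1452),
        ('SR56', 'Spade', '05/01/XX', 997);
    """
)

# Aggregate each fact table on its own, then join both results to the
# shared product dimension so the two stars can be compared side by side.
orders_vs_stock = conn.execute(
    """
    SELECT p.prod_name,
           COALESCE(o.order_count, 0) AS order_count,
           COALESCE(i.units, 0)       AS units_in_stock
    FROM product AS p
    LEFT JOIN (SELECT prod_no, COUNT(*) AS order_count
               FROM orders GROUP BY prod_no) AS o ON o.prod_no = p.prod_no
    LEFT JOIN (SELECT prod_no, SUM(units) AS units
               FROM inventory GROUP BY prod_no) AS i ON i.prod_no = p.prod_no
    ORDER BY p.prod_no
    """
).fetchall()
```

Aggregating before joining keeps the two fact tables from multiplying each other's rows, which is why each subquery is grouped by product first.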

