Published by Rosamund Poole. Modified over 9 years ago.
1
INFS 6220 Systems Analysis & Design
Transactional DBs vs. Data Warehouses
2
Relational Databases (RDBMS)
- Collection of linked tables
- Tables are linked by Primary Key / Foreign Key relationships (Referential Integrity)
- Primary Key – column whose values make each record unique
- Foreign Key – column whose values link to the Primary Key in another table
- SQL – Structured Query Language (language used to access data in relational tables)
3
Relational DB Example

CUSTOMER TABLE
Cust #   Cust Name
100      Bob
101      Sue
102      Juan

ORDER TABLE
Order #  Prod #  Qty  Cust #
1        QR22    1    100
2        QR22    25   100
3        SB56    3    102

Cust # is the Primary Key of the CUSTOMER TABLE and a Foreign Key in the ORDER TABLE.
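The two tables above can be sketched in SQL, run here through Python's built-in sqlite3 module. This is a minimal illustration rather than anything from the original deck: the table and column names (customer, orders, cust_no, and so on) are assumptions, and the quantities are taken as read from the example rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    cust_no   INTEGER PRIMARY KEY,   -- Primary Key
    cust_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_no INTEGER PRIMARY KEY,
    prod_no  TEXT NOT NULL,
    qty      INTEGER NOT NULL,
    cust_no  INTEGER NOT NULL REFERENCES customer(cust_no)  -- Foreign Key
);
INSERT INTO customer VALUES (100, 'Bob'), (101, 'Sue'), (102, 'Juan');
INSERT INTO orders VALUES (1, 'QR22', 1, 100), (2, 'QR22', 25, 100), (3, 'SB56', 3, 102);
""")

# Follow the Foreign Key back to the customer's name.
rows = conn.execute("""
    SELECT o.order_no, c.cust_name, o.prod_no, o.qty
    FROM orders o JOIN customer c ON c.cust_no = o.cust_no
    ORDER BY o.order_no
""").fetchall()
print(rows)

# Referential integrity: an order for a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (4, 'QR22', 1, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The failed insert is the practical meaning of Referential Integrity: every Cust # in the ORDER TABLE must match a row in the CUSTOMER TABLE.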
4
Database Structure & Design
2 Approaches:
1. Optimize for Data Capture, i.e., capturing transactions
2. Optimize for Data Access, i.e., queries & reporting
The two approaches conflict.
5
Approach #1: Optimize for Data Capture
To optimize for data capture and storage, you must:
- Eliminate redundancy of data (or else space and processing are wasted)
- Ensure data integrity (or else data anomalies occur)
- Ensure that changes in data (modifications, deletions, etc.) only have to happen in one place
Normalization – process in which a database design is optimized for data storage
- All data “redundancy” is removed from the database
- Has multiple forms (0NF, 1NF, 2NF, 3NF, et al.)
6
Moving from 0NF to 1NF
Rule: Make a separate table for each set of related attributes, and give each table a primary key of unique values.

0NF: CUSTOMER TABLE
Cust #          Cust Name
100, 100, 101   Bob, Sue, Juan

1NF: CUSTOMER TABLE
Cust #   Cust Name
100      Bob
101      Sue
102      Juan

Primary Key (Cust #) created with unique values.
7
Moving from 1NF to 2NF
Rule: Eliminate any repeating values caused by a dependency on a “keyed” column (i.e., either Primary or Foreign).

1NF: TABLE X
Cust #   Cust Name   Order #
100      Bob         1
100      Bob         2
101      Sue         3

2NF: CUSTOMER TABLE
Cust #   Cust Name
100      Bob
101      Sue

2NF: ORDER TABLE
Order #  Cust #
1        100
2        100
3        101

Dependency on Primary Key: Cust Name (Bob) depends on Cust # (100), so it is stored once in the CUSTOMER TABLE.
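The split from TABLE X into CUSTOMER and ORDER tables can be sketched as follows, again with Python's sqlite3. The names table_x, customer, orders and their columns are illustrative assumptions; the rows are the ones in the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 1NF: the customer name repeats on every order row.
CREATE TABLE table_x (cust_no INTEGER, cust_name TEXT, order_no INTEGER PRIMARY KEY);
INSERT INTO table_x VALUES (100, 'Bob', 1), (100, 'Bob', 2), (101, 'Sue', 3);

-- 2NF: the name depends only on cust_no, so it moves to its own table.
CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT NOT NULL);
INSERT INTO customer SELECT DISTINCT cust_no, cust_name FROM table_x;

-- Orders keep only the key that links back to the customer.
CREATE TABLE orders (
    order_no INTEGER PRIMARY KEY,
    cust_no  INTEGER NOT NULL REFERENCES customer(cust_no)
);
INSERT INTO orders SELECT order_no, cust_no FROM table_x;
""")

customers = conn.execute("SELECT cust_no, cust_name FROM customer ORDER BY cust_no").fetchall()
orders = conn.execute("SELECT order_no, cust_no FROM orders ORDER BY order_no").fetchall()
print(customers)  # [(100, 'Bob'), (101, 'Sue')]
print(orders)     # [(1, 100), (2, 100), (3, 101)]
```

After the split, 'Bob' is stored once instead of once per order.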
8
Moving from 2NF to 3NF
Rule: Eliminate any repeating values caused by a dependency on a “non-keyed” column (i.e., dependency on ANY column).

2NF: TABLE X
Cust #   City   Order #   Ship Time
100      PGH    1         2 days
101      PGH    2         2 days
102      LA     3         5 days

3NF: SHIP TIME TABLE
City #   City   Ship Time
10       PGH    2 days
20       LA     5 days

3NF: CUSTOMER TABLE
Cust #   City #
100      10
101      10
102      20

Dependency between 2 non-key columns: Ship Time (2 days) depends on City (PGH).
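One way to see the benefit of the 3NF layout is to change a ship time and check that a single update reaches every customer in that city. The sketch below uses Python's sqlite3; the table and column names are assumptions, ship time is stored as a number of days, and the change from 2 to 3 days is a made-up update for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ship_time (
    city_no   INTEGER PRIMARY KEY,
    city      TEXT NOT NULL,
    ship_days INTEGER NOT NULL
);
CREATE TABLE customer (
    cust_no INTEGER PRIMARY KEY,
    city_no INTEGER NOT NULL REFERENCES ship_time(city_no)
);
INSERT INTO ship_time VALUES (10, 'PGH', 2), (20, 'LA', 5);
INSERT INTO customer VALUES (100, 10), (101, 10), (102, 20);
""")

# Hypothetical change: shipping to PGH now takes 3 days. One row is updated.
conn.execute("UPDATE ship_time SET ship_days = 3 WHERE city = 'PGH'")

rows = conn.execute("""
    SELECT c.cust_no, s.city, s.ship_days
    FROM customer c JOIN ship_time s ON s.city_no = c.city_no
    ORDER BY c.cust_no
""").fetchall()
print(rows)  # [(100, 'PGH', 3), (101, 'PGH', 3), (102, 'LA', 5)]
```

In the 2NF version of TABLE X the same change would have to be applied to every PGH row, which is where update anomalies come from.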
9
Normalized DB Example
[Figure: normalized database diagram]
10
Approach #2: Optimize for Data Access (in a separate, read-only Data Warehouse)
To optimize for data access, you must:
- Allow data redundancy
- Reduce the number of table joins (links among tables)
Denormalizing – adding redundancy & reducing joins in a database
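As a rough sketch of what denormalizing means in practice, the snippet below copies the customer name into a wide, read-only order table so that a report needs no join. It uses Python's sqlite3 with the customer and order rows from the relational example earlier in the deck (quantities as read from that slide); the name order_wide and all column names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE orders (order_no INTEGER PRIMARY KEY, prod_no TEXT, qty INTEGER, cust_no INTEGER);
INSERT INTO customer VALUES (100, 'Bob'), (101, 'Sue'), (102, 'Juan');
INSERT INTO orders VALUES (1, 'QR22', 1, 100), (2, 'QR22', 25, 100), (3, 'SB56', 3, 102);

-- Denormalized copy: the join is paid once, when the warehouse table is loaded.
CREATE TABLE order_wide AS
SELECT o.order_no, o.prod_no, o.qty, o.cust_no, c.cust_name
FROM orders o JOIN customer c ON c.cust_no = o.cust_no;
""")

# The report reads a single table; cust_name is stored redundantly on each row.
report = conn.execute("""
    SELECT cust_name, SUM(qty) FROM order_wide
    GROUP BY cust_name ORDER BY cust_name
""").fetchall()
print(report)  # [('Bob', 26), ('Juan', 3)]
```

The trade-off is the one the slide names: 'Bob' now appears on two rows, which is acceptable only because the warehouse copy is read-only.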
11
Denormalizing – Most Common Approach
Star Schema (Clustering)
- Fact (core or transaction) Tables in middle of star
- Dimensional (structural or “lookup”) Tables around “points” of star

ORDER TABLE (fact)
Order #  Date       Cust #  Prod #  Rep #
1        06/15/XX   100     QR22    1000
2        07/19/XX   100     QR22    1000
3        08/30/XX   101     SR56    2000

CUSTOMER TABLE
Cust #   Cust Name
100      Bob
101      Sue
102      Juan

PRODUCT TABLE
Prod #   Prod Name
QR22     Rake
SR56     Spade
TW43     Mulch

REP TABLE
Rep #    Rep Name
1000     Lee
2000     James
3000     Natasha

DATE/TIME
Date       Quarter
06/29/XX   2
06/30/XX   2
07/01/XX   3
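A typical star-schema query starts at the fact table and joins outward to whichever dimensions the report needs. The sketch below, using Python's sqlite3, counts orders per rep and product from the example rows; table and column names are assumptions, and the DATE/TIME dimension is left out because its sample dates do not overlap the sample orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension ("lookup") tables at the points of the star.
CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE product  (prod_no TEXT PRIMARY KEY, prod_name TEXT);
CREATE TABLE rep      (rep_no INTEGER PRIMARY KEY, rep_name TEXT);

-- Fact table in the middle: one row per order, one key per dimension.
CREATE TABLE orders (
    order_no   INTEGER PRIMARY KEY,
    order_date TEXT,
    cust_no    INTEGER REFERENCES customer(cust_no),
    prod_no    TEXT    REFERENCES product(prod_no),
    rep_no     INTEGER REFERENCES rep(rep_no)
);

INSERT INTO customer VALUES (100, 'Bob'), (101, 'Sue'), (102, 'Juan');
INSERT INTO product VALUES ('QR22', 'Rake'), ('SR56', 'Spade'), ('TW43', 'Mulch');
INSERT INTO rep VALUES (1000, 'Lee'), (2000, 'James'), (3000, 'Natasha');
INSERT INTO orders VALUES
    (1, '06/15/XX', 100, 'QR22', 1000),
    (2, '07/19/XX', 100, 'QR22', 1000),
    (3, '08/30/XX', 101, 'SR56', 2000);
""")

# Each dimension is exactly one join away from the fact table.
report = conn.execute("""
    SELECT r.rep_name, p.prod_name, COUNT(*) AS n_orders
    FROM orders o
    JOIN rep r     ON r.rep_no = o.rep_no
    JOIN product p ON p.prod_no = o.prod_no
    GROUP BY r.rep_name, p.prod_name
    ORDER BY r.rep_name
""").fetchall()
print(report)  # [('James', 'Spade', 1), ('Lee', 'Rake', 2)]
```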
12
Denormalizing (continued)
Stars are linked via common (i.e., Conformed) Dimensions to form the Data Warehouse.

INVENTORY TABLE
Prod #   Prod Name   Stock Date   Units
QR22     Rake        03/23/XX     150
TW43     Mulch       04/15/XX     1452
SR56     Spade       05/01/XX     997

ORDER TABLE
Order #  Date       Cust #  Prod #  Rep #
1        06/15/XX   100     QR22    1000
2        07/19/XX   100     QR22    1000
3        08/30/XX   101     SR56    2000

CUSTOMER TABLE
Cust #   Cust Name
100      Bob
101      Sue
102      Juan

PRODUCT TABLE
Prod #   Prod Name
QR22     Rake
SR56     Spade
TW43     Mulch

REP TABLE
Rep #    Rep Name
1000     Lee
2000     James
3000     Natasha

DATE/TIME
Date       Quarter
06/29/XX   2
06/30/XX   2
07/01/XX   3
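A conformed dimension lets one report combine two stars. In the sketch below, written with Python's sqlite3, the ORDER and INVENTORY fact tables both key into the same PRODUCT table, so orders and stock can be lined up per product. Table and column names are assumptions, and as a simplification the inventory table here holds only Prod #, Stock Date and Units, with the product name read from the shared dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared (conformed) dimension used by both fact tables.
CREATE TABLE product (prod_no TEXT PRIMARY KEY, prod_name TEXT);
INSERT INTO product VALUES ('QR22', 'Rake'), ('SR56', 'Spade'), ('TW43', 'Mulch');

-- Fact table of the order star (only the columns needed here).
CREATE TABLE orders (order_no INTEGER PRIMARY KEY, prod_no TEXT REFERENCES product(prod_no));
INSERT INTO orders VALUES (1, 'QR22'), (2, 'QR22'), (3, 'SR56');

-- Fact table of the inventory star.
CREATE TABLE inventory (prod_no TEXT REFERENCES product(prod_no), stock_date TEXT, units INTEGER);
INSERT INTO inventory VALUES
    ('QR22', '03/23/XX', 150),
    ('TW43', '04/15/XX', 1452),
    ('SR56', '05/01/XX', 997);
""")

# Aggregate each fact table separately per product, so rows are not multiplied
# by joining the two fact tables to each other.
report = conn.execute("""
    SELECT p.prod_name,
           (SELECT COUNT(*)   FROM orders o    WHERE o.prod_no = p.prod_no) AS n_orders,
           (SELECT SUM(units) FROM inventory i WHERE i.prod_no = p.prod_no) AS units_in_stock
    FROM product p
    ORDER BY p.prod_no
""").fetchall()
print(report)  # [('Rake', 2, 150), ('Spade', 1, 997), ('Mulch', 0, 1452)]
```

Because both stars agree on Prod #, the report needs no translation between the two subject areas.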