1
INFS 3220 Systems Analysis & Design
Transactional DBs vs. Data Warehouses
2
Relational Databases (RDBMS)
• Collection of linked tables
• Tables linked by Primary Key / Foreign Key relationships (Referential Integrity)
• Primary Key – column whose values make each record unique
• Foreign Key – value in column that links to Primary Key in another table
• SQL – Structured Query Language (language to access data in relational tables)
3
Relational DB Example

CUSTOMER TABLE (Cust # = Primary Key)
Cust #   Cust Name
100      Bob
101      Sue
102      Juan

ORDER TABLE (Cust# = Foreign Key)
Order #   Prod#   Qty   Cust#
1         QR
2         QR
3         SB
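The primary key / foreign key link above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the course's own code: the table and column names (`customer`, `orders`, `cust_no`, `order_no`) are assumptions, and because the Cust# values of the order rows are not legible on this slide, the order-to-customer pairs are borrowed from the 2NF example later in the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE customer (
    cust_no   INTEGER PRIMARY KEY,   -- primary key: unique per customer
    cust_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_no INTEGER PRIMARY KEY,
    cust_no  INTEGER NOT NULL REFERENCES customer(cust_no)  -- foreign key
);
""")

conn.executemany("INSERT INTO customer VALUES (?, ?)",
                 [(100, "Bob"), (101, "Sue"), (102, "Juan")])
# Order-to-customer pairs taken from the 2NF slide (assumption for this slide).
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 100), (2, 100), (3, 101)])

# SQL follows the foreign key back to the customer row it points at.
rows = conn.execute("""
    SELECT o.order_no, c.cust_name
    FROM orders AS o
    JOIN customer AS c ON c.cust_no = o.cust_no
    ORDER BY o.order_no
""").fetchall()


def insert_is_rejected(order_no, cust_no):
    """Return True if the database refuses the order (referential integrity)."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_no, cust_no))
    except sqlite3.IntegrityError:
        return True
    return False


# Customer 999 does not exist, so this order would be an orphan.
orphan_rejected = insert_is_rejected(4, 999)
print(rows, orphan_rejected)
```

The rejected insert is what "referential integrity" means in practice: an order cannot point at a customer that is not in the customer table.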
4
Database Structure & Design
2 Approaches (in conflict):
1. Optimize for Data Capture (i.e., capturing transactions)
2. Optimize for Data Access (i.e., queries & reporting)
5
Approach #1: Optimize for Data Capture
To optimize for data storage, you must:
• Eliminate redundancy of data (or else wasted space & processing occurs)
• Ensure data integrity (or else data anomalies occur)
• Ensure that changes in data (modifications, deletions, etc.) only have to happen in one place
Normalization – process in which a DBMS is optimized for data storage
• All data “redundancy” is removed from the database
• Has multiple forms (0, 1st, 2nd, 3rd, et al.)
6
Moving from 0NF to 1NF
Rule: Make a separate table for each set of related attributes, and give each table a primary key of unique values.

CUSTOMER TABLE (0NF)
Cust #          CustName
100, 100, 101   Bob, Sue, Juan

CUSTOMER TABLE (1NF) – Primary Key created with unique values
Cust #   Cust Name
100      Bob
101      Sue
102      Juan
7
Moving from 1NF to 2NF
Rule: Eliminate any repeating values caused by a dependency on a “keyed” column (i.e., either Primary or Foreign)

TABLE X (1NF)
Cust #   Cust Name   Order#
100      Bob         1
100      Bob         2
101      Sue         3
Repeating values “100 Bob”: dependency on Primary Key

CUSTOMER TABLE (2NF)
Cust #   Cust Name
100      Bob
101      Sue

ORDER TABLE (2NF)
Order #   Cust#
1         100
2         100
3         101
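The split above can be tried out with Python's built-in sqlite3 module. This is a rough sketch under stated assumptions: the table and column names (`table_x`, `customer`, `orders`, `cust_no`, ...) and the new name 'Robert' are illustrative and do not come from the slides. It checks that joining the two 2NF tables gives back the original rows, and that renaming a customer touches one row instead of two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- TABLE X from the slide: the customer name repeats on every order
CREATE TABLE table_x (cust_no INTEGER, cust_name TEXT, order_no INTEGER PRIMARY KEY);
INSERT INTO table_x VALUES (100, 'Bob', 1), (100, 'Bob', 2), (101, 'Sue', 3);

-- 2NF split: customer facts in one table, order facts in another
CREATE TABLE customer AS SELECT DISTINCT cust_no, cust_name FROM table_x;
CREATE TABLE orders   AS SELECT order_no, cust_no FROM table_x;
""")

original = conn.execute(
    "SELECT cust_no, cust_name, order_no FROM table_x ORDER BY order_no"
).fetchall()

# Joining the two smaller tables recovers TABLE X, so nothing was lost.
rejoined = conn.execute("""
    SELECT c.cust_no, c.cust_name, o.order_no
    FROM orders AS o
    JOIN customer AS c ON c.cust_no = o.cust_no
    ORDER BY o.order_no
""").fetchall()

# Renaming customer 100: how many rows must change in each design?
rows_touched_1nf = conn.execute(
    "UPDATE table_x SET cust_name = 'Robert' WHERE cust_no = 100"
).rowcount
rows_touched_2nf = conn.execute(
    "UPDATE customer SET cust_name = 'Robert' WHERE cust_no = 100"
).rowcount

print(rejoined == original, rows_touched_1nf, rows_touched_2nf)
```

Note that `CREATE TABLE ... AS SELECT` copies data only; a real schema would also declare the primary and foreign keys on the new tables.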
8
Moving from 2NF to 3NF
Rule: Eliminate any repeating values caused by a dependency on a “non-keyed” column (i.e., dependency on ANY column)

TABLE X (2NF)
Cust #   City   Order#   ShipTime
100      PGH    1        2 days
101      PGH    2        2 days
102      LA     3        5 days
Repeating values “PGH 2 days”: dependency between 2 non-key columns

SHIP TIME TABLE (3NF)
City #   City   ShipTime
10       PGH    2 days
20       LA     5 days

CUSTOMER TABLE (3NF)
Cust #   City#
100      10
101      10
102      20
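One way to see the dependency between the two non-key columns is to test it on the sample rows. The sketch below is plain Python and only illustrative: the helper `determines` and the field names are hypothetical, and a dependency that holds in three sample rows is merely consistent with the rule City → ShipTime, not proof of it.

```python
# TABLE X (2NF) from the slide, one dict per row.
table_x = [
    {"cust_no": 100, "city": "PGH", "order_no": 1, "ship_time": "2 days"},
    {"cust_no": 101, "city": "PGH", "order_no": 2, "ship_time": "2 days"},
    {"cust_no": 102, "city": "LA", "order_no": 3, "ship_time": "5 days"},
]


def determines(rows, lhs, rhs):
    """Return True if every value of column lhs maps to exactly one rhs value."""
    seen = {}
    for row in rows:
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True


city_determines_ship_time = determines(table_x, "city", "ship_time")
city_determines_customer = determines(table_x, "city", "cust_no")

# City numbers 10 and 20 are the keys shown on the slide.
city_no = {"PGH": 10, "LA": 20}

# 3NF split: ship time is stored once per city; customers point at the city key.
ship_time_table = sorted(
    {(city_no[r["city"]], r["city"], r["ship_time"]) for r in table_x}
)
customer_table = [(r["cust_no"], city_no[r["city"]]) for r in table_x]

print(city_determines_ship_time, ship_time_table, customer_table)
```

Because ship time depends on the city rather than on the customer, moving it into its own table means a change to PGH's ship time is made in one row.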
9
Normalized DB Example
10
Approach #2: Optimize for Data Access (in a separate, read-only Data Warehouse)
To optimize for data access, you must:
• Allow data redundancy
• Reduce the number of table joins (links among tables)
Denormalizing – Adding redundancy & reducing joins in a DBMS
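The trade-off can be sketched with Python's built-in sqlite3 module, reusing the customer and order rows from the 2NF example. The reporting table name `order_report` and all column names are assumptions made for illustration. The same question (orders per customer) needs a join against the normalized tables but none against the denormalized copy, at the cost of repeating the customer name on every order row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized tables, as used for data capture
CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE orders (
    order_no INTEGER PRIMARY KEY,
    cust_no  INTEGER REFERENCES customer(cust_no)
);
INSERT INTO customer VALUES (100, 'Bob'), (101, 'Sue');
INSERT INTO orders VALUES (1, 100), (2, 100), (3, 101);

-- Denormalized, read-only copy for reporting: the join is paid once, up front
CREATE TABLE order_report AS
    SELECT o.order_no, o.cust_no, c.cust_name
    FROM orders AS o
    JOIN customer AS c ON c.cust_no = o.cust_no;
""")

# Report against the normalized tables: needs a join.
normalized = conn.execute("""
    SELECT c.cust_name, COUNT(*)
    FROM orders AS o
    JOIN customer AS c ON c.cust_no = o.cust_no
    GROUP BY c.cust_name
    ORDER BY c.cust_name
""").fetchall()

# Same report against the denormalized copy: no join at all.
denormalized = conn.execute("""
    SELECT cust_name, COUNT(*)
    FROM order_report
    GROUP BY cust_name
    ORDER BY cust_name
""").fetchall()

print(normalized, denormalized)
```

The redundancy is acceptable here because the copy is read-only; updates still happen in the normalized tables and the reporting copy is rebuilt from them.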
11
Denormalizing – Most Common Approach
Star Schema (Clustering)
• Fact (core or transaction) Tables in middle of star
• Dimensional (structural or “lookup”) Tables around “points” of star

ORDER TABLE (fact)
Order #   Date       Cust#   Prod#   Rep#
1         06/15/XX   100     QR
2         07/19/XX   100     QR
3         08/30/XX   101     SR

CUSTOMER TABLE (dimension)
Cust #   CustName
100      Bob
101      Sue
102      Juan

REP TABLE (dimension)
Rep #   RepName
1000    Lee
2000    James
3000    Natasha

PRODUCT TABLE (dimension)
Prod #   ProdName
QR22     Rake
SR56     Spade
TW43     Mulch

DATE/TIME (dimension)
Date       Quarter
06/29/XX   2
06/30/XX   2
07/01/XX   3
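A typical star-schema query joins the fact table in the middle to whichever dimension tables the report needs. The sketch below uses Python's built-in sqlite3 module with the order and customer rows from the slide. Several things are assumptions: all table and column names (`order_fact`, `customer_dim`, `date_dim`, ...), and the date-dimension rows, which are hypothetical entries for the three order dates using calendar quarters, since the dates shown in the slide's DATE/TIME table do not match the order dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension ("lookup") tables at the points of the star
CREATE TABLE customer_dim (cust_no INTEGER PRIMARY KEY, cust_name TEXT);
CREATE TABLE date_dim (date_key TEXT PRIMARY KEY, quarter INTEGER);

-- Fact table in the middle: one row per order, keys pointing at dimensions
CREATE TABLE order_fact (
    order_no INTEGER PRIMARY KEY,
    date_key TEXT REFERENCES date_dim(date_key),
    cust_no  INTEGER REFERENCES customer_dim(cust_no)
);

INSERT INTO customer_dim VALUES (100, 'Bob'), (101, 'Sue'), (102, 'Juan');
-- Hypothetical date rows for the order dates, assuming calendar quarters
INSERT INTO date_dim VALUES ('06/15/XX', 2), ('07/19/XX', 3), ('08/30/XX', 3);
INSERT INTO order_fact VALUES
    (1, '06/15/XX', 100), (2, '07/19/XX', 100), (3, '08/30/XX', 101);
""")

# Orders per quarter and customer: the fact table joins directly to each
# dimension it needs, never dimension-to-dimension.
orders_by_quarter = conn.execute("""
    SELECT d.quarter, c.cust_name, COUNT(*) AS order_count
    FROM order_fact AS f
    JOIN date_dim AS d     ON d.date_key = f.date_key
    JOIN customer_dim AS c ON c.cust_no  = f.cust_no
    GROUP BY d.quarter, c.cust_name
    ORDER BY d.quarter, c.cust_name
""").fetchall()

print(orders_by_quarter)
```

Every dimension is exactly one join away from the fact table, which is what keeps reporting queries short compared with a fully normalized design.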
12
Denormalizing (continued)
• Stars are linked via common (i.e., Conformed) Dimensions to form Data Warehouse

ORDER TABLE (fact)
Order #   Date       Cust#   Prod#   Rep#
1         06/15/XX   100     QR
2         07/19/XX   100     QR
3         08/30/XX   101     SR

INVENTORY TABLE (fact)
Prod#   ProdName   Stock Date   Units
QR22    Rake       03/23/XX
TW43    Mulch      04/15/XX     1452
SR56    Spade      05/01/XX

CUSTOMER TABLE (dimension)
Cust #   CustName
100      Bob
101      Sue
102      Juan

REP TABLE (dimension)
Rep #   RepName
1000    Lee
2000    James
3000    Natasha

PRODUCT TABLE (dimension)
Prod #   ProdName
QR22     Rake
SR56     Spade
TW43     Mulch

DATE/TIME (dimension)
Date       Quarter
06/29/XX   2
06/30/XX   2
07/01/XX   3
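A conformed dimension lets a report combine two stars by summarizing each fact table against the shared dimension and then lining up the results. The sketch below does this for the order and inventory stars through a shared date dimension, using Python's built-in sqlite3 module. It rests on several assumptions: the table, column, and function names are illustrative; the date-dimension rows are hypothetical entries for the order and stock dates using calendar quarters; and it counts rows rather than summing units because most Units values are not legible on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conformed dimension: one date table shared by both stars
CREATE TABLE date_dim (date_key TEXT PRIMARY KEY, quarter INTEGER);

-- Two fact tables, each the centre of its own star
CREATE TABLE order_fact (
    order_no INTEGER PRIMARY KEY,
    date_key TEXT REFERENCES date_dim(date_key)
);
CREATE TABLE inventory_fact (
    prod_no  TEXT,
    date_key TEXT REFERENCES date_dim(date_key)
);

-- Hypothetical date rows, assuming calendar quarters
INSERT INTO date_dim VALUES
    ('03/23/XX', 1), ('04/15/XX', 2), ('05/01/XX', 2),
    ('06/15/XX', 2), ('07/19/XX', 3), ('08/30/XX', 3);
INSERT INTO order_fact VALUES (1, '06/15/XX'), (2, '07/19/XX'), (3, '08/30/XX');
INSERT INTO inventory_fact VALUES
    ('QR22', '03/23/XX'), ('TW43', '04/15/XX'), ('SR56', '05/01/XX');
""")


def count_by_quarter(fact_table):
    """Count fact rows per quarter via the shared date dimension."""
    # fact_table is one of the two fixed names above, never user input.
    sql = (
        f"SELECT d.quarter, COUNT(*) FROM {fact_table} AS f "
        "JOIN date_dim AS d ON d.date_key = f.date_key "
        "GROUP BY d.quarter"
    )
    return dict(conn.execute(sql).fetchall())


orders_q = count_by_quarter("order_fact")
stock_q = count_by_quarter("inventory_fact")

# Line the two summaries up on the conformed dimension's quarter value.
report = [
    (q, orders_q.get(q, 0), stock_q.get(q, 0))
    for q in sorted(set(orders_q) | set(stock_q))
]
print(report)  # (quarter, order rows, inventory rows)
```

The two stars can be compared quarter by quarter only because both use the same date table with the same meaning of "quarter"; that shared definition is what makes the dimension conformed.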