Download presentation
Presentation is loading. Please wait.
Published byVirginia Willis Modified over 9 years ago
1
Chapter 1 Adamson & Venerable Spring 2012
2
Dimensional Modeling Dimensional Model Basics Fact & Dimension Tables Star Schema Granularity Facts and Measures Multiple Processes Aggregation
3
Transaction Systems They capture critical detailed info about business events: –A product is manufactured –An account is credited while another is debited –A seat is reserved Most businesses have a variety of OLTP systems An OLTP system looks at the micro-level transactions that drive the process OLTP’s normalized design ensures transaction consistency
4
Dimensional Modeling Dimensional Model Basics –Relational Model vs. Dimensional Model ERDs vs. Star Schema Normalization issues Reports & Queries in Relational Models can be complex: –Especially when seeking historical comparisons –Business Process Measurements (Not in RDBMS) Gross Margins Average Aged A/R balances Inventory levels compared to Sales by Warehouse Location ROI, etc.
5
Business Questions Questions managers ask about a business: –What are the best-selling products last week? –How often did we fulfill an order in more than two shipments? –What is the quarterly trend on orders by region? –Show me gross margin by product category –How does inventory level compare to sales by warehouse? Answering these questions requires a different design than in OLTP The dimensional model serves this purpose
6
Process Metrics Each question centers around a business process: –Orders, Account Management, Inventory Management, Accounts Receivable, Returns Each question reveals how a process is measured: –Gross Margin measures the sales process for distributors –Banks monitor balances as part of their account mgt. –Warehouse managers track inventory levels –Accountants quantify receivables For some processes there is a single metric Some others have multiple metrics
7
The Dimensional Model Reflects the way business people think Facts or Measures: The measurements that are of importance to a business Dimensions: The parameters by which a measurement can be viewed or referred to Any fact (measure) can be combined with any dimension EXAMPLE: An Order Entry Process
8
Fact & Dimension Tables Order Entry Process – Table 1.1 Facts (Measures)Dimensions Gross sales dollarsCustomer Name Total CostCustomer state, zip, etc. Margin dollarsDate of order Quantity SoldQuarter of order Month of order Year of order Salesperson name Sales Territory Corporate region Product name Product brand Product category
9
Dimensional Modeling Fact & Dimension Tables –Use Normalization approach to determine appropriate dimension tables –Functional dependencies Customer_key Name, address, city, state, zip Salesperson_key Name, code, territory, region Date_key date, day_week, day_of_month, etc. Order_key product_key, order_date_key, salesperson_key, customer_key, order_dollars, extended_cost, margin_dollars, quantity_ordered, order_number, order_line
10
Star Diagram for Orders
11
Star Schema Primary Keys –Foreign Keys Orders [Order_Date_key, Product_key, salesperson_key, customer_key, ….] –Surrogate Key Orders [Order_ID, Order_Date_key, Product_key, Salesperson_key, Customer_key, etc….] –Avoid Keys from other Systems Granularity (the Grain) –What level of detail is recorded on the Fact Table Individual Transactions Daily summaries, weekly summaries, monthly summaries, etc. Must be uniform throughout
12
Ensuring a chosen Grain Foreign keys relating a row in the fact table to dimension rows should never be null Optional relationships to a dimension are usually a sign of a grain problem The rule against optionality can be relaxed when a dimension is not a part of the grain of the fact table. Then, add “fake” rows such as a product called Not-a-Product or a promotion table item called Not-a-Promotion
13
The Dimensions: WIDE tables Produce “Wide” dimension tables: 100 or 200 attributes (dimensions) are common Relationships among dimensions in a dimension table does not have to be strong: –Customer table may include region, regional data, sales force data, etc. Resist the urge to normalize dimension tables! –Role of dimensions attributes are to qualify queries/break out measures –Spreading dimensional attributes across several tables would complicate and slow down queries and users would have to wait longer –Dimension hierarchies are flattened out: year 1998 may appear 365 times in the time dimension table, departments may appear multiple times in a salesperson dimension table
14
Dimensional Modeling Slowly changing dimensions –Type 1 Change Correct an error or a value that has no significance in future analysis –E.g., phone number change for customer –Type 2 Change Add new row in dimension table –Significant change, such as a new location (address) for customer –Type 3 Change Add new dimension or modify schema (add columns) –New organization structure for firm; want to be able to compare new and old
15
3 Kinds of Measures Fully additive –Facts may be added equally well across any dimension –E.g., gross sales dollars, extended cost, margin dollars, quantity sold Non additive –Facts that may not be added. Generally ratios. E.g., Margin rate –Need to break components into individual facts: margin dollars, sales dollars Semi-additive –Fact that can be summarized across some dimensions –E.g., Banking account balances can be totaled for point in time, but not across time periods. The Fact table: A DEEP table
16
Dimensional Modeling Multiple Processes –E.g., Orders vs. Shipments –Different process –Different grains E.g., Time dimension –Hour –Day –Week –Month –Year
17
Multiple Processes: Orders & Shipments
18
Dimensional Modeling Aggregation –As size of Data Warehouse grows we may need to aggregate date in order to improve query performance –E.g., change from Daily to Monthly summaries Keep same Dimension Tables –Time table may be modified Fact table keeps same measures as before Facts now represent monthly summarizations of data.
19
Aggregate Table Adjust primary key accordingly De-normalize from Day to Month
20
Summary Dimensional Model Basics Fact & Dimension Tables Star Schema Granularity Facts and Measures Multiple Processes Aggregation
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.