3 Business, Logical, and Dimensional Modeling
Copyright © 2006, Oracle. All rights reserved.
Objectives
After completing this lesson, you should be able to do the following:
Discuss data warehouse environment data structures
Discuss data warehouse database design phases:
–Defining the business model
–Defining the logical model
–Defining the dimensional model
Data Warehouse Modeling Issues
Among the main issues that data warehouse data modelers face are:
Different data types
Many ways to use warehouse data
Many ways to structure the data
Multiple modeling techniques
Planned replication
Large volumes of data
Data Warehouse Environment Data Structures
The data modeling structures that are commonly found in a data warehouse environment are:
Third normal form (3NF)
Star schema
Snowflake schema
Star Schema Model
A central fact table joined to denormalized dimensions:
Sales Fact Table: Product_id, Store_id, Item_id, Day_id, Sales_amount, Sales_units, ...
Product Table: Product_id, Product_desc, ...
Time Table: Day_id, Month_id, Year_id, ...
Item Table: Item_id, Item_desc, ...
Store Table: Store_id, District_id, ...
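The star layout above can be sketched in a few lines. This is only an illustration, assuming Python's built-in sqlite3 module as a stand-in for the warehouse database; it keeps a subset of the columns from the slide, and the sample rows are invented.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table and two of its dimension tables, then load sample rows."""
    conn.executescript("""
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_desc TEXT);
        CREATE TABLE store (store_id INTEGER PRIMARY KEY, district_id INTEGER);
        CREATE TABLE sales_fact (
            product_id   INTEGER REFERENCES product(product_id),
            store_id     INTEGER REFERENCES store(store_id),
            sales_amount REAL,
            sales_units  INTEGER
        );
    """)
    # Invented sample data, for illustration only.
    conn.executemany("INSERT INTO product VALUES (?, ?)",
                     [(1, "Widget"), (2, "Gadget")])
    conn.executemany("INSERT INTO store VALUES (?, ?)",
                     [(10, 100), (11, 100)])
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
                     [(1, 10, 20.0, 2), (1, 11, 30.0, 3), (2, 10, 5.0, 1)])


def sales_by_product(conn):
    """Star join: the fact table is joined directly to a single dimension table."""
    return conn.execute("""
        SELECT p.product_desc, SUM(f.sales_amount), SUM(f.sales_units)
        FROM sales_fact f
        JOIN product p ON p.product_id = f.product_id
        GROUP BY p.product_desc
        ORDER BY p.product_desc
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
```

Calling sales_by_product(conn) returns one summary row per product description; every dimension is one join away from the fact table.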
Snowflake Schema Model
Sales Fact Table: Item_id, Store_id, Product_id, Week_id, Sales_amount, Sales_units
Time Table: Week_id, Period_id, Year_id
Product Table: Product_id, Product_desc
Item Table: Item_id, Item_desc, Dept_id
Dept Table: Dept_id, Dept_desc, Mgr_id
Mgr Table: Dept_id, Mgr_id, Mgr_name
Store Table: Store_id, Store_desc, District_id
District Table: District_id, District_desc
Snowflake Schema Model
The snowflake schema model:
Can be used directly by some tools
Is more flexible to change
Provides for speedier data loading
Can become large and unmanageable
Degrades query performance
Has more complex metadata
[Figure: a normalized hierarchy with the levels Country, State, County, and City]
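The query cost mentioned above comes from the extra joins that normalized dimensions require. A minimal sketch, again assuming Python's sqlite3 module and invented sample rows, using the Store and District tables from the snowflake diagram:

```python
import sqlite3

snow = sqlite3.connect(":memory:")
snow.executescript("""
    CREATE TABLE district (district_id INTEGER PRIMARY KEY, district_desc TEXT);
    CREATE TABLE store (
        store_id    INTEGER PRIMARY KEY,
        store_desc  TEXT,
        district_id INTEGER REFERENCES district(district_id)
    );
    CREATE TABLE sales_fact (
        store_id     INTEGER REFERENCES store(store_id),
        sales_amount REAL
    );
    -- Invented sample data, for illustration only.
    INSERT INTO district VALUES (1, 'North'), (2, 'South');
    INSERT INTO store VALUES (10, 'Store A', 1), (11, 'Store B', 1), (12, 'Store C', 2);
    INSERT INTO sales_fact VALUES (10, 100.0), (11, 50.0), (12, 70.0);
""")


def sales_by_district(conn):
    """Reaching district_desc takes two joins: fact -> store -> district."""
    return conn.execute("""
        SELECT d.district_desc, SUM(f.sales_amount)
        FROM sales_fact f
        JOIN store s    ON s.store_id = f.store_id
        JOIN district d ON d.district_id = s.district_id
        GROUP BY d.district_desc
        ORDER BY d.district_desc
    """).fetchall()
```

In a star schema the district description would typically sit in the Store dimension itself, so the same question would need only one join.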
Data Warehouse Design Phases
Phase 1: Defining the business model
Phase 2: Defining the logical model
Phase 3: Defining the dimensional model
Phase 4: Defining the physical model
Phase 1: Defining the Business Model
Performing strategic analysis
Creating the business model
Documenting metadata
Performing Strategic Analysis
Identify crucial business processes.
Understand business processes.
Prioritize and select the business processes to implement.
[Figure: prioritization grid plotting business benefit (low to high) against feasibility (low to high)]
Creating the Business Model
Defining business requirements
Determining granularity
Documenting metadata
Business Requirements Drive the Design Process
Inputs to the design process (each labeled in the figure as a primary input or a secondary input):
Interviews to collect business requirements
Existing metadata
Production ERD model
Research
Using a Business Process Matrix
Sample of a business process matrix:
Business processes: Sales, Returns, Inventory
Business dimensions: Customers, Times (Date), Products, Channels, Promotions
Identifying Business Measures and Dimensions
Measures (the attribute varies continuously):
Sales
Quantity sold
Units sold
Cost
Dimensions (the attribute is perceived as constant or discrete):
Products
Promotions
Customers
Countries
Channels
Times
Determining Granularity
TIMES: Year? Quarter? Month? Week? Day?
PRODUCTS: Product category? Product subcategory? Product name? Product desc? Product item?
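The choice of grain matters because data stored at a fine grain can always be rolled up to a coarser one, whereas data stored only at the coarser grain cannot be broken back down. A small Python sketch with invented daily figures:

```python
from collections import defaultdict
from datetime import date

# Invented sample facts stored at DAY grain: (day, amount_sold).
daily_sales = [
    (date(2006, 1, 30), 120.0),
    (date(2006, 1, 31), 80.0),
    (date(2006, 2, 1), 50.0),
]


def roll_up_to_month(rows):
    """Aggregate day-grain rows to month grain, keyed by (year, month)."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[(day.year, day.month)] += amount
    return dict(totals)
```

If only the monthly totals had been stored, the individual daily values could not be recovered from them.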
Identifying Business Definitions and Rules
Customer credit rating definitions:
Credit Rating | Meaning
A+ | 0 bad checks or bank credit failures
A | 1 bad check or bank credit failure
B | 2 bad checks or bank credit failures
C | 3 or more bad checks or bank credit failures
Order rules:
Rule 1: A customer with a credit rating of A or above will receive a 10% discount on any order totaling $500 (U.S.) or more.
Rule 2: A customer with a credit rating of A or above will receive a 5% discount on any order totaling $250 (U.S.) but less than $500.
…
Rule 5: A customer with a credit rating of C will not receive any discounts on purchases.
…
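Rules such as these are easier to validate once they are written down in an executable form. The sketch below encodes only Rules 1, 2, and 5 as stated; Rules 3 and 4 are not shown in the source, so the hypothetical function discount_rate returns None for any case those rules would have to cover.

```python
def discount_rate(credit_rating, order_total):
    """Return the discount as a fraction, or None when no stated rule applies.

    Only Rules 1, 2, and 5 from the slide are encoded; Rules 3 and 4 are elided
    in the source and are deliberately not guessed here.
    """
    if credit_rating == "C":
        return 0.0  # Rule 5: no discounts for a C rating
    if credit_rating in ("A+", "A"):  # "A or above"
        if order_total >= 500:
            return 0.10  # Rule 1: order totaling $500 or more
        if 250 <= order_total < 500:
            return 0.05  # Rule 2: order of $250 but less than $500
    return None  # not covered by the rules that are shown
```

For example, discount_rate("A", 300) gives 0.05, while discount_rate("B", 600) gives None because no rule for a B rating is visible.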
Documenting Metadata
Documenting metadata should include:
Documenting the design process
Documenting the development process
Providing a record of changes
Recording enhancements over time
Business Metadata Elements
Name of the measure
Business dimensions
–Dimension attributes
Sample data
Business definition and rules
Metadata Documentation Approaches
Automated
–Data modeling tools
–ETL tools
–End-user tools
Manual
Phase 2: Designing the Logical Model
Entity relationship modeling (ERM) uses an entity relationship diagram (ERD):
Each CUSTOMER belongs to one COUNTRY.
Each COUNTRY can have many CUSTOMERS.
Entity (Customers) attributes: Cust_name, Country_Id, Cust_Addr, …
Entity (Countries) attributes: Country_id, Name, Region, ISO_Code, …
Relationship: Customers "belongs to" Countries; Countries "have" Customers
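The one-to-many relationship in the ERD can be expressed as a foreign key. The following is a sketch only, assuming Python's sqlite3 module; the cust_id column is a hypothetical identifier added for the example, and the sample rows are invented.

```python
import sqlite3

erd = sqlite3.connect(":memory:")
erd.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
erd.executescript("""
    CREATE TABLE countries (country_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customers (
        cust_id    INTEGER PRIMARY KEY,  -- hypothetical key, not on the slide
        cust_name  TEXT,
        country_id INTEGER NOT NULL REFERENCES countries(country_id)
    );
    INSERT INTO countries VALUES (1, 'Country A');
""")


def add_customer(conn, cust_id, cust_name, country_id):
    """Insert a customer; return False if the referenced country does not exist."""
    try:
        conn.execute("INSERT INTO customers VALUES (?, ?, ?)",
                     (cust_id, cust_name, country_id))
        return True
    except sqlite3.IntegrityError:
        return False
```

Each customer row carries exactly one country_id, while any number of customer rows may share the same country, which matches the two relationship sentences on the slide.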
Phase 3: Defining the Dimensional Model
Identify fact tables:
–Translate business measures into fact tables.
–Analyze source system information for additional measures.
Identify dimension tables.
Link fact tables to the dimension tables.
Model the time dimension.
Star Dimensional Modeling
Advantages of Using a Star Dimensional Model
Supports multidimensional analysis
Creates a design that improves performance
Enables optimizers to yield better execution plans
Parallels end-user perceptions
Provides an extensible design
Broadens the choices for data access tools
Fact Table Characteristics
Fact tables:
Contain numerical metrics of the business
Hold large volumes of data
Grow quickly
Can contain base, derived, and summarized data
Are typically additive
Are joined to dimension tables through foreign keys that reference primary keys in the dimension tables
Example: Sales (Fact Table): PROD_ID, CUST_ID, TIME_ID, CHANNEL_ID, PROMO_ID, QUANTITY_SOLD, AMOUNT_SOLD, ...
What are factless fact tables?
More on Factless Fact Tables
Factless fact table (foreign keys only): Emp_FK, Sal_FK, Age_FK, Ed_FK, Grade_FK
Employee dimension: Emp_PK
Salary dimension: Sal_PK
Age dimension: Age_PK
Education dimension: Ed_PK
Grade dimension: Grade_PK
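Because a factless fact table holds only foreign keys, analysis against it usually amounts to counting rows. A minimal Python sketch with invented rows that use three of the keys from the diagram:

```python
from collections import Counter

# Each row holds only foreign keys (emp_fk, grade_fk, ed_fk); there is no numeric measure.
# The key values are invented for illustration.
employee_facts = [
    (1, "G1", "BSc"),
    (2, "G1", "MSc"),
    (3, "G2", "BSc"),
]


def count_by_grade(rows):
    """Count fact rows per grade key; the row count itself is the 'measure'."""
    return Counter(grade_fk for _emp_fk, grade_fk, _ed_fk in rows)
```

The same rows could just as easily be counted by education key, which is what makes the structure useful despite having no measure columns.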
Identifying Base and Derived Measures
Business measures: Quantity sold, Amount sold, Profit
Sales fact table facts (base, derived):
Quantity sold (base)
Amount sold (base)
Profit (derived)
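A derived measure is computed from base measures rather than captured directly. The sketch below assumes, purely for illustration, that profit is amount sold minus cost; the cost field and that formula are assumptions, not something the slide specifies.

```python
from dataclasses import dataclass


@dataclass
class SalesFact:
    quantity_sold: int   # base measure
    amount_sold: float   # base measure
    cost: float          # hypothetical base measure used to derive profit

    @property
    def profit(self) -> float:
        """Derived measure (assumed formula): amount sold minus cost."""
        return self.amount_sold - self.cost
```

Whether a derived measure such as profit is stored in the fact table or computed at query time is a design decision; this sketch computes it on demand.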
Fact Table Measures
Fact table measures can be:
Nonadditive: Cannot be added along any dimension
Semiadditive: Added along some dimensions
Additive: Added across all dimensions
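A stock level is a common example of a semiadditive measure: it can be summed across products for one day, but summing it across days gives a meaningless number, so an average is used instead. A small Python sketch with invented snapshot rows:

```python
# Invented daily inventory snapshots: (day, product, units_on_hand).
inventory = [
    ("Mon", "P1", 10), ("Mon", "P2", 5),
    ("Tue", "P1", 8), ("Tue", "P2", 6),
]


def on_hand_for_day(rows, day):
    """Valid: units on hand can be added across products within one day."""
    return sum(units for d, _product, units in rows if d == day)


def average_on_hand(rows):
    """Across time, average the daily totals instead of summing them."""
    days = sorted({d for d, _product, _units in rows})
    return sum(on_hand_for_day(rows, d) for d in days) / len(days)
```

Here the daily totals are 15 and 14, so the average is 14.5; adding them to get 29 would not describe any real stock position.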
Dimension Table Characteristics
Dimension tables:
Contain textual information that represents the attributes of the business
Contain relatively static data
Are joined to a fact table through a foreign key reference
Translating Business Dimensions into Dimension Tables
Business dimensions: Customers, Products, Channels, Countries, Promotions
Each business dimension becomes a dimension table. Example:
Products Dimension Table: Prod_Id, Prod_name, Prod_desc, Prod_Category, Prod_category_id, Prod_Category_Desc, Prod_Subcategory, Prod_Pack_Size, Prod_Weight_class, Prod_Status, Product_status, Prod_List_price, …
Slowly Changing Dimensions
Slowly Changing Dimension (SCD): An Example
Business dimension: Product
Products (SCD): product weight and product package size vary over time
Products Dimension Table: Id (Surrogate), Prod_Id (Natural), Prod_name, Prod_desc, Prod_Category, Prod_category_id, Prod_Category_Desc, Prod_Subcategory, Prod_Pack_Size, Prod_Weight_class, Prod_Status, Product_status, Prod_List_price, …, Prod_Eff_From, Prod_Eff_To
Where Product_key is a calculated number stored within the database
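The surrogate key and the effective-date columns make it possible to keep every version of a product row. The sketch below shows one common way to do this, often called a Type 2 approach; the slide does not name a specific technique, so treat the function apply_scd_change and its row layout as assumptions made for illustration.

```python
from datetime import date


def apply_scd_change(rows, prod_id, new_pack_size, change_date):
    """Close the current version of a product row and append a new version.

    rows is a list of dicts with keys: id (surrogate), prod_id (natural),
    prod_pack_size, prod_eff_from, prod_eff_to (None marks the current row).
    """
    current = next(r for r in rows
                   if r["prod_id"] == prod_id and r["prod_eff_to"] is None)
    current["prod_eff_to"] = change_date  # the old version stops being current
    new_row = dict(
        current,
        id=max(r["id"] for r in rows) + 1,  # next surrogate key
        prod_pack_size=new_pack_size,
        prod_eff_from=change_date,
        prod_eff_to=None,
    )
    rows.append(new_row)
    return new_row


# Invented sample dimension content.
products = [{"id": 1, "prod_id": 100, "prod_pack_size": "6-pack",
             "prod_eff_from": date(2005, 1, 1), "prod_eff_to": None}]
```

After a change, both versions share the natural key Prod_Id but have different surrogate Id values, so fact rows loaded before the change can keep pointing at the older version.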
Types of Database Keys
Primary keys (PKs)
Foreign keys (FKs)
Composite keys
Surrogate keys
Using Time in the Data Warehouse
Defining standards for time is critical.
Aggregation based on time is complex.
Time Dimension
The Time dimension is critical to the data warehouse.
Choose the right granularity for the Time dimension.
Where should the element of time be stored?
[Figure: Time dimension joined to the Sales fact, with levels Fiscal year, Fiscal quarter, Fiscal month, Fiscal week, and Day, marking the current dimension grain and the future dimension grain]
Identify Hierarchies for Dimensions
Geography hierarchy levels: Region, Country, State/Province, City
Multiple time hierarchies:
–Fiscal time: Fiscal year, Fiscal quarter, Fiscal month, Fiscal date
–Calendar time: Calendar year, Calendar quarter, Calendar month, Calendar date
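Supporting both a calendar and a fiscal hierarchy usually means storing both sets of attributes on every row of the Time dimension. A sketch in Python, assuming a fiscal year that starts in July and is named after the calendar year in which it ends; both conventions are assumptions for the example, not part of the source.

```python
from datetime import date

FISCAL_YEAR_START_MONTH = 7  # assumption: the fiscal year starts in July


def time_dimension_row(day):
    """Build one Time dimension row carrying calendar and fiscal attributes."""
    # Zero-based position of this month within the fiscal year.
    shifted = (day.month - FISCAL_YEAR_START_MONTH) % 12
    # Assumption: a fiscal year is labeled with the calendar year in which it ends.
    if FISCAL_YEAR_START_MONTH > 1 and day.month >= FISCAL_YEAR_START_MONTH:
        fiscal_year = day.year + 1
    else:
        fiscal_year = day.year
    return {
        "day_id": day.isoformat(),
        "calendar_month": day.month,
        "calendar_quarter": (day.month - 1) // 3 + 1,
        "calendar_year": day.year,
        "fiscal_month": shifted + 1,
        "fiscal_quarter": shifted // 3 + 1,
        "fiscal_year": fiscal_year,
    }
```

With these settings, 15 August 2006 falls in calendar quarter 3 of 2006 but in fiscal quarter 1 of fiscal year 2007, which is why the two hierarchies cannot share a single set of columns.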
Data Drilling
[Figure: market hierarchy tree labeled Group, with Region 1 and Region 2, Country 1 to Country 4, State 1 to State 6, and City 1 and City 2]
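Drilling up and down a hierarchy amounts to aggregating the same facts at a different depth. A minimal Python sketch, using invented sales rows and a three-level Country, State, City path as a simplified stand-in for the market hierarchy in the figure:

```python
from collections import defaultdict

# Invented sample rows: (country, state, city, amount_sold).
sales_by_city = [
    ("Country1", "State1", "City1", 10.0),
    ("Country1", "State1", "City2", 5.0),
    ("Country1", "State2", "City3", 15.0),
    ("Country2", "State4", "City4", 5.0),
]


def roll_up(rows, depth):
    """Total the amounts at the given hierarchy depth (1=country, 2=state, 3=city)."""
    totals = defaultdict(float)
    for *path, amount in rows:
        totals[tuple(path[:depth])] += amount
    return dict(totals)
```

Calling roll_up(sales_by_city, 1) gives country totals; increasing the depth to 2 drills down to state level for the same data.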
Using Data Modeling Tools
Tools with a GUI enable definition, modeling, and reporting.
Avoid a mix of modeling techniques caused by the following:
–Development pressures
–Developers who lack knowledge
–No strategy
Determine a strategy for data modeling.
Write and publish data models formally.
Make the data models available electronically.
Summary
In this lesson, you should have learned about:
Data warehouse environment data structures
Data warehouse database design phases:
–Defining the business model
–Defining the logical model
–Defining the dimensional model
Practice 3-1: Overview
This practice covers the following topics:
Identifying the facts, measures, hierarchies, and slowly changing dimensions based on the RISD scenario given
Exploring viewlet-based demonstrations provided for modeling concepts, and answering the questions in these interactive viewlets