Published by Rayna Fackrell. Modified over 10 years ago.
2
Populating Data Warehouse Structures
3
Examining the Star Schema
  [Diagram: Sales star schema, showing the fact table and its dimension tables]
4
Implementing the Star Schema
  1. Extract Data From Multiple Sources
  2. Integrate, Transform, and Restructure Data
  3. Load Data Into Dimension Tables and Fact Tables
5
The Star Schema Data Load
  [Diagram: DTS data flow in three stages]
  Extracting Data From Heterogeneous Sources: NorthwindOLTP, external files, internal files
  Transforming Data: staging area
  Loading the Star Schema: Polaris data warehouse (InventoryStar, SalesStar, Financial)
6
Verifying the Dimension Source Data
  Verifying Accuracy of Source Data
    Integrating data from multiple sources
    Applying business rules
    Checking structural requirements
  Managing Invalid Data
    Rejecting invalid data
    Saving invalid data to a log
  Correcting Invalid Data
    Transforming data
    Reassigning data values
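The checks on this slide can be sketched in plain Python. This is only an illustration, not DTS code: the function names, the reject-log structure, and the rule that `reg_id` must be 2, 4, or 6 are assumptions made for the example.

```python
# Hypothetical source rows; field names follow the buyer example in this deck.
ALLOWED_REG_IDS = {2, 4, 6}  # assumed business rule, for illustration only

def validate_row(row):
    """Return a list of rule violations for one source row (empty list = valid)."""
    problems = []
    if not row.get("buyer_name"):
        problems.append("buyer_name is required")      # structural requirement
    if row.get("reg_id") not in ALLOWED_REG_IDS:
        problems.append("reg_id must be 2, 4, or 6")   # business rule
    return problems

def split_valid_invalid(rows):
    """Separate rows that pass validation from rows that are rejected and logged."""
    valid, rejected = [], []
    for row in rows:
        problems = validate_row(row)
        if problems:
            rejected.append({"row": row, "problems": problems})
        else:
            valid.append(row)
    return valid, rejected

source_rows = [
    {"buyer_name": "Barr, Adam", "reg_id": 2},
    {"buyer_name": "", "reg_id": 4},
    {"buyer_name": "Chai, Sean", "reg_id": 9},
]
valid_rows, reject_log = split_valid_invalid(source_rows)
```

Keeping the rejected rows together with the reason for rejection makes it possible to correct and reload them later instead of silently dropping them.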
7
Dimension Data Load Examples

  Example 1 (source -> DTS -> dimension)
    Source
      buyer_name:   Barr, Adam; Chai, Sean; OMelia, Erin; ...
      reg_id:       2; 4; 6
    Dimension
      buyer_first:  Adam; Sean; Erin; ...
      buyer_last:   Barr; Chai; OMelia; ...
      reg_id:       2; 4; 6

  Example 2 (source -> DTS -> dimension)
    Source
      buyer_code:   A123; B456; ...
      buyer_last:   Barr; Chai; OMelia; ...
      reg_id:       2; 4; 6
    Dimension
      buyer_code:   U999; A123; B456; ...
      buyer_last:   Barr; Chai; OMelia; ...
      reg_id:       2; 4; 6

  Example 3 (two sources -> DTS -> dimension)
    Source 1
      buyer_name:   Barr, Adam; Chai, Sean
      reg_id:       II; IV
    Source 2
      buyer_name:   Smith, Jane; Paper, Anne
      reg_id:       2; 4
    Dimension
      buyer_name:   Barr, Adam; Chai, Sean; Smith, Jane; Paper, Anne
      reg_id:       2; 4; 2; 4
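The three kinds of transformation in these examples can be sketched in Python as follows. The function names are hypothetical, and treating `U999` as the default assigned to a missing `buyer_code` is an interpretation of the second example rather than something the slide states.

```python
ROMAN_TO_INT = {"II": 2, "IV": 4}   # only the values that appear in the example
DEFAULT_BUYER_CODE = "U999"         # assumed placeholder for a missing code

def split_buyer_name(buyer_name):
    """Split 'Last, First' into (first, last)."""
    last, first = [part.strip() for part in buyer_name.split(",", 1)]
    return first, last

def fill_buyer_code(buyer_code):
    """Reassign a missing buyer code to the default value."""
    return buyer_code if buyer_code else DEFAULT_BUYER_CODE

def normalize_reg_id(reg_id):
    """Convert a region id given as a Roman numeral or digit string to an int."""
    text = str(reg_id).strip()
    if text in ROMAN_TO_INT:
        return ROMAN_TO_INT[text]
    return int(text)

# Integrating two sources that encode reg_id differently.
source_1 = [("Barr, Adam", "II"), ("Chai, Sean", "IV")]
source_2 = [("Smith, Jane", "2"), ("Paper, Anne", "4")]
merged = [(name, normalize_reg_id(reg)) for name, reg in source_1 + source_2]
```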
8
Maintaining Integrity of the Dimension
  Assigning a Surrogate Key to Each Record
    Defines the dimension's primary key
    Relates to the foreign key fields of the fact table
  Loading One Record Per Application Key
    Maintains uniqueness in the dimension
    Depends on how you manage changing dimension data
    Maintains integrity of the fact table
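A minimal sketch of surrogate key assignment, assuming an in-memory dimension keyed by application key. The class and attribute names are hypothetical; in a real warehouse the surrogate key would more likely come from an identity column or a sequence.

```python
from itertools import count

class DimensionLoader:
    """Assigns one surrogate key per application key (no change history kept)."""

    def __init__(self, start=1):
        self._next_key = count(start)
        self.rows_by_app_key = {}

    def load(self, app_key, attributes):
        """Insert the record if its application key is new; return its surrogate key."""
        existing = self.rows_by_app_key.get(app_key)
        if existing is not None:
            return existing["surrogate_key"]   # already loaded: keep it unique
        surrogate_key = next(self._next_key)
        self.rows_by_app_key[app_key] = {
            "surrogate_key": surrogate_key,
            "app_key": app_key,
            **attributes,
        }
        return surrogate_key

loader = DimensionLoader()
key_alfi = loader.load("ALFI", {"name": "Alfreds"})
key_vinet = loader.load("VINET", {"name": "Vins"})
key_alfi_again = loader.load("ALFI", {"name": "Alfreds"})
```

This sketch enforces exactly one record per application key. A design that writes a new record for each change would relax that rule and allow several surrogate keys per application key.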
9
Managing Changing Dimension Data
  Dimensions with Changing Column Values
    Inserts of new data
    Updates of existing data
  Slowly-Changing Dimension Design Solutions
    Type 1: Overwrite the dimension record
    Type 2: Write another dimension record
    Type 3: Add attributes to the dimension record
10
Type 1: Overwriting the Dimension
  Existing record is changed (product size: 10 oz. -> 12 oz.)

  Product Dimension
            product key  product name  product size  product package  product dept  product cat  product subcat  ...
    Before  001          Rice Puffs    10 oz.        Bag              Grocery       Dry Goods    Snacks          ...
    After   001          Rice Puffs    12 oz.        Bag              Grocery       Dry Goods    Snacks          ...
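A Type 1 change can be sketched as an in-place update. The dictionary layout and the function name `apply_type1` are assumptions for illustration; the values come from the Rice Puffs example.

```python
# Dimension keyed by product key; one row per product, no history.
product_dim = {
    "001": {
        "product_name": "Rice Puffs",
        "product_size": "10 oz.",
        "product_package": "Bag",
        "product_dept": "Grocery",
        "product_cat": "Dry Goods",
        "product_subcat": "Snacks",
    }
}

def apply_type1(dim, product_key, changes):
    """Overwrite the changed attributes on the existing record."""
    dim[product_key].update(changes)

apply_type1(product_dim, "001", {"product_size": "12 oz."})
```

After the update the old size is gone, so every fact row that references key 001 is now reported as 12 oz., including sales made while the product was 10 oz.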
11
Type 2: Writing Another Dimension Record
  Adds a new record (product size: 10 oz. -> 12 oz.)

  Product Dimension
    Before
      product key  product name  product size  product package  product dept  product cat  product subcat  effective_date  ...
      001          Rice Puffs    10 oz.        Bag              Grocery       Dry Goods    Snacks          05-01-1995      ...
    After
      product key  product name  product size  product package  product dept  product cat  product subcat  effective_date  ...
      001          Rice Puffs    10 oz.        Bag              Grocery       Dry Goods    Snacks          05-01-1995      ...
      731          Rice Puffs    12 oz.        Bag              Grocery       Dry Goods    Snacks          10-15-1998      ...
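A Type 2 change can be sketched as appending a copy of the current record with the changed attributes and a new key. The function name `apply_type2` and the list-of-dicts layout are assumptions; the new key is passed in by the caller here, whereas a real load would generate it.

```python
product_dim = [
    {
        "product_key": "001",
        "product_name": "Rice Puffs",
        "product_size": "10 oz.",
        "product_package": "Bag",
        "product_dept": "Grocery",
        "product_cat": "Dry Goods",
        "product_subcat": "Snacks",
        "effective_date": "05-01-1995",
    }
]

def apply_type2(dim_rows, current_key, new_key, changes, effective_date):
    """Keep the existing record and append a new one carrying the changes."""
    current = next(row for row in dim_rows if row["product_key"] == current_key)
    new_row = {
        **current,
        **changes,
        "product_key": new_key,
        "effective_date": effective_date,
    }
    dim_rows.append(new_row)
    return new_row

new_record = apply_type2(
    product_dim, "001", "731", {"product_size": "12 oz."}, "10-15-1998"
)
```

Existing fact rows keep pointing at key 001, so history is preserved; facts loaded after the change would reference key 731.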
12
Type 3: Adding Attributes in the Dimension Record
  Additional information is stored in an existing record

  Product Dimension
    Column                           Before        After
    product key                      001           001
    product name                     Rice Puffs    Rice Puffs
    product size                     10 oz.        12 oz.
    product package                  Bag           Bag
    product dept                     Grocery       Grocery
    product cat                      Dry Goods     Dry Goods
    product subcat                   Snacks        Snacks
    current product size date        05-01-1995    10-15-1998
    previous product size            11 oz.        10 oz.
    previous product size date       03-20-1994    05-01-1995
    2nd previous product size        (null)        11 oz.
    2nd previous product size date   ...           03-20-1994
    ...
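A Type 3 change can be sketched as shifting the size history one slot to the right inside the same record. The snake_case column names and the function name `apply_type3` are assumptions for the example; the values follow the before and after rows of the slide.

```python
before = {
    "product_key": "001",
    "product_name": "Rice Puffs",
    "product_size": "10 oz.",
    "current_product_size_date": "05-01-1995",
    "previous_product_size": "11 oz.",
    "previous_product_size_date": "03-20-1994",
    "second_previous_product_size": None,
    "second_previous_product_size_date": None,
}

def apply_type3(row, new_size, new_date):
    """Return a copy of the record with the size history shifted by one slot."""
    updated = dict(row)
    # The oldest slot takes the value that was "previous".
    updated["second_previous_product_size"] = row["previous_product_size"]
    updated["second_previous_product_size_date"] = row["previous_product_size_date"]
    # The "previous" slot takes the value that was current.
    updated["previous_product_size"] = row["product_size"]
    updated["previous_product_size_date"] = row["current_product_size_date"]
    # The current slot takes the new value.
    updated["product_size"] = new_size
    updated["current_product_size_date"] = new_date
    return updated

after = apply_type3(before, "12 oz.", "10-15-1998")
```

Only as many past values survive as there are history columns; with two history slots, a third change would discard the oldest size.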
13
Verifying the Fact Table Source Data
  Verifying Accuracy of Source Data
    Integrating data from multiple sources
    Applying business rules
    Checking structural requirements
  Managing Invalid Data
    Rejecting invalid data
    Saving invalid data to a log
  Correcting Invalid Data
    Transforming data
    Reassigning data values
14
Assigning Foreign Keys
  Dimension Tables
    customer_dim:  201  ALFI  Alfreds
    product_dim:   25   123   Chai
    time_dim:      134  1/1/2000
  Source Data
    customer id  product id  order date  quantity_sales  amount_sales
    ALFI         123         1/1/2000    400             10,789
  Sales Fact Data
    cust_key  prod_key  time_key  quantity_sales  amount_sales
    201       25        134       400             10,789
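The key lookup on this slide can be sketched as three dictionary lookups, one per dimension. The dictionaries stand in for the dimension tables and the function name `to_fact_row` is hypothetical; the values are the ones shown in the example.

```python
# Each lookup maps an application (source) key to the dimension's surrogate key.
customer_dim = {"ALFI": 201}
product_dim = {"123": 25}
time_dim = {"1/1/2000": 134}

def to_fact_row(source_row):
    """Replace source identifiers with dimension keys; copy measures unchanged."""
    return {
        "cust_key": customer_dim[source_row["customer_id"]],
        "prod_key": product_dim[source_row["product_id"]],
        "time_key": time_dim[source_row["order_date"]],
        "quantity_sales": source_row["quantity_sales"],
        "amount_sales": source_row["amount_sales"],
    }

source_row = {
    "customer_id": "ALFI",
    "product_id": "123",
    "order_date": "1/1/2000",
    "quantity_sales": 400,
    "amount_sales": 10789,
}
fact_row = to_fact_row(source_row)
```

A source row whose identifier is missing from a dimension raises `KeyError` in this sketch; a real load would route such rows to a reject log or assign them to a placeholder dimension record.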
15
Defining Measures
  Loading Measures from the Source System
  Calculating Additional Measures

  Source System Data
    customer_id  product_id  price  qty
    VINET        9GZ         .55    32
    ALFI         1KJ         1.10   48
    HANAR        0ZA         .98    9
    ...
  Fact Table Data
    customer_key  product_key  qty  total_sales
    100           512          32   17.60
    238           207          48   52.80
    437           338          9    8.82
    ...
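In this example `qty` is loaded as-is and `total_sales` is calculated as `price * qty`. A small sketch, assuming the key lookups are plain dictionaries and using `Decimal` to avoid floating-point rounding on currency values (the function name `build_fact_row` is hypothetical):

```python
from decimal import Decimal

customer_keys = {"VINET": 100, "ALFI": 238, "HANAR": 437}
product_keys = {"9GZ": 512, "1KJ": 207, "0ZA": 338}

source_rows = [
    {"customer_id": "VINET", "product_id": "9GZ", "price": Decimal("0.55"), "qty": 32},
    {"customer_id": "ALFI", "product_id": "1KJ", "price": Decimal("1.10"), "qty": 48},
    {"customer_id": "HANAR", "product_id": "0ZA", "price": Decimal("0.98"), "qty": 9},
]

def build_fact_row(row):
    """Load qty directly and derive total_sales from price and qty."""
    return {
        "customer_key": customer_keys[row["customer_id"]],
        "product_key": product_keys[row["product_id"]],
        "qty": row["qty"],
        "total_sales": row["price"] * row["qty"],
    }

fact_rows = [build_fact_row(row) for row in source_rows]
```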
16
Maintaining Data Integrity
  Adhering to the Fact Table Grain
    A fact table can only have one grain
    You must load a fact table with data at the same level of detail as defined by the grain
  Enforcing Column Constraints
    NOT NULL constraints
    FOREIGN KEY constraints
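The two column constraints can be demonstrated with Python's built-in `sqlite3` module. SQLite is used here only as a convenient stand-in for SQL Server, and the two-column tables are a simplified assumption; note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute(
    "CREATE TABLE product_dim ("
    " product_key INTEGER PRIMARY KEY,"
    " product_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE sales_fact ("
    " product_key INTEGER NOT NULL REFERENCES product_dim(product_key),"
    " quantity_sales INTEGER NOT NULL)"
)
conn.execute("INSERT INTO product_dim VALUES (25, 'Chai')")
conn.commit()

def try_insert(product_key, quantity_sales):
    """Insert one fact row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sales_fact VALUES (?, ?)",
                (product_key, quantity_sales),
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(25, 400),    # valid: product 25 exists
    try_insert(None, 400),  # violates NOT NULL
    try_insert(999, 400),   # violates FOREIGN KEY: no product 999
]
row_count = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
```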
17
Implementing Staging Tables
  Centralize and Integrate Source Data
  Break Up Complex Data Transformations
  Facilitate Error Recovery

  Staging Area: sales_stage, inventory_stage, market_stage, shipments_stage
18
DTS Functionality
  Accessing Heterogeneous Data Sources
  Importing, Exporting, and Transforming Data
  Creating Reusable Transformations and Functions
  Automating Data Loads
  Managing Metadata
  Customizing and Extending Functionality
19
Defining DTS Packages
  Identifies Data Sources and Destinations
  Defines Tasks or Actions
  Implements Transformation Logic
  Defines Order of Operations
20
Identifying Package Components
  Connections: access data sources and destinations
  Tasks: describe data transformations or functions
  Steps: define the order of task operations or workflow
  Global Variables: store data that can be shared across tasks
21
Creating Packages
  Using the DTS Import/Export Wizard
    Perform ad-hoc table and data transfers
    Develop a prototype package
  Using DTS Package Designer
    Edit packages created with the DTS Import/Export Wizard
    Create packages with a wide range of functionality
  Programming DTS Applications
    Directly access the functionality of the DTS Object Model
    Requires Microsoft Visual Basic or Microsoft Visual C++
22
Using DTS to Populate the Sales Star
  Populating the Sales Star Dimensions
  Populating the Sales Star Fact Table
23
Populating the Sales Star Dimensions
  [Diagram: DTS loads time_dim, customer_dim, and product_dim]
  Sources shown: product tab-delimited files, NorthwindOLTP, SQL Server stored procedure
24
Populating the Sales Star Fact Table
  [Diagram: DTS loads the sales data file into sales_stage, then loads sales_fact from sales_stage together with time_dim, customer_dim, and product_dim]
25
Designing Modular Packages
  Creating Modular Packages
    Simplify complex workflows
    Create more readable packages
    Produce smaller packages that are easier to debug
  Using Outer Packages
    Execute multiple packages within a single package
    Combine modular packages into logical workflows
    Reuse modular packages in different workflows
    Execute packages in parallel
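The outer-package idea can be illustrated with a plain Python analogy. This is not the DTS object model: each function below is a hypothetical stand-in for one modular package, and the outer function plays the role of the outer package that runs the dimension loads in parallel before the fact load.

```python
from concurrent.futures import ThreadPoolExecutor

# Each function stands in for one small, independently testable package.
def load_time_dim():
    return "time_dim loaded"

def load_customer_dim():
    return "customer_dim loaded"

def load_product_dim():
    return "product_dim loaded"

def load_sales_fact():
    return "sales_fact loaded"

def run_outer_package():
    """Run the dimension loads in parallel, then the fact load that depends on them."""
    dimension_packages = [load_time_dim, load_customer_dim, load_product_dim]
    with ThreadPoolExecutor() as pool:
        # map() returns results in submission order, even if tasks finish out of order.
        dimension_results = list(pool.map(lambda package: package(), dimension_packages))
    fact_result = load_sales_fact()  # starts only after every dimension load has finished
    return dimension_results + [fact_result]

workflow_log = run_outer_package()
```

Because the fact table references the dimension keys, the fact load is sequenced after the dimension loads, while the dimension loads themselves have no dependency on one another and can run side by side.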
26
Using DTS to Populate the Sales Star