Populating Data Warehouse Structures

Examining the Star Schema
  Diagram: the Sales star schema, with one fact table joined to several dimension tables.

Implementing the Star Schema
  1. Extract data from multiple sources
  2. Integrate, transform, and restructure data
  3. Load data into dimension tables and fact tables
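The three steps above can be sketched in a few lines of Python. This is only an illustration, not the DTS implementation the slides describe: the two source lists, the cleaning rule, and the customer_dim column names are assumptions made up for the example, and SQLite stands in for the warehouse database.

```python
import sqlite3

# Hypothetical rows standing in for two heterogeneous sources.
oltp_rows = [{"customer_id": "ALFI", "name": "alfreds "}]
file_rows = [{"customer_id": "VINET", "name": "Vins"}]

def extract():
    """Step 1: pull rows from every source into one list."""
    return oltp_rows + file_rows

def transform(rows):
    """Step 2: integrate and clean (here: trim and title-case the name)."""
    return [(r["customer_id"], r["name"].strip().title()) for r in rows]

def load(conn, rows):
    """Step 3: load the cleaned rows into a dimension table."""
    conn.execute(
        "CREATE TABLE customer_dim ("
        "customer_key INTEGER PRIMARY KEY, customer_id TEXT, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO customer_dim (customer_id, name) VALUES (?, ?)", rows
    )

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse
load(conn, transform(extract()))
loaded = conn.execute(
    "SELECT customer_key, customer_id, name FROM customer_dim "
    "ORDER BY customer_key"
).fetchall()
```

Keeping each step in its own function mirrors the slide's separation of extract, transform, and load, so a failure can be traced to one stage.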

The Star Schema Data Load
  Diagram: DTS moves data through three stages.
  Extracting data from heterogeneous sources: NorthwindOLTP, external files, internal files
  Transforming data: Staging Area
  Loading the star schema: Polaris Data Warehouse (InventoryStar, SalesStar, Financial)

Verifying the Dimension Source Data
  Verifying accuracy of source data
    Integrating data from multiple sources
    Applying business rules
    Checking structural requirements
  Managing invalid data
    Rejecting invalid data
    Saving invalid data to a log
  Correcting invalid data
    Transforming data
    Reassigning data values
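A minimal sketch of the "reject and log" approach, assuming rows arrive as dictionaries. The function name, the two checks, and the set of valid region codes are hypothetical; a real load would apply whatever business rules the warehouse defines.

```python
def validate_rows(rows, valid_regions):
    """Split rows into accepted rows and a log of (row, reason) rejections."""
    accepted, rejected_log = [], []
    for row in rows:
        name = (row.get("buyer_name") or "").strip()
        if not name:
            # Structural requirement: the name column must be populated.
            rejected_log.append((row, "missing buyer_name"))
        elif row.get("reg_id") not in valid_regions:
            # Business rule (assumed): the region must be a known code.
            rejected_log.append((row, "unknown reg_id"))
        else:
            accepted.append({**row, "buyer_name": name})
    return accepted, rejected_log

rows = [
    {"buyer_name": "Barr, Adam", "reg_id": "II"},
    {"buyer_name": "  ", "reg_id": "II"},
    {"buyer_name": "Chai, Sean", "reg_id": "IX"},
]
accepted, rejected_log = validate_rows(rows, valid_regions={"II", "IV"})
```

Saving the rejected rows with a reason, instead of silently dropping them, leaves a record that can be corrected and reloaded later.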

Dimension Data Load Examples
  Example 1: DTS splits buyer_name (Barr, Adam; Chai, Sean; OMelia, Erin; ...) into buyer_first (Adam, Sean, Erin, ...) and buyer_last (Barr, Chai, OMelia, ...); reg_id is carried through.
  Example 2: source columns buyer_code (A123, B...), buyer_last, reg_id; after the load, buyer_code holds U999, A123, B...
  Example 3: DTS splits rows of buyer_name and reg_id (Barr, Adam; Chai, Sean; Smith, Jane; Paper, Anne) into two outputs by reg_id (II, IV): one with Barr, Adam and Chai, Sean, the other with Smith, Jane and Paper, Anne.
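The name-splitting example can be sketched as a small transformation function. This is an illustration in Python rather than a DTS transformation, and it assumes every buyer_name has the form "Last, First"; the function name is hypothetical.

```python
def split_buyer_name(buyer_name):
    """Turn 'Last, First' into a (buyer_first, buyer_last) pair."""
    last, _, first = buyer_name.partition(",")
    return first.strip(), last.strip()

source = ["Barr, Adam", "Chai, Sean", "OMelia, Erin"]
split = [split_buyer_name(name) for name in source]
```

A name without a comma ends up entirely in buyer_last with an empty buyer_first, which is one of the cases a validation step would need to catch.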

Maintaining Integrity of the Dimension
  Assigning a surrogate key to each record
    Defines the dimension's primary key
    Relates to the foreign key fields of the fact table
  Loading one record per application key
    Maintains uniqueness in the dimension
    Depends on how you manage changing dimension data
    Maintains integrity of the fact table
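One way to picture surrogate key assignment is a map from application key to a generated integer. The class below is a hypothetical sketch, and the code "B456" is made up for the example; in a database the key would more likely come from an identity or sequence column.

```python
class DimensionKeyMap:
    """Hands out one surrogate key per application key."""

    def __init__(self):
        self._keys = {}   # application key -> surrogate key
        self._next = 1    # next unused surrogate key

    def key_for(self, application_key):
        # Reuse the existing surrogate key so the dimension keeps
        # exactly one record per application key.
        if application_key not in self._keys:
            self._keys[application_key] = self._next
            self._next += 1
        return self._keys[application_key]

keys = DimensionKeyMap()
first = keys.key_for("A123")
second = keys.key_for("B456")   # hypothetical buyer code
repeat = keys.key_for("A123")
```

Under a Type 2 design the rule changes: each new version of a record gets its own surrogate key, so the map would be keyed by application key plus version.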

Managing Changing Dimension Data
  Dimensions with changing column values
    Inserts of new data
    Updates of existing data
  Slowly changing dimension design solutions
    Type 1: Overwrite the dimension record
    Type 2: Write another dimension record
    Type 3: Add attributes to the dimension record

Type 1: Overwriting the Dimension Record
  Existing record is changed.
  Product Dimension
            product key  product name  product size  product package  product dept  product cat  product subcat  ...
    Before  001          Rice Puffs    10 oz.        Bag              Grocery       Dry Goods    Snacks          ...
    After   001          Rice Puffs    12 oz.        Bag              Grocery       Dry Goods    Snacks          ...
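A Type 1 change is a plain in-place update. The sketch below keeps the dimension as a dictionary keyed by product key; the function name and the underscore column names are assumptions for the example.

```python
def apply_type1(dimension, product_key, changes):
    """Overwrite the changed columns on the existing record."""
    dimension[product_key].update(changes)

product_dim = {
    "001": {
        "product_name": "Rice Puffs",
        "product_size": "10 oz.",
        "product_package": "Bag",
    }
}
apply_type1(product_dim, "001", {"product_size": "12 oz."})
```

The row count is unchanged and the old size is gone, so Type 1 only fits attributes whose history does not need to be reported.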

Type 2: Writing Another Dimension Record
  Adds a new record.
  Product Dimension
            product key  product name  product size  product package  product dept  product cat  product subcat  effective_date  ...
    Before  001          Rice Puffs    10 oz.        Bag              Grocery       Dry Goods    Snacks          ...
    After   001          Rice Puffs    10 oz.        Bag              Grocery       Dry Goods    Snacks          ...
            ...          Rice Puffs    12 oz.        Bag              Grocery       Dry Goods    Snacks          ...
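A Type 2 change can be sketched as appending a new version of the record. The slide shows only product key and effective_date; the separate product_id column (standing in for the application key), the integer surrogate keys, and the dates below are all assumptions for the example.

```python
from datetime import date

def apply_type2(rows, product_id, changes, effective_date):
    """Append a new version of the record; earlier versions stay untouched."""
    current = max(
        (r for r in rows if r["product_id"] == product_id),
        key=lambda r: r["effective_date"],
    )
    new_row = {
        **current,
        **changes,
        # New surrogate key: one more than the highest key in use.
        "product_key": max(r["product_key"] for r in rows) + 1,
        "effective_date": effective_date,
    }
    rows.append(new_row)
    return new_row

product_dim = [
    {
        "product_key": 1,
        "product_id": "RP-001",          # hypothetical application key
        "product_name": "Rice Puffs",
        "product_size": "10 oz.",
        "effective_date": date(2000, 1, 1),  # hypothetical date
    }
]
new_row = apply_type2(
    product_dim, "RP-001", {"product_size": "12 oz."}, date(2000, 6, 1)
)
```

Fact rows loaded before the change keep pointing at the old surrogate key, which is how Type 2 preserves history.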

Type 3: Adding Attributes in the Dimension Record
  Additional information is stored in an existing record.
  Product Dimension columns: product key, product name, product size, product package, product dept, product cat, product subcat, current product size date, previous product size, previous product size date, 2nd previous product size, 2nd previous product size date, ...
  Columns that change: product size, previous product size, previous product size date
  Before: 001, Rice Puffs, 10 oz., Bag, Grocery, Dry Goods, Snacks, ...
  After: 001, Rice Puffs, 12 oz., Bag, Grocery, Dry Goods, Snacks, ...
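A Type 3 change shifts the current value into the "previous" columns of the same record. The sketch below uses underscore versions of the column names listed on the slide; the function name and the dates are assumptions for the example.

```python
from datetime import date

def apply_type3(row, new_size, change_date):
    """Shift the size history one slot back, then store the new size."""
    row["second_previous_product_size"] = row["previous_product_size"]
    row["second_previous_product_size_date"] = row["previous_product_size_date"]
    row["previous_product_size"] = row["product_size"]
    row["previous_product_size_date"] = row["current_product_size_date"]
    row["product_size"] = new_size
    row["current_product_size_date"] = change_date

row = {
    "product_key": "001",
    "product_name": "Rice Puffs",
    "product_size": "10 oz.",
    "current_product_size_date": date(2000, 1, 1),  # hypothetical date
    "previous_product_size": None,
    "previous_product_size_date": None,
    "second_previous_product_size": None,
    "second_previous_product_size_date": None,
}
apply_type3(row, "12 oz.", date(2000, 6, 1))
```

The history depth is fixed by the number of "previous" columns, so older values fall off the end once every slot is filled.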

Verifying the Fact Table Source Data
  Verifying accuracy of source data
    Integrating data from multiple sources
    Applying business rules
    Checking structural requirements
  Managing invalid data
    Rejecting invalid data
    Saving invalid data to a log
  Correcting invalid data
    Transforming data
    Reassigning data values

Assigning Foreign Keys
  Dimension tables: customer_dim (201, ALFI, Alfreds), product_dim (Chai), time_dim
  Source data columns: customer id, product id, order date, quantity_sales, amount_sales
  Example source row: ALFI, 123, 1/1/2000, ...
  Sales fact data columns: cust_key, prod_key, time_key, quantity_sales, amount_sales
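Foreign key assignment amounts to replacing each source identifier with the surrogate key of the matching dimension record. In the sketch below the customer key 201 for ALFI comes from the slide; the product key, the time key, and the quantity and amount values are hypothetical, as is the function name.

```python
# Lookup tables built from the dimensions: application key -> surrogate key.
customer_keys = {"ALFI": 201}
product_keys = {123: 25}          # hypothetical surrogate key
time_keys = {"1/1/2000": 134}     # hypothetical surrogate key

def assign_foreign_keys(source_row):
    """Replace source identifiers with dimension surrogate keys."""
    return {
        "cust_key": customer_keys[source_row["customer_id"]],
        "prod_key": product_keys[source_row["product_id"]],
        "time_key": time_keys[source_row["order_date"]],
        "quantity_sales": source_row["quantity_sales"],
        "amount_sales": source_row["amount_sales"],
    }

source_row = {
    "customer_id": "ALFI",
    "product_id": 123,
    "order_date": "1/1/2000",
    "quantity_sales": 10,    # hypothetical measure values
    "amount_sales": 789,
}
fact_row = assign_foreign_keys(source_row)
```

A source identifier with no dimension record raises KeyError here, which makes it visible that the dimensions must be loaded before the fact table.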

Defining Measures
  Loading measures from the source system
  Calculating additional measures
  Source system data columns: customer_id (VINET, ALFI, HANAR, ...), product_id (9GZ, 1KJ, 0ZA, ...), price, qty
  Fact table data columns: customer_key, product_key, qty, total_sales
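The slide's columns suggest that total_sales is derived from price and qty; the sketch below assumes the simplest rule, price times quantity, which the transcript does not state explicitly. The price value is made up for the example.

```python
from decimal import Decimal

def total_sales(price, qty):
    """Calculated measure (assumed rule): unit price times quantity."""
    return Decimal(price) * qty

source_row = {"customer_id": "VINET", "product_id": "9GZ",
              "price": "18.00", "qty": 10}
measure = total_sales(source_row["price"], source_row["qty"])
```

Decimal is used instead of float so that currency amounts are not affected by binary rounding.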

Maintaining Data Integrity
  Adhering to the fact table grain
    A fact table can only have one grain
    You must load a fact table with data at the same level of detail as defined by the grain
  Enforcing column constraints
    NOT NULL constraints
    FOREIGN KEY constraints
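The constraints above can be tried out with SQLite from Python. This is a sketch under simplifying assumptions: the grain is taken to be one row per customer per product (the slides' Sales star also has a time dimension), the key values are hypothetical, and SQLite stands in for SQL Server. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY);
CREATE TABLE sales_fact (
    customer_key INTEGER NOT NULL REFERENCES customer_dim (customer_key),
    product_key  INTEGER NOT NULL REFERENCES product_dim (product_key),
    qty          INTEGER NOT NULL,
    total_sales  REAL NOT NULL,
    -- Assumed grain: one row per customer per product.
    PRIMARY KEY (customer_key, product_key)
);
INSERT INTO customer_dim VALUES (201);
INSERT INTO product_dim VALUES (25);
""")

def try_insert(row):
    """Return True if the fact row satisfies every constraint."""
    try:
        conn.execute("INSERT INTO sales_fact VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert((201, 25, 10, 180.0))
bad_foreign_key = try_insert((999, 25, 1, 1.0))   # no such customer
bad_null = try_insert((201, None, 1, 1.0))        # NOT NULL violated
bad_grain = try_insert((201, 25, 5, 90.0))        # second row at same grain
```

Using the dimension keys as a composite primary key is one way to let the database reject rows that repeat the grain.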

Implementing Staging Tables
  Centralize and integrate source data
  Break up complex data transformations
  Facilitate error recovery
  Staging area tables: sales_stage, inventory_stage, market_stage, shipments_stage

DTS Functionality
  Accessing heterogeneous data sources
  Importing, exporting, and transforming data
  Creating reusable transformations and functions
  Automating data loads
  Managing metadata
  Customizing and extending functionality

Defining DTS Packages
  Identifies data sources and destinations
  Defines tasks or actions
  Implements transformation logic
  Defines order of operations

Identifying Package Components
  Connections: access data sources and destinations
  Tasks: describe data transformations or functions
  Steps: define the order of task operations or workflow
  Global variables: store data that can be shared across tasks
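The four components can be pictured with a small Python model. This is a conceptual analogy only and does not use or reproduce the DTS Object Model; the Package class, its attribute names, and the two example tasks are hypothetical.

```python
class Package:
    """Toy model of a package: connections, ordered steps, shared variables."""

    def __init__(self, name):
        self.name = name
        self.connections = {}        # name -> data source or destination
        self.global_variables = {}   # data shared across tasks
        self.steps = []              # ordered (step_name, task) pairs

    def add_step(self, step_name, task):
        # A step places a task at a position in the workflow.
        self.steps.append((step_name, task))

    def execute(self):
        executed = []
        for step_name, task in self.steps:
            task(self.connections, self.global_variables)
            executed.append(step_name)
        return executed

def count_source_rows(connections, variables):
    variables["row_count"] = len(connections["source"])

def copy_rows(connections, variables):
    connections["destination"].extend(connections["source"])

package = Package("load_product_dim")
package.connections["source"] = ["Chai", "Rice Puffs"]
package.connections["destination"] = []
package.add_step("count", count_source_rows)
package.add_step("copy", copy_rows)
executed = package.execute()
```

Here the steps run strictly in the order they were added; the slides do not detail how DTS expresses ordering between steps.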

Creating Packages
  Using the DTS Import/Export Wizard
    Perform ad hoc table and data transfers
    Develop a prototype package
  Using DTS Package Designer
    Edit packages created with the DTS Import/Export Wizard
    Create packages with a wide range of functionality
  Programming DTS applications
    Directly access the functionality of the DTS Object Model
    Requires Microsoft Visual Basic or Microsoft Visual C++

Using DTS to Populate the Sales Star
  Populating the Sales Star dimensions
  Populating the Sales Star fact table

Populating the Sales Star Dimensions
  Diagram: DTS loads time_dim, customer_dim, and product_dim from these sources: product tab-delimited files, NorthwindOLTP, and a SQL Server stored procedure.

Populating the Sales Star Fact Table
  Diagram: DTS loads the sales data file into sales_stage, then loads sales_fact from sales_stage together with time_dim, customer_dim, and product_dim.
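The second half of that flow, from sales_stage to sales_fact, can be sketched as a join against the dimensions. The example uses SQLite from Python instead of DTS, leaves out time_dim for brevity, and uses hypothetical column names and rows (including the deliberately unmatched customer 'NOPE').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, customer_id TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_id INTEGER);
CREATE TABLE sales_stage (customer_id TEXT, product_id INTEGER, qty INTEGER);
CREATE TABLE sales_fact (customer_key INTEGER, product_key INTEGER, qty INTEGER);
INSERT INTO customer_dim VALUES (201, 'ALFI');
INSERT INTO product_dim VALUES (25, 123);
INSERT INTO sales_stage VALUES ('ALFI', 123, 10);
INSERT INTO sales_stage VALUES ('NOPE', 123, 4);
""")

# Load the fact table: inner joins keep only staged rows whose
# identifiers resolve to a surrogate key in every dimension.
conn.execute("""
INSERT INTO sales_fact (customer_key, product_key, qty)
SELECT c.customer_key, p.product_key, s.qty
FROM sales_stage AS s
JOIN customer_dim AS c ON c.customer_id = s.customer_id
JOIN product_dim AS p ON p.product_id = s.product_id
""")
fact_rows = conn.execute("SELECT * FROM sales_fact").fetchall()

# Staged rows with no matching customer stay in the stage for review.
unmatched = conn.execute("""
SELECT s.customer_id
FROM sales_stage AS s
LEFT JOIN customer_dim AS c ON c.customer_id = s.customer_id
WHERE c.customer_key IS NULL
""").fetchall()
```

Because the staged rows are not deleted, the load can be rerun after the missing dimension record is added, which is the error-recovery benefit of a staging table.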

Designing Modular Packages
  Creating modular packages
    Simplify complex workflows
    Create more readable packages
    Produce smaller packages that are easier to debug
  Using outer packages
    Execute multiple packages within a single package
    Combine modular packages into logical workflows
    Reuse modular packages in different workflows
    Execute packages in parallel
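The outer-package idea can be sketched in Python, with each modular package modeled as a plain function. This is an analogy and not DTS code: the function names are hypothetical, and a thread pool stands in for parallel package execution.

```python
from concurrent.futures import ThreadPoolExecutor

# Each modular package is modeled as a small function with one job.
def load_time_dim():
    return "time_dim loaded"

def load_customer_dim():
    return "customer_dim loaded"

def load_product_dim():
    return "product_dim loaded"

def load_sales_fact():
    return "sales_fact loaded"

def run_outer_package():
    """Run the dimension packages in parallel, then the fact package."""
    dimension_packages = [load_time_dim, load_customer_dim, load_product_dim]
    with ThreadPoolExecutor() as pool:
        # map() returns results in the order the packages were listed.
        results = list(pool.map(lambda package: package(), dimension_packages))
    # The fact load runs only after every dimension load has finished.
    results.append(load_sales_fact())
    return results

results = run_outer_package()
```

The dimension loads are independent of each other, so they are safe to run in parallel, while the fact load depends on all of them and therefore runs last.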

Using DTS to Populate the Sales Star