Published by Annis Silvia Hopkins. Modified over 9 years ago.
1
Atlanta Microsoft Database Forum: Introduction to Data Warehousing Concepts. Presented by Brian Thomas, Solution Builders, Inc. (Brian.Thomas@SolutionBuilders.com), March 8, 2004.
2
What is a Data Warehouse? Data collected from one or many systems that exist within and outside the organization. The data is structured to reduce the time it takes to produce reliable information.
3
Why Build a Data Warehouse? To provide a consistent, common source for corporate information; to store large volumes of historical detail data from mission-critical applications; to improve the ability to access, report against, and analyze information; to solve or improve upon business processes.
4
Turning Data into Information: Functional Data Warehouse. [Diagram: Sales System, System Generated Reports.] Sales analysis is extrapolated from the system reports.
5
Turning Data into Information: Functional Data Warehouse. [Diagram: Sales System, Functional Data Warehouse of Sales Information.] Sales information is available to a wider audience of decision makers.
6
Turning Data into Information: Cross Organizational Functional Data Warehouse. [Diagram: Sales Systems for Division A, Division B, and Division C; Centralized Data Warehouse of Sales Data from across the organization.] Analysis is performed and decisions are drawn from the cross-organizational sales data.
7
Turning Data into Information: Cross Functional Data Warehouse. [Diagram: Sales System, Production Systems, Marketing System, System Generated Reports.] Corporate performance analysis is extrapolated from the system reports.
8
Turning Data into Information: Cross Functional Data Warehouse. [Diagram: Sales System, Production Systems, Marketing System, Cross Functional Data Warehouse of Information.] Corporate performance analysis is available to a wider audience.
9
Turning Data into Information: Cross Organizational & Cross Functional Data Warehouse. [Diagram: Division A, Division B, Division C; Centralized Cross Functional Data Warehouse of Information.] Analysis is performed and decisions are made from the cross-functional organizational performance data.
10
Data Warehouse Architecture. [Diagram, components as labeled on the slide:]
Source Systems: Division A, Division B, Division C, External Data
Extraction, Transformation, Load (ETL)
Data Warehouse Components: Enterprise Data Warehouse and DW / DM structures spanning the Corporate, Business Group, and Divisional levels, which trade an increased level of standardization against increased local specifications
Data Access & Query Management Services
Management Systems: Planning & Forecasting; Performance Management; Scorecards & Dashboards; Analytics & Modeling; Query & Reporting
Access Methods: Portal / Web Interface; Desktop Applications; Printed Reports; Email; Mobile Devices
11
Data Warehouse Architecture. [Diagram: Source Systems (Division A, Division B, Division C, External Data); Extract, Transformation and Load (ETL); Data Staging Area; Data Warehouse Repository.]
12
Data Warehouse Architecture: Data Staging Area. Subject area oriented; data structure more closely mirrors operational system data layouts; supports identification of changed data; acts as a working area to support the transformation process.
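One way a staging area can support identification of changed data is to compare the latest extract against the previous one. The following is a minimal Python sketch under stated assumptions, not the presenter's method: the snapshot layout, the customer rows, and the function names `row_fingerprint` and `detect_changes` are hypothetical and chosen only for illustration.

```python
import hashlib


def row_fingerprint(row: dict) -> str:
    """Hash a row's values in a stable column order so rows can be compared cheaply."""
    payload = "|".join(f"{key}={row[key]}" for key in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def detect_changes(previous: dict, current: dict) -> dict:
    """Compare two staging snapshots keyed by the source system's primary key."""
    inserted = [key for key in current if key not in previous]
    deleted = [key for key in previous if key not in current]
    updated = [
        key
        for key in current
        if key in previous
        and row_fingerprint(current[key]) != row_fingerprint(previous[key])
    ]
    return {
        "inserted": sorted(inserted),
        "updated": sorted(updated),
        "deleted": sorted(deleted),
    }


# Hypothetical snapshots of the same source table on two consecutive loads.
yesterday = {
    101: {"customer": "Acme", "city": "Atlanta"},
    102: {"customer": "Globex", "city": "Denver"},
    103: {"customer": "Initech", "city": "Dallas"},
}
today = {
    101: {"customer": "Acme", "city": "Atlanta"},
    102: {"customer": "Globex", "city": "Chicago"},   # city changed
    104: {"customer": "Umbrella", "city": "Dallas"},  # new row
}

changes = detect_changes(yesterday, today)
print(changes)
```

Only the keys reported as inserted or updated would need to pass through the transformation step; unchanged rows can be skipped.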
13
Data Warehouse Architecture: Extraction, Transformation & Load (ETL). Perform attribute standardization and cleansing; apply business rules and calculations; consolidate using matching and merge / purge logic; ensure proper linking and tracking of history.
14
Data Warehouse Architecture: Extraction, Transformation & Load (ETL). [Diagram: source values from four applications are standardized into one warehouse value by a transformation function.]
Lookup Function: App. A: Male, Female; App. B: 1, 0; App. C: x, y; App. D: m, f. Result: Male, Female
Conversion Function: App. A: pipeline (cm); App. B: pipeline (inches); App. C: pipeline (mcf); App. D: pipeline (yds). Result: pipeline (cm)
Formatting Function: App. A: Date (julian); App. B: Date (yyyymmdd); App. C: Date (mm/dd/yyyy); App. D: Date (absolute). Result: Date (julian)
Merging Function: App. A: Description; App. B: Description; App. C: Description; App. D: Description. Result: Description
Mapping Function: App. A: balance on hand; App. B: current balance; App. C: cash in house; App. D: balance. Result: Balance
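The lookup, conversion, formatting, and mapping functions on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the presenter's implementation. Several details are assumptions: that the first code of each pair means Male (the slide does not say), that "julian" means a year plus day-of-year string such as 2004068, and all function and table names. The mcf and "absolute" formats are left out because the slide does not define them.

```python
from datetime import datetime

# Lookup: per-application gender codes (assumption: first code listed means Male).
GENDER_LOOKUP = {
    "A": {"Male": "Male", "Female": "Female"},
    "B": {"1": "Male", "0": "Female"},
    "C": {"x": "Male", "y": "Female"},
    "D": {"m": "Male", "f": "Female"},
}

# Conversion: length units to centimeters (mcf omitted, it is not a length unit).
CM_PER_UNIT = {"cm": 1.0, "inches": 2.54, "yds": 91.44}

# Mapping: each application's name for the balance attribute.
BALANCE_FIELD = {
    "A": "balance on hand",
    "B": "current balance",
    "C": "cash in house",
    "D": "balance",
}


def lookup_gender(app: str, code: str) -> str:
    """Translate an application-specific gender code to the warehouse value."""
    return GENDER_LOOKUP[app][code]


def to_cm(value: float, unit: str) -> float:
    """Convert a pipeline length to centimeters."""
    return round(value * CM_PER_UNIT[unit], 2)


def to_julian(text: str, source_format: str) -> str:
    """Reformat a date string as year plus day-of-year (assumed meaning of 'julian')."""
    return datetime.strptime(text, source_format).strftime("%Y%j")


def map_balance(app: str, record: dict) -> dict:
    """Rename the application's balance field to the warehouse name."""
    return {"balance": record[BALANCE_FIELD[app]]}


print(lookup_gender("B", "1"))
print(to_cm(10, "inches"), to_cm(2, "yds"))
print(to_julian("20040308", "%Y%m%d"), to_julian("03/08/2004", "%m/%d/%Y"))
print(map_balance("C", {"cash in house": 99.5}))
```

Keeping the per-application rules in lookup tables rather than in branching code means a new source system can usually be added by extending the tables.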
15
Data Warehouse Architecture: Data Warehouse Repository. Organized around conformed dimensions and facts; promotes usability and intuitiveness; consolidated and cross-functional; historical and atomic representation of data; insulated from source system modifications and additions.
16
Data Warehouse Repository: Star Schema Concepts. Fact Table: This table is the core of the Star Schema structure and contains the Facts or Measures available through the Data Warehouse. These Facts answer the questions of “What”, “How Much”, or “How Many”. Some examples: Sales Dollars, Units Sold, Gross Profit, Expense Amount, Net Income, Unit Cost, Number of Employees, Turnover, Salary, Tenure, etc.
17
Data Warehouse Repository: Star Schema Concepts. Dimension Tables: These tables describe the Facts or Measures. They contain the Attributes and may also be hierarchical. These Dimensions answer the questions of “Who”, “What”, “When”, or “Where”. Some examples: Day, Week, Month, Quarter, Year; Sales Person, Sales Manager, VP of Sales; Product, Product Category, Product Line; Cost Center, Unit, Segment, Business, Company.
18
Data Warehouse Repository: Star Schema Concepts. [Diagram: one fact table surrounded by five dimension tables.]
Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, Required Data (Business Metrics or Measures)
Time_Dim: TimeKey, TheDate, ...
Employee_Dim: EmployeeKey, EmployeeID, ...
Product_Dim: ProductKey, ProductID, ...
Customer_Dim: CustomerKey, CustomerID, ...
Shipper_Dim: ShipperKey, ShipperID, ...
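A small runnable sketch may help show how the fact table and dimension tables work together. It uses Python's built-in sqlite3 module rather than SQL Server, keeps only two of the five dimensions, and adds columns the slide does not show (`Quarter`, `ProductName`, `SalesDollars`, `UnitsSold`) as assumptions for illustration. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one row per member, identified by a surrogate key.
CREATE TABLE Time_Dim (
    TimeKey INTEGER PRIMARY KEY,
    TheDate TEXT,
    Quarter TEXT          -- assumed attribute, not shown on the slide
);
CREATE TABLE Product_Dim (
    ProductKey INTEGER PRIMARY KEY,
    ProductID TEXT,
    ProductName TEXT      -- assumed attribute, not shown on the slide
);
-- Fact table: foreign keys to each dimension plus the measures.
CREATE TABLE Sales_Fact (
    TimeKey INTEGER REFERENCES Time_Dim(TimeKey),
    ProductKey INTEGER REFERENCES Product_Dim(ProductKey),
    SalesDollars REAL,
    UnitsSold INTEGER
);
INSERT INTO Time_Dim VALUES (1, '2004-01-15', 'Q1'), (2, '2004-04-20', 'Q2');
INSERT INTO Product_Dim VALUES (1, 'P-APL', 'Apples'), (2, 'P-CHR', 'Cherries');
INSERT INTO Sales_Fact VALUES
    (1, 1, 100.0, 10),
    (1, 2, 50.0, 5),
    (2, 1, 75.0, 7),
    (1, 1, 25.0, 3);
""")

# "How much" (the facts) broken down by "when" and "what" (the dimensions).
rows = conn.execute("""
    SELECT t.Quarter, p.ProductName, SUM(f.SalesDollars), SUM(f.UnitsSold)
    FROM Sales_Fact AS f
    JOIN Time_Dim AS t ON t.TimeKey = f.TimeKey
    JOIN Product_Dim AS p ON p.ProductKey = f.ProductKey
    GROUP BY t.Quarter, p.ProductName
    ORDER BY t.Quarter, p.ProductName
""").fetchall()

for row in rows:
    print(row)
```

Every analytical query follows the same shape: join the fact table to the dimensions of interest, group by dimension attributes, and aggregate the measures.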
19
Data Warehouse Repository: Cube Concepts. [Figure: a cube whose three axes are the Product Dimension (Apples, Cherries, Grapes, Melons), the Time Dimension (Q1, Q2, Q3, Q4), and the Markets Dimension (Atlanta, Chicago, Denver, Dallas).]
20
Data Warehouse Repository: Cube Concepts. [Figure: the same cube of Product Dimension (Apples, Cherries, Grapes, Melons), Time Dimension (Q1, Q2, Q3, Q4), and Markets Dimension (Atlanta, Chicago, Denver, Dallas), with a Sales Fact cell highlighted.]
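A cube cell can be thought of as a fact value addressed by one member from each dimension. The sketch below models that idea with a plain Python dictionary. The sales figures, the `AXES` tuple, and the `slice_cube` helper are hypothetical; a real OLAP engine stores and indexes cells very differently.

```python
# Each key is (product, quarter, market); each value is the Sales fact for that cell.
sales_cube = {
    ("Apples", "Q1", "Atlanta"): 120,
    ("Apples", "Q2", "Atlanta"): 90,
    ("Apples", "Q1", "Denver"): 60,
    ("Grapes", "Q1", "Atlanta"): 40,
    ("Melons", "Q3", "Dallas"): 75,
}

AXES = ("product", "quarter", "market")


def slice_cube(cube: dict, **fixed: str) -> int:
    """Sum every cell whose coordinates match the fixed dimension members."""
    total = 0
    for coords, value in cube.items():
        cell = dict(zip(AXES, coords))
        if all(cell[axis] == member for axis, member in fixed.items()):
            total += value
    return total


# A single cell: all three dimensions fixed.
one_cell = slice_cube(sales_cube, product="Apples", quarter="Q1", market="Atlanta")
# A slice: only the product fixed, summed over every quarter and market.
apples_total = slice_cube(sales_cube, product="Apples")
# Two dimensions fixed, summed over every product.
q1_atlanta = slice_cube(sales_cube, quarter="Q1", market="Atlanta")

print(one_cell, apples_total, q1_atlanta)
```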
21
Data Warehouse Repository: Storage Concepts.
Relational On-Line Analytical Processing (ROLAP): The information stored in the Data Warehouse is held in a relational structure. Aggregations are performed on the fly, either by the database or in the analysis tool.
Multidimensional On-Line Analytical Processing (MOLAP): The information is aggregated in a predefined manner based on the characteristics of the Measures and the defined hierarchy of the Dimensions. Since the data is pre-aggregated, navigating through the hierarchies is effectively instantaneous. The user is simply navigating to a point within the Multidimensional Cube and not performing any on-the-fly aggregations.
Hybrid On-Line Analytical Processing (HOLAP): This is a combination of MOLAP and ROLAP. A portion of the data is predefined and aggregated, typically the set of information that is accessed most frequently. Additional detail can be held in a ROLAP structure, allowing a user to drill through the MOLAP structure into the ROLAP structure.
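The difference between aggregating on the fly and navigating to a pre-aggregated point can be illustrated with a toy Python sketch. The detail rows and function names below are hypothetical, and `None` is used here to mean "all members" of a dimension. Real ROLAP and MOLAP engines are far more sophisticated, so treat this only as a picture of the trade-off.

```python
from collections import defaultdict
from itertools import product as cartesian

# Hypothetical detail rows: (year, quarter, product, sales amount).
detail_rows = [
    ("2004", "Q1", "Apples", 100),
    ("2004", "Q1", "Grapes", 40),
    ("2004", "Q2", "Apples", 75),
    ("2004", "Q2", "Apples", 25),
]


def rolap_query(rows, year=None, quarter=None, product=None):
    """ROLAP style: scan the detail rows and aggregate at query time."""
    return sum(
        amount
        for y, q, p, amount in rows
        if (year is None or y == year)
        and (quarter is None or q == quarter)
        and (product is None or p == product)
    )


def build_molap_cube(rows):
    """MOLAP style: pre-compute a total for every combination of members and 'all'."""
    cube = defaultdict(int)
    for y, q, p, amount in rows:
        for key in cartesian((y, None), (q, None), (p, None)):
            cube[key] += amount
    return cube


def molap_query(cube, year=None, quarter=None, product=None):
    """Navigate straight to the pre-aggregated cell; no scan of the detail rows."""
    return cube.get((year, quarter, product), 0)


cube = build_molap_cube(detail_rows)

print(rolap_query(detail_rows, quarter="Q2", product="Apples"))
print(molap_query(cube, quarter="Q2", product="Apples"))
print(len(detail_rows), len(cube))  # the cube holds more cells than there are detail rows
```

Both queries return the same answer. The cube answers with a single dictionary lookup but stores more cells than the detail table has rows, which mirrors the query performance versus storage consumption trade-off described for the three storage modes.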
22
Data Warehouse Repository: Cube Concepts (client perspective)
                     MOLAP     HOLAP    ROLAP
Query performance    Fastest   Faster   Fast
Storage consumption  High      Medium   Low
23
Where does Microsoft fit in? [Diagram: the Data Warehouse Architecture diagram (Source Systems; Extraction, Transformation, Load (ETL); Data Warehouse Components; Data Access & Query Management Services; Management Systems; Access Methods), annotated with the following Microsoft products, in the order shown on the slide:]
SQL Server DTS
SQL Server Relational Database and Analysis Services
SQL Stored Procedures, SQL Views, MDX, and .NET Web Services
Microsoft Office, Reporting Services, and .NET Framework
SharePoint Portal, Exchange, and .NET Framework
24
Q & A