ISQS 6339, Data Management and Business Intelligence Cubism – Measures and Dimensions Zhangxi Lin Texas Tech University 1.

Outline
◦ Measures
◦ Where we've been
◦ Populating the fact table
◦ Types of dimensions

Structure and Components of Business Intelligence (diagram: SSMS, SSIS, SSAS, SSRS, SAS EM, SAS EG)

Snowflake Schema of the Data Mart (diagram): ManufacturingFact, DimProduct, DimProductSubType, DimProductType, DimBatch, DimMachine, DimMachineType, DimMaterial, DimPlant, DimCountry

Where we've been and where we are now
◦ Exercise 1: Getting started
◦ Exercise 2: Creating a data mart with SSMS
◦ Exercise 3: Creating a data mart with BIDS
◦ Exercise 4: Populating dimensions of a data mart
◦ Exercise 5: Loading fact tables
◦ Exercise 6: Create and customize a cube

What do we need to do with the half-done data mart?
◦ Populate the DimBatch dimension table
◦ Populate the ManufacturingFact table
◦ Build an OLAP cube (we already did this before)
◦ Check measures
◦ Check dimensions

MEASURES

Facts
◦ Facts are measurements associated with a specific business process.
◦ Many facts, including additive and semiadditive facts, can be derived from other facts.
◦ Non-additive facts can often be avoided by calculating them from additive facts.
◦ Measures are clustered together in a group called a measure group.

Types of measures
Three types
◦ Additive measures can be summed along every dimension. Most facts are additive.
◦ Semiadditive measures can be added along some dimensions but not along others. For example, inventory level can be added along the product dimension but not along the time dimension.
◦ Non-additive measures (such as a distinct count or an average) cannot be summed along any dimension. A factless fact table is purely descriptive and carries no numeric measure.
Aggregate functions in Analysis Services
◦ Additive: Sum, Count
◦ Semiadditive: ByAccount, FirstChild, FirstNonEmpty, LastChild, LastNonEmpty, Max, Min
◦ Nonadditive: DistinctCount, None
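
The difference between adding along the product dimension and along the time dimension can be illustrated with a small sketch. This is plain Python, not Analysis Services; the snapshot rows and function names are made up for illustration, and the "take the last period" rule only approximates what a LastNonEmpty-style aggregate does.

```python
from collections import defaultdict

# Hypothetical inventory snapshots: (month, product, inventory_level).
snapshots = [
    ("2015-01", "A", 10), ("2015-01", "B", 5),
    ("2015-02", "A", 7), ("2015-02", "B", 4),
]

def total_by_month(rows):
    """Across products the level is additive: sum it within each month."""
    totals = defaultdict(int)
    for month, _product, level in rows:
        totals[month] += level
    return dict(totals)

def closing_level(rows):
    """Across time the level is not additive: report the latest month's
    total instead of summing the months together."""
    per_month = total_by_month(rows)
    return per_month[max(per_month)]

print(total_by_month(snapshots))
print(closing_level(snapshots))
```

Summing the two monthly totals would report an inventory that never existed at any point in time, which is why the time dimension needs a different aggregate.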

Measures and dimensions
Dimensions are used to aggregate measures, so they must be related to the measures in some way.
Granularity
◦ Important for the analysis
◦ There could be missing values in the fact table

LOADING FACT TABLES

Exercise 5: Loading Fact Tables
Project name: MMMFactLoad-lastname
Package name: FactLoad.dtsx
Tasks
◦ Create Inventory Fact table
◦ Load Dim Batch
◦ Load Manufacturing Fact
◦ Load Inventory Fact
Deliverable: a screenshot of the “green” outcome of the ETL project, submitted with the subject title “ISQS 6339 EX5 - ”

Inventory Fact Table
Create a table InventoryFact in your database.
◦ Compound primary key: DateOfInventory, ProductCode, and Material
◦ Define two foreign keys
Column Name | Data Type | Allow Nulls
InventoryLevel | Int | No
NumberOnBackorder | Int | No
DateOfInventory | Datetime | No
ProductCode | Int | No
Material | Varchar(30) | No
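
The effect of the compound primary key can be tried out quickly in a throwaway database. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, so the type names are only approximations, and the two foreign keys are left out because the slide does not name the tables they reference. The sample rows and the helper function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE InventoryFact (
        InventoryLevel    INT         NOT NULL,
        NumberOnBackorder INT         NOT NULL,
        DateOfInventory   DATETIME    NOT NULL,
        ProductCode       INT         NOT NULL,
        Material          VARCHAR(30) NOT NULL,
        PRIMARY KEY (DateOfInventory, ProductCode, Material)
    )
""")

INSERT_SQL = "INSERT INTO InventoryFact VALUES (?, ?, ?, ?, ?)"

def insert_is_rejected(connection, values):
    """Return True if the row violates a constraint, False if it is stored."""
    try:
        connection.execute(INSERT_SQL, values)
        return False
    except sqlite3.IntegrityError:
        return True

# Made-up sample rows: the second repeats the full key, the third changes the date.
first_rejected = insert_is_rejected(conn, (100, 0, "2015-01-31", 1, "Clay"))
duplicate_rejected = insert_is_rejected(conn, (50, 2, "2015-01-31", 1, "Clay"))
new_date_rejected = insert_is_rejected(conn, (50, 2, "2015-02-28", 1, "Clay"))
print(first_rejected, duplicate_rejected, new_date_rejected)
```

A row is rejected only when all three key columns match an existing row, so the same product and material can appear once per inventory date.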

Data Sources for Loading Fact
For loading the DimBatch table and the ManufacturingFact table
◦ BatchInfo.CSV
For loading the InventoryFact table
◦ Lin.OrderProcessingSystem database

Control Flow for Loading Facts and the Remaining Dimension
Note: to ease debugging, you may use three packages and test them one by one instead of doing everything in one package.

Flat File Connection
Data types
◦ BatchNumber, MachineNumber: four-byte signed integer [DT_I4]
◦ ProductCode, NumberProduced, NumberRejected: four-byte signed integer [DT_I4]
◦ TimeStarted, TimeStopped: database timestamp [DT_DBTimeStamp]
Only BatchNumber is checked as the input for Dim Batch
All columns are needed for the fact tables

Some Frequently Used Nodes

Load DimBatch Data Flow

Load DimBatch Data Flow. Note: because of duplicate rows in the source file, we may insert an Aggregate item after the Flat File Source item.

The Flat File Source

Sort Transformation
◦ In the Aggregate item, define “Group by” on BatchNumber.
◦ In the Derived Column item, define BatchName from BatchNumber.
◦ Use the expression (DT_WSTR, 50)[BatchNumber] to convert BatchName to a string data type.
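
What the Aggregate and Derived Column items do to the batch numbers can be mimicked outside SSIS. The following is a plain Python sketch, not an SSIS package: the function name and sample values are hypothetical, and the sorting is an added assumption made only so the output is predictable.

```python
def load_dim_batch(batch_numbers):
    """Mimic the two data flow steps for DimBatch.

    Group by BatchNumber removes duplicates (the Aggregate item), and
    BatchName is the text form of BatchNumber (the Derived Column item,
    comparable to casting with (DT_WSTR, 50)).
    """
    unique_numbers = sorted(set(batch_numbers))
    return [{"BatchNumber": n, "BatchName": str(n)} for n in unique_numbers]

# Made-up batch numbers with duplicates, as in the source file.
dim_batch_rows = load_dim_batch([101, 102, 101, 103, 102])
print(dim_batch_rows)
```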

Load Fact Data Flow

Derived Columns for the Fact table

Expressions for the Derived Columns
AcceptedProducts
◦ [NumberProduced] - [NumberRejected]
ElapsedTimeForManufacture
◦ DATEDIFF("mi", [TimeStarted], [TimeStopped])
DateOfManufacture
◦ (DT_DBTIMESTAMP)SUBSTRING((DT_WSTR,25)[TimeStarted],1,10)
◦ This expression converts TimeStarted into a string and selects the first ten characters of that string. The string is then converted back into a date-time value with the time portion reset to midnight.
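
To check the derived columns by hand, the same three calculations can be written in plain Python. This is only an illustration with made-up input values, not SSIS expression syntax. It assumes DATEDIFF counts the minute boundaries crossed, so both timestamps are truncated to the minute before subtracting.

```python
from datetime import datetime

def derive_fact_columns(number_produced, number_rejected, time_started, time_stopped):
    """Compute the three derived columns for one source row."""
    accepted = number_produced - number_rejected

    # Assumption: DATEDIFF("mi", ...) counts minute boundaries crossed,
    # so seconds are dropped before taking the difference.
    start_minute = time_started.replace(second=0, microsecond=0)
    stop_minute = time_stopped.replace(second=0, microsecond=0)
    elapsed_minutes = int((stop_minute - start_minute).total_seconds() // 60)

    # Keep the date and reset the time portion to midnight.
    date_of_manufacture = time_started.replace(hour=0, minute=0, second=0, microsecond=0)

    return {
        "AcceptedProducts": accepted,
        "ElapsedTimeForManufacture": elapsed_minutes,
        "DateOfManufacture": date_of_manufacture,
    }

derived = derive_fact_columns(
    500, 20,
    datetime(2015, 3, 1, 8, 15, 30),
    datetime(2015, 3, 1, 10, 45, 10),
)
print(derived)
```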

OLE DB Destination for loading the fact table

Load Inventory Fact
OLE DB Source
◦ OrderProcessingSystem.InventoryFact
OLE DB Destination
◦ MaxMinManufacturingDM-lastname.InventoryFact
No transformation
There are two ways to load the table
◦ Create the table and use ETL to load it
◦ Import directly from the source into the database MaxMinManufacturingDM-lastname

Debugging Results (screenshots): loading DimBatch and loading ManufacturingFact

BUILDING AN OLAP CUBE

Exercise 6: Design a Cube
Project name: ISQS6339_EX6_2015_lastname
Tasks
◦ Add new date items (year, quarter, and month) to the two fact tables
◦ Create a time dimension using the Manufacturing Fact table
◦ Define calculated measures (Total Products, Percent Rejected)
◦ Define hierarchies of attributes in dimension tables
◦ Create a cube from the MaxMinManufacturing data mart with a hierarchical date dimension
Deliverable
◦ Screenshots: dimension hierarchies, dimensions, relationships of facts and dimensions, deployment result, format of measures, and browsing results.

Three Steps to Create a Cube from Data Sources
Defining a data source
Defining a data source view
◦ Add three new columns (year, quarter, and month) to the two fact tables
Building a cube
◦ Define a new dimension Dim Time from the Manufacturing Fact table
Customize the cube
◦ Link the two fact tables in a cube
◦ Define a new primary key for Dim Time
◦ Define calculated measures
◦ Relate dimensions to measures

T-SQL Expressions for DS View Definition: Manufacture
YearOfManufacture
◦ CONVERT(char(4), YEAR(DateOfManufacture))
QuarterOfManufacture
◦ CONVERT(char(4), YEAR(DateOfManufacture)) + CASE WHEN MONTH(DateOfManufacture) BETWEEN 1 AND 3 THEN 'Q1' WHEN MONTH(DateOfManufacture) BETWEEN 4 AND 6 THEN 'Q2' WHEN MONTH(DateOfManufacture) BETWEEN 7 AND 9 THEN 'Q3' ELSE 'Q4' END
MonthOfManufacture
◦ CONVERT(char(4), YEAR(DateOfManufacture)) + RIGHT('0' + CONVERT(varchar(2), MONTH(DateOfManufacture)), 2)

T-SQL Expressions for DS View Definition: Inventory
YearOfInventory
◦ CONVERT(char(4), YEAR(DateOfInventory))
QuarterOfInventory
◦ CONVERT(char(4), YEAR(DateOfInventory)) + CASE WHEN MONTH(DateOfInventory) BETWEEN 1 AND 3 THEN 'Q1' WHEN MONTH(DateOfInventory) BETWEEN 4 AND 6 THEN 'Q2' WHEN MONTH(DateOfInventory) BETWEEN 7 AND 9 THEN 'Q3' ELSE 'Q4' END
MonthOfInventory
◦ CONVERT(char(4), YEAR(DateOfInventory)) + RIGHT('0' + CONVERT(varchar(2), MONTH(DateOfInventory)), 2)
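
The year, quarter, and month strings produced by these named calculations can be checked against a few dates with a short Python sketch. It is a mirror of the string logic only, not T-SQL; the function name and sample dates are made up.

```python
from datetime import date

def date_parts(d):
    """Return (year, quarter, month) strings in the same shape as the
    named calculations, e.g. '2015', '2015Q3', '201508'."""
    year = f"{d.year:04d}"
    quarter = f"{year}Q{(d.month - 1) // 3 + 1}"  # months 1-3 -> Q1, 4-6 -> Q2, ...
    month = f"{year}{d.month:02d}"                # zero-padded like RIGHT('0' + ..., 2)
    return year, quarter, month

august_parts = date_parts(date(2015, 8, 14))
january_parts = date_parts(date(2015, 1, 5))
print(august_parts, january_parts)
```

Prefixing the quarter and month with the year keeps the values unique across years, which matters once they become levels in the date hierarchy.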

Data Source View (screenshot: new columns)

Select Measures Page: uncheck Manufacture Fact Count

The finished cube

Cube Structure

Defining a format string

Inventory measures: “Number on Backorder” is also set with these two parameters

Calculated measures: made-up facts
◦ The definition of a calculated measure is stored in the OLAP cube itself.
◦ The actual values of a calculated measure are not computed, however, until a query containing that calculated measure is executed.
◦ The results of that calculation are then cached in the cube, and the cached value is delivered to any subsequent users requesting the same calculation.
◦ The calculation expressions are written in Multidimensional Expressions (MDX) script.
◦ MDX is different from T-SQL. It is a special-purpose language with features designed for the advanced mathematics and formulas required by OLAP analysis, which T-SQL lacks.
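
The "store the definition, compute on first query, then serve from cache" behavior can be pictured with a small Python sketch. This is a conceptual model only, not how Analysis Services is implemented and not MDX. The class, the cell keys, and the Percent Rejected formula (rejected divided by accepted plus rejected) are assumptions made for illustration.

```python
class CalculatedMeasure:
    """Holds a formula and evaluates it lazily, caching each result."""

    def __init__(self, formula):
        self.formula = formula   # the stored definition
        self._cache = {}         # results of earlier queries
        self.evaluations = 0     # how many times the formula actually ran

    def query(self, cell_key, cell_values):
        if cell_key not in self._cache:
            self.evaluations += 1
            self._cache[cell_key] = self.formula(cell_values)
        return self._cache[cell_key]

def percent_rejected(cell):
    """Assumed formula: rejected / (accepted + rejected)."""
    total = cell["accepted"] + cell["rejected"]
    return cell["rejected"] / total if total else None

measure = CalculatedMeasure(percent_rejected)
cell = {"accepted": 480, "rejected": 20}        # made-up cell values
first = measure.query(("2015Q1", "Plant 1"), cell)
second = measure.query(("2015Q1", "Plant 1"), cell)
print(first, second, measure.evaluations)
```

The second query returns the same value without running the formula again, which is the point of caching the result after the first request.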

DIMENSIONS in SQL Server

Types of Dimensions
◦ Fact dimensions: dimensions created from attributes in a fact table.
◦ Parent-child dimensions: built on a table containing a self-referential relationship, such as a parent attribute.
◦ Role-playing dimensions: related to the same measure group multiple times, with each relationship representing a different role the dimension plays. For example, a time dimension can play three different roles: date of sale, date of shipment, and date of payment. To create a role-playing dimension, add the dimension to the Dimension Usage tab multiple times, then create a relationship between each instance of the dimension and the measure group.
◦ Reference dimensions: not related directly to the measure group but to another regular dimension, which in turn is related to the measure group.
◦ Data mining dimensions: built from the information discovered by data mining.
◦ Many-to-many dimensions: e.g., multiple ship-to addresses.
◦ Slowly changing dimensions

Slowly changing dimensions
◦ Type 1 SCD: no history is tracked.
◦ Type 2 SCD: tracks the entire history by adding four attributes: SCD Original ID, SCD Start Date, SCD End Date, and SCD Status.
◦ Type 3 SCD: similar to Type 2 SCD but tracks only the current state and the original state, with two additional attributes: SCD Start Date and SCD Initial Value.
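
The contrast between Type 1 and Type 2 is easiest to see on a single changed attribute. The sketch below is a plain Python illustration under stated assumptions: the dictionary keys follow the four Type 2 attributes listed above, while the status values "Current" and "Expired", the function names, and the sample data are hypothetical.

```python
from datetime import date

def apply_type1(row, new_value):
    """Type 1: overwrite the attribute, so the old value is lost."""
    updated = dict(row)
    updated["value"] = new_value
    return updated

def apply_type2(history, original_id, new_value, change_date):
    """Type 2: close the current row and append a new current row."""
    result = []
    for row in history:
        row = dict(row)  # copy so the input list is left untouched
        if row["scd_original_id"] == original_id and row["scd_status"] == "Current":
            row["scd_end_date"] = change_date
            row["scd_status"] = "Expired"
        result.append(row)
    result.append({
        "scd_original_id": original_id,
        "value": new_value,
        "scd_start_date": change_date,
        "scd_end_date": None,
        "scd_status": "Current",
    })
    return result

history = [{
    "scd_original_id": 7, "value": "Plant A",
    "scd_start_date": date(2014, 1, 1), "scd_end_date": None,
    "scd_status": "Current",
}]
after_type2 = apply_type2(history, 7, "Plant B", date(2015, 6, 1))
after_type1 = apply_type1({"id": 7, "value": "Plant A"}, "Plant B")
print(len(after_type2), after_type1)
```

Type 2 grows the dimension table by one row per change, which is the price of being able to report facts against the attribute value that was in effect at the time.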

Add a time dimension (a fact dimension)

Rename time dimension

Date Hierarchy

Material Hierarchy & Plant Hierarchy

Product Hierarchy

Relating Dimensions in the Cube