Programming Patterns with BISM Tabular


1 Programming Patterns with BISM Tabular
Alberto Ferrari, Senior Consultant at SQLBI.COM

2 Who’s Speaking?
Independent consultant
Book author
Founder of SQLBI.COM
Fond of BI… I love it!

3 Traditional Cube Development
OLTP → Data Warehouse → Data Mart → OLAP Cube (MDX Script) → MDX Queries → Excel

4 User Requirements Change…
I need a simple report
I want to analyze Sales
Now I need Time Intelligence
Well, how many new customers each month?
Oh, am I losing or gaining market share?
Listen… are my customers switching behavior?

5 Model Complexity Increasing
[Chart: Data Model Complexity plotted against User Requirements, with regions labeled Dimensional Modeling and Custom Modeling]

6 Why Dimensional Modeling?
Users understand it
Fast data model
Consumed by Excel
Custom Dimensional Modeling?

7 Vertipaq: a new kid in town
OLAP (BISM Multidimensional): Dimensional Modeling; Facts, Dimensions; Complex Relationships; MDX Script; Powerful, complex
Vertipaq (BISM Tabular): Relational Modeling; Tables; Basic Relationships (1:N); DAX; Calculated Columns; Measures

8 Vertipaq
Simpler data model
Less dimensional tools
Amazingly fast
Relational modeling
Speed → Different Modeling Options?

9 Tabular Data Models… this way?
OLTP → Data Warehouse → Data Mart → Vertipaq Database → DAX → Excel

10 Or this way?
OLTP → Data Warehouse → Vertipaq Database → DAX → Excel

11 Agenda
Scenarios: Warehouse Stock Analysis, Customer Analysis, Transition Matrix, Pareto / ABC Analysis
Think in DAX, forget dimensional modeling
NO DEMOS

12 Warehouse Stock Analysis
Calculate Warehouse Availability from transactions

13 Warehouse Data Model

14 MDX Query
    WITH MEMBER MEASURES.Stock AS
        SUM (
            NULL : [Date Order].[Calendar].CURRENTMEMBER,
            [Measures].[Quantity]
        )
    SELECT
        MEASURES.Stock ON 0,
        NON EMPTY
            [Product].[Product].[Product].MEMBERS
            * [Date Order].[Calendar].[Month].MEMBERS ON 1
    FROM [Movements]

15 Snapshot Table

16 SQL Query for Snapshot
    SELECT t.TimeKey, t.FullDateAlternateKey, s1.*
    FROM dbo.DimTime t
    CROSS APPLY (
        SELECT s.ProductKey, SUM (s.Quantity) AS Stock
        FROM dbo.FactMovements s
        WHERE s.OrderDateKey <= t.TimeKey
        GROUP BY s.ProductKey
    ) s1
    WHERE t.DayNumberOfMonth = 1

17 SQL Query Plan
Each date → Full Fact Table Scan

18 DAX Query
    Stock :=
    CALCULATE (
        SUM ( FactMovements[OrderQuantity] ),
        FILTER (
            ALL ( DimTime ),
            DimTime[TimeKey] <= MAX ( DimTime[TimeKey] )
        )
    )
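To make the running-total idea concrete, here is a minimal Python sketch of what the Stock measure computes: for a given period, sum every movement dated on or before the last day of that period, with no snapshot table involved. The `movements` rows and the `stock` helper are hypothetical illustrations, not data or code from the talk.

```python
from datetime import date

# Hypothetical movements (date, product, signed quantity); not data from the talk.
movements = [
    (date(2011, 1, 10), "bike", 10),
    (date(2011, 1, 25), "bike", -3),
    (date(2011, 2, 5), "bike", -2),
    (date(2011, 2, 20), "helmet", 7),
    (date(2011, 3, 1), "bike", 5),
]


def stock(product, period_end):
    """Sum all movements of `product` dated on or before `period_end`.

    This mirrors replacing the date filter with "every date <= the last
    selected date", which is what the FILTER over ALL (DimTime) expresses.
    """
    return sum(q for d, p, q in movements if p == product and d <= period_end)


# Stock of bikes at the end of January, February and March 2011.
print(stock("bike", date(2011, 1, 31)),
      stock("bike", date(2011, 2, 28)),
      stock("bike", date(2011, 3, 31)))  # 7 5 10
```

Every call rescans the movements, which is the same trade-off the slides describe: the approach is only practical when scanning the fact table is cheap.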

19 Vertipaq Solution
No Snapshot Table
Leverages Only Movements
Amazingly Fast
Speed → Different Modeling Options

20 Customer Analysis
Count new and lost customers

21 Customer Analysis

22 Customer Analysis
How many new customers did we have this month?
Dimensional model: complex and slow query (either SQL or MDX)
Slowness is caused by looking for the lack of an event in a dimensional model
Optimization: monthly snapshot of new customers
Possible shortcut: saving the date of first and last sale for each customer
Extraction logic embedded in ETL, so not flexible
(Speaker note: show possible queries in MDX and SQL, 01 Customer Retention)

23 Snapshot Required

24 Customer Analysis
NOTE: in this scenario, UPDATE PROCESS on dimension is required

25 MDX Query
    WITH
    MEMBER MEASURES.TotalCustomers AS
        AGGREGATE (
            NULL : [Date].[Calendar].CURRENTMEMBER,
            [Measures].[Customer Count]
        )
    MEMBER MEASURES.NewCustomers AS
        MEASURES.TotalCustomers
        - AGGREGATE (
            NULL : [Date].[Calendar].PREVMEMBER,
            [Measures].[Customer Count]
        )
    MEMBER MEASURES.ReturningCustomers AS
        MEASURES.[Customer Count] - MEASURES.NewCustomers
    SELECT
        {
            [Measures].[NewCustomers],
            [Measures].[Customer Count],
            [Measures].[ReturningCustomers]
        } ON COLUMNS,
        Date.Calendar.Month.MEMBERS ON ROWS
    FROM [Adventure Works]

26 DISTINCT COUNT Measures…
(Same MDX query as on the previous slide, with a callout that it relies on DISTINCT COUNT measures.)

27 Distinct Count in OLAP
Distinct Count is very expensive
Separate measure group
Expensive processing (ORDER BY required)
Different partitioning for query optimization

28 Vertipaq Solution
BISM Tabular
Query in DAX is fast enough
Extraction logic in DAX, not in ETL
No special ETL and data modeling required
(Speaker note: show demo in PowerPivot/Tabular)

29 DAX Formula
    NewCustomers :=
    CALCULATE (
        DISTINCTCOUNT ( FactInternetSales[CustomerKey] ),
        FILTER (
            ALL ( DimTime ),
            DimTime[TimeKey] <= MAX ( DimTime[TimeKey] )
        )
    )
    - CALCULATE (
        DISTINCTCOUNT ( FactInternetSales[CustomerKey] ),
        FILTER (
            ALL ( DimTime ),
            DimTime[TimeKey] < MIN ( DimTime[TimeKey] )
        )
    )
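The subtraction in NewCustomers can be checked by hand on a small data set. The Python sketch below follows the same logic under simplifying assumptions: months are "YYYY-MM" strings (so string comparison orders them correctly), and the `sales` rows are hypothetical. The `returning_customers` helper follows the ReturningCustomers definition from the MDX slide rather than anything in the DAX formula.

```python
# Hypothetical sales rows (month, customer key); not data from the talk.
sales = [
    ("2011-01", "C1"),
    ("2011-01", "C2"),
    ("2011-02", "C1"),
    ("2011-02", "C3"),
    ("2011-03", "C2"),
    ("2011-03", "C4"),
    ("2011-03", "C5"),
]


def customers_through(month):
    """Distinct customers with at least one sale up to and including `month`."""
    return {c for m, c in sales if m <= month}


def customers_before(month):
    """Distinct customers with at least one sale strictly before `month`."""
    return {c for m, c in sales if m < month}


def new_customers(month):
    """Distinct count through the period minus distinct count before it."""
    return len(customers_through(month)) - len(customers_before(month))


def returning_customers(month):
    """Customers active in `month` who are not new in that month."""
    active = {c for m, c in sales if m == month}
    return len(active) - new_customers(month)


for month in ("2011-01", "2011-02", "2011-03"):
    print(month, new_customers(month), returning_customers(month))
```

Both distinct counts are recomputed from the raw sales for each period, so no precomputed "first sale date" per customer is needed.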

30 Vertipaq Solution
Simpler data model
Very fast
Simpler formulas
Speed → Different Modeling Options

31 Transition Matrix
Transition of an attribute over time in SCD2

32 Relational Data Model

33 Transition Matrix
How many customers moved from rating AAA to rating AAB in the last period / month?
Complex SQL query
Complex OLAP model
(Speaker note: show the SQL query that answers the question, then show the Multidimensional model that simplifies the query and makes the model accessible to an end user; the multidimensional model is complex and harder to maintain.)
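As a small illustration of the question itself, independent of SQL, OLAP or DAX, the Python sketch below builds a transition matrix from a monthly rating snapshot: it pairs each customer's rating in a start month with the rating in an end month and counts customers per pair. The `snapshot` rows, the month format and the `transition_matrix` helper are hypothetical.

```python
from collections import Counter

# Hypothetical monthly rating snapshot (month, customer, rating); not from the talk.
snapshot = [
    ("2011-01", "C1", "AAA"),
    ("2011-01", "C2", "AAA"),
    ("2011-01", "C3", "AAB"),
    ("2011-06", "C1", "AAA"),
    ("2011-06", "C2", "AAB"),
    ("2011-06", "C3", "AAB"),
]


def transition_matrix(start_month, end_month):
    """Count customers per (start rating, end rating) pair.

    Only customers rated in both months are counted.
    """
    start = {c: r for m, c, r in snapshot if m == start_month}
    end = {c: r for m, c, r in snapshot if m == end_month}
    return Counter((start[c], end[c]) for c in start.keys() & end.keys())


matrix = transition_matrix("2011-01", "2011-06")
print(matrix[("AAA", "AAB")])  # customers that moved from AAA to AAB: 1
```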

34 Data Model for OLAP
[Diagram labels: Customers, SCD 1, Rating Snapshot, SCD 2]

35 The DSV Data Model

36 SSAS Dimension Usage
2 Dimensions and 2 Measure Groups for Rating A and B
Complex Dimension Usage

37 The Final Result in OLAP
[Screenshot labels: Start Rating, Start Month (January), End Month (Jan-June), Number of Customers]

38 Transition Matrix in Vertipaq
Simpler data model
Fact tables are not duplicated
Complexity in DAX measures
Easier to adapt to existing relational models
Calculated columns to avoid snapshot tables
(Speaker note, TODO: verify whether a scenario with just a measure is possible or not; it might be very slow, anyway)

39 Transition Matrix – Tabular
Dim_Customer is still SCD Type 2
Rating_Snapshot is a snapshot of ratings measured monthly

40 Transition Matrix - Tabular
    NumOfCustomers :=
    CALCULATE (
        DISTINCTCOUNT ( Dim_Customers[Customer] ),
        FILTER (
            Dim_Customers,
            CALCULATE (
                COUNTROWS ( RatingSnapshot ),
                RatingSnapshot[CustomerSnapshot] = EARLIER ( Dim_Customers[Customer] )
            ) > 0
                && Dim_Customers[scdStartDate] <= MAX ( Dim_Date[ID_Date] )
                && (
                    Dim_Customers[scdEndDate] > MIN ( Dim_Date[ID_Date] )
                        || ISBLANK ( Dim_Customers[scdEndDate] )
                )
        )
    )

41 Vertipaq Solution
Much simpler data model
Very fast
Formulas slightly more complex
Speed → Different Modeling Options

42 ABC Classification
From two loops to single calculation

43 Pareto Principle
80% of effects come from 20% of the causes
In other words, 80% of revenues come from 20% of customers or 20% of products
(Speaker note: explain the Pareto principle and how it applies to any kind of computation. It is very useful, yet tricky to define.)

44 ABC Analysis
A items for 70% of total
B items for 20% of total
C items for 10% of total

45 ABC Analysis in Excel
    =IF( H6<=0,7; "A"; IF( H6<=0,9; "B"; "C" ) )

46 ABC Used to Slice Data

47 ABC Possible Solutions
Heavy Calculation in ETL
Could be very long in SQL
Faster in MDX
Requires double cube processing to store ABC value in a dimension attribute
(Speaker note: show an example in SQL? No need to show the complete example in MDX; a block diagram representing the need for double cube processing could be enough.)

48 ABC in SQL
SQL: SUM( Amount ) GROUP BY Product INTO #temp
SQL: UPDATE #temp SET Percentage = RunningTotal on Product Amount / SUM( Amount )
SQL: UPDATE product SET Class = A/B/C (from #temp)

49 ABC in MDX
OLAP: Process Sales cube
OLAP: MDX query on Sales cube to extract ABC Class
SQL: UPDATE product SET Class = A/B/C from MDX query (requires ETL or Linked Server)
OLAP: Process Product Dimension; update Sales aggregations (ABC flexible attribute)

50 BISM Tabular
Calculated columns
Computed during processing
No need for double processing
(Speaker note: show implementation in PowerPivot/Tabular based on calculated columns.)

51 Intermediate Measures
Defined as calculated columns on Product:
    SalesAmountProduct = CALCULATE ( SUM ( SalesOrderDetail[LineTotal] ) )
    CumulatedProduct =
    SUMX (
        FILTER (
            Product,
            Product[SalesAmountProduct] >= EARLIER ( Product[SalesAmountProduct] )
        ),
        Product[SalesAmountProduct]
    )
(Speaker note: you might use SUMX instead of CALCULATE. It seems simpler but is slower, and it is needed only when a more complex expression has to be evaluated for each row in SalesOrderDetail.)
    SalesAmountProduct = SUMX ( RELATEDTABLE ( SalesOrderDetail ), SalesOrderDetail[LineTotal] )

52 Class Computation
    SortedWeightProduct = Product[CumulatedProduct] / SUM ( Product[SalesAmountProduct] )
    [ABC Class Product] =
    IF (
        Product[SortedWeightProduct] <= 0.7, "A",
        IF ( Product[SortedWeightProduct] <= 0.9, "B", "C" )
    )
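The calculated columns above can be traced step by step on a small example. The Python sketch below mirrors them under stated assumptions: `sales_by_product` stands in for the SalesAmountProduct column with hypothetical values, the cumulated amount sums every product that sells at least as much as the current one (the role of the FILTER with EARLIER), and the 0.7 and 0.9 thresholds are the ones from the slide.

```python
# Hypothetical per-product sales totals (the SalesAmountProduct column); not from the talk.
sales_by_product = {"P1": 400.0, "P2": 250.0, "P3": 200.0, "P4": 100.0, "P5": 50.0}


def abc_classes(sales):
    """Assign an ABC class to each product from its cumulated share of sales."""
    total = sum(sales.values())
    classes = {}
    for product, amount in sales.items():
        # CumulatedProduct: sum of all products selling at least as much as this one.
        cumulated = sum(a for a in sales.values() if a >= amount)
        # SortedWeightProduct: cumulated amount as a fraction of the grand total.
        weight = cumulated / total
        classes[product] = "A" if weight <= 0.7 else "B" if weight <= 0.9 else "C"
    return classes


classes = abc_classes(sales_by_product)
print(classes)
```

As with the `>=` comparison in the column formula, products with identical sales amounts include each other in their cumulated totals and therefore receive the same class.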

53 Vertipaq Solution
No need for ETL
Computed at Process Time
Very fast
Speed → Different Modeling Options

54 Conclusions
Time to come to an end…

55 Data Modeling Considerations
Dimensional Modeling: not the only option; not easy for complex scenarios
Vertipaq: game changer; simpler modeling; more natural
Vertipaq speed leads to simpler data models

56 What Next?
Consider simpler models for Vertipaq
Learn DAX. Seriously… learn it!
BISM Tabular side-by-side with Multidimensional
Complex Scenarios with Tabular
Think in DAX, it’s not easy…

57 THANK YOU!
For attending this session and PASS SQLRally Nordic 2011, Stockholm

