Optimizing Time-Series Calculations in SSAS
4/23/2017 5:29 AM
Nauzad Kapadia
Principal Consultant, Quartz Systems
nauzadk@quartzsystems.com
T: @nauzadk
© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Session Objectives and Takeaways
• Review existing approaches to time-series analysis in SSAS
  - Calculated measures
  - Time intelligence wizard
  - Custom-built time utility dimensions
• Understand issues with existing approaches
  - Calculated measure explosion
  - No ability to account for comprehensive time dimensions
  - Weak or non-existent error handling capabilities
• Introduce a “new” approach that addresses existing issues and provides end users with flexibility, simplicity, and good performance
Time-Series Analysis
The two types of calculations we will focus on today
• Period-To-Period
  - Also called Prior-Period calculations
  - Use the MDX function .PrevMember

    WITH MEMBER [Measures].[GDP_Change] AS
        [Measures].[GDP Billions Current USD]
        - ( [Measures].[GDP Billions Current USD],
            [DimDate].[H-Decade].CurrentMember.PrevMember )
    SELECT
        { [Measures].[GDP Billions Current USD],
          [Measures].[GDP_Change] } ON COLUMNS,
        NON EMPTY Hierarchize(
            { [DimDate].[H-Decade].[Decade Desc].Members,
              [DimDate].[H-Decade].[Calendar Year].Members } ) ON ROWS
    FROM [GovtDebtAnalysis]
    WHERE ( [DimCountry].[Country].[United States] );
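As a rough illustration of what a prior-period calculation does, the following Python sketch models an ordered time level as a plain list. It is not how SSAS evaluates the query; the function name and the sample values are hypothetical. It returns no value (None) where the current or previous member is missing, which is the situation at the first member of a level.

```python
from typing import Optional

def prior_period_change(values: list[Optional[float]]) -> list[Optional[float]]:
    """Current value minus the previous member's value.

    Returns None where the current or the previous value is missing,
    e.g. for the first member, which has no previous member.
    """
    changes: list[Optional[float]] = []
    for i, current in enumerate(values):
        previous = values[i - 1] if i > 0 else None
        if current is None or previous is None:
            changes.append(None)
        else:
            changes.append(current - previous)
    return changes

# Hypothetical GDP values for three consecutive decades.
gdp_by_decade = [100.0, 150.0, 225.0]
print(prior_period_change(gdp_by_decade))  # [None, 50.0, 75.0]
```

The first entry is empty because there is no earlier decade to compare against.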
Time-Series Analysis
The two types of calculations we will focus on today
• Same-Period-Last-Year
  - Also called Parallel-Period or Year Ago
  - Use the MDX function ParallelPeriod
  - Can alternatively use the .Lag function on dimension attributes

    WITH MEMBER [Measures].[UnemploymentAvg_PP_Change] AS
        ( [Measures].[UnemploymentAvg] )
        - ( [Measures].[UnemploymentAvg],
            ParallelPeriod( [DimDate].[H-Year].[Calendar Year],
                            1,
                            [DimDate].[H-Year].CurrentMember ) )
    SELECT
        { [Measures].[UnemploymentAvg],
          [Measures].[UnemploymentAvg_PP_Change] } ON COLUMNS,
        NON EMPTY Hierarchize(
            { [DimDate].[H-Year].[Calendar Year].Members,
              [DimDate].[H-Year].[Calendar Qtr].Members,
              [DimDate].[H-Year].[Month].Members } ) ON ROWS
    FROM [GovtDebtAnalysis]
    WHERE ( [DimCountry].[Country].[United States] );
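The lag-based alternative mentioned on this slide can be pictured with a small Python model: on a flat list of quarters, the same period last year is simply the member four positions back. This is only a sketch under that assumption (a complete, gap-free quarter list); the function name and figures are made up.

```python
from typing import Optional

def year_ago_change(values: list[Optional[float]],
                    periods_per_year: int) -> list[Optional[float]]:
    """Current value minus the value `periods_per_year` positions earlier.

    Members in the first year have nothing to compare against and yield None.
    """
    changes: list[Optional[float]] = []
    for i, current in enumerate(values):
        j = i - periods_per_year
        year_ago = values[j] if j >= 0 else None
        if current is None or year_ago is None:
            changes.append(None)
        else:
            changes.append(current - year_ago)
    return changes

# Six hypothetical quarterly unemployment averages (Q1 of year 1 onward).
quarterly = [4.0, 4.5, 5.0, 5.5, 6.0, 7.0]
print(year_ago_change(quarterly, periods_per_year=4))
# [None, None, None, None, 2.0, 2.5]
```

Passing `periods_per_year=12` would give the equivalent comparison on a flat list of months.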
Time-Series Analysis
Why not just create calculated members?
• Calculated measure explosion
  - Consider the GovtDebtAnalysis sample cube: 7 (visible) measures, 3 time hierarchies, and 2 standalone dimension attributes
  - To provide a Prior-Period and Parallel-Period calculation for each measure (across all time hierarchies and dimension attributes) would require an additional 70 calculated measures
• Error handling / misleading results
  - Non-trivial to implement, no fun to duplicate

The built-in MDX functions (e.g., PrevMember, ParallelPeriod) might seem in and of themselves sufficient for time-based calculations. Add a few calculated members to a cube (or even let users do it themselves with PTPower) and problem solved, right? Not exactly. There are serious shortcomings and complexities that can and should be addressed in order to truly empower end users. The shortcomings include:

■ Erroneous or misleading results. When there are missing values along part of a time dimension (or the end user is looking at the first set of members in the dimension), calculated members that use functions such as PrevMember or ParallelPeriod will return misleading results. The calculated member definition needs to account for these situations. Typically, cube designers resort to fairly complicated conditional logic (e.g., IIF, CASE) in their MDX, which can introduce performance problems along with code that's tricky to develop and maintain.

■ Accounting for both attribute dimensions and dimension hierarchies. SSAS 2005 introduced the notion of dimension attributes which, from a time perspective, means end users can look at values by attribute (e.g., a flat list of years) or by hierarchy (e.g., Year-Month-Day). For example, look at the DimDate dimension in our example database. End users can view measures by several attributes, including Century, Calendar Year, and Calendar Quarter. Or, they may choose to leverage a hierarchy (e.g., H-Year). The calculated member examples we've reviewed are specific to either a hierarchy or a level. Ideally, period-to-period and same-period-last-year calculations should work regardless of what attribute or level the end user selects. (Note that a Microsoft design "best practice" recommends hiding attributes that participate in a hierarchy, but this is more of a general guideline. It doesn't address situations in which one or more time attributes don't participate in a hierarchy, which is the situation in our case.)

■ Calculated measure explosion. There are seven visible measures in the GovtDebtAnalysis cube. If you want to provide a period-to-period and same-period-last-year calculation for each visible measure, you now have a minimum of 21 visible measures in the cube (7 measures + 14 calculated members). However, you also have three time hierarchies. This means you'll actually need 42 (3 x 14) calculated members, which brings a new total of 49 measures (7 measures + 42 calculated members). But this total still doesn't account for visible dimension attributes. There are five in the cube, so you need an additional 70 calculated members. Finally, end users might want to see period-to-period and same-period-last-year calculations in three different forms (e.g., prior period value, change in prior period value, and change represented as a percent). You now have a cube that's bloated and difficult to understand and query.
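The arithmetic behind the measure explosion can be checked with a few lines of Python. The helper below is hypothetical, and the last figure assumes that every form (value, change, percent change) needs its own calculated member for every measure and every hierarchy or attribute; the notes do not state that total.

```python
def calculated_members(measures: int, calc_types: int,
                       time_contexts: int, forms: int = 1) -> int:
    """Calculated members needed when every measure gets every calculation
    type in every time context (hierarchy or attribute) and every form."""
    return measures * calc_types * time_contexts * forms

per_context = calculated_members(7, 2, 1)       # 14: one hierarchy, two calculations
for_hierarchies = calculated_members(7, 2, 3)   # 42: three time hierarchies
for_attributes = calculated_members(7, 2, 5)    # 70: five visible attributes
all_forms = calculated_members(7, 2, 3 + 5, forms=3)  # 336 under the stated assumption

print(per_context, for_hierarchies, for_attributes, all_forms)
```

The count grows multiplicatively with each new measure, calculation type, or time attribute, which is why a single utility dimension is attractive.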
Time Utility Dimension
• A time utility dimension can address the problem of calculated measure explosion
  - Centralizes many types of time-series analysis
  - Not a new approach: implemented prior to Analysis Services 2005
• Starts with the addition of a new single-member dimension
  - Create a view (in the DSV) to represent the dimension
  - Member contains a static value such as [Current]
• Additional calculated members are added to the dimension attribute for each type of analysis
  - i.e., via the cube's calculation script
Demo: “Classic” Time Utility Dimension
Time Utility Dimension
Issues/Limitations
• Calculations are limited to a single time dimension hierarchy
  - May not work for other hierarchies (and standalone dimension attributes)
• Role-playing dimensions (based on time)
• Error handling (missing data at tail and head of time)
The BI Wizard – Time Intelligence
• Available since SSAS 2005; a built-in implementation of a time utility dimension
• Adds a named calculation (e.g., H-Year DimDate Calculations) with a static value to the Date dimension table (in the DSV)
• A similarly named attribute is added to the Date dimension
• Calculated members are added to the attribute
  - (Some) error handling logic is included in calculations
  - Assignments used in member definitions
Demo: BI Wizard Time Intelligence
What Is a Shell Dimension?
• A dimension with ONE attribute that has ONE member
• Every record in the fact table is mapped to the ONE member
• This ONE member becomes the ANCHOR member
• Allows modification of the MDX: instead of specifying a measure, we can specify the anchor

    GDP PrYr Chg =
        ( [Time].[Calendar Hierarchy].CurrentMember.Lag(1), [Measures].[GDP] ),
        ( ParallelPeriod( [Time].[Calendar Hierarchy].[Year], 1 ), [Measures].[GDP] )

    GDP PrYr Chg =
        ( [Time].[Calendar Hierarchy].CurrentMember.Lag(1), [Anchor Member] ),
        ( ParallelPeriod( [Time].[Calendar Hierarchy].[Year], 1 ), [Anchor Member] )

The use of a “shell dimension” solves the calculated measure explosion problem. A “shell dimension” is a dimension with one member that every record in the fact table is mapped against. Think of it as a single member that everything in the cube is “anchored” to. This allows the modification of the MDX so that rather than specifying a measure, we can instead specify this one member as the anchor.

    Sales Amount Pr Yr Chg =
        ( [Time].[Calendar Hierarchy].CurrentMember.Lag(1), [Measures].[Sales Amount] ),
        ( ParallelPeriod( [Time].[Calendar Hierarchy].[Year], 1 ), [Measures].[Sales Amount] )

In the MDX script for “Sales Amount Pr Yr Chg”, the formula is saying “Go back one year for Sales Amount and grab the variance”. The “what” in the formula is [Measures].[Sales Amount]. If we create a time calculation shell dimension that has one member called “Current Period”, then we could use that in the “what” portion of the formula.

    ( [Time].[Calendar Hierarchy].CurrentMember.Lag(1), [Time Calculations].[Current Period] ),
    ( ParallelPeriod( [Time].[Calendar Hierarchy].[Year], 1 ), [Time Calculations].[Current Period] )

This formula is now saying “Go back one year from the current period”. Because we are no longer specifying a particular measure, the calculation applies to all measures. The trick here is to change the “anchor” of the formula from a measure to the base member of the “shell dimension”. This new calculation also has to reside on the shell dimension and not on the measures dimension.
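The anchoring idea can be pictured with a toy Python model in which the cube is a dictionary keyed by measure and year. This is only an analogy with hypothetical names and data: the calculation is written once against "whatever measure is in context" instead of once per measure.

```python
from typing import Optional

# Toy cube: (measure, year) -> value. All figures are made up.
cube = {
    ("GDP", 2008): 100.0, ("GDP", 2009): 110.0,
    ("Sales Amount", 2008): 40.0, ("Sales Amount", 2009): 30.0,
}

def current_period(measure: str, year: int) -> Optional[float]:
    """Plays the role of the anchor member: the value of the measure in context."""
    return cube.get((measure, year))

def prior_year_change(measure: str, year: int) -> Optional[float]:
    """Defined once; works for any measure because it only refers to the anchor."""
    now = current_period(measure, year)
    before = current_period(measure, year - 1)
    if now is None or before is None:
        return None
    return now - before

print(prior_year_change("GDP", 2009))           # 10.0
print(prior_year_change("Sales Amount", 2009))  # -10.0
print(prior_year_change("GDP", 2008))           # None (no 2007 data)
```

Adding a new measure to the dictionary requires no change to `prior_year_change`, which mirrors why the calculation lives on the shell dimension instead of the measures dimension.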
The BI Wizard – Time Intelligence
Issues/Limitations
• Calculations are limited to a single time dimension hierarchy
  - May not work for other hierarchies (and standalone dimension attributes)
• Role-playing dimensions (based on time)
• Error handling (missing data at tail and head of time)
• Somewhat verbose and fixed
  - Year Over Year Growth = Parallel-Period
  - Quarter Over Quarter and Month Over Month are both Prior-Period; what about days, decades, centuries, etc.?

Advantages to Using Built-In Time Intelligence
• The template that “loads” the wizard is an XML file that can be modified. It is located at c:\Program Files\Microsoft SQL Server\90\Tools\Templates\olap\1033\TimeIntelligence.xml
• Allows developers to limit the measures that are included in the scope of the calculations. (Although now that the Aggregate function properly calculates ratios and distinct count measures, scoping is not as much of an issue as it was in AS2000.)
• It is pretty easy.

Disadvantages to Using Built-In Time Intelligence
• Measures are “scoped” in the MDX. This means that any new measures or calculations added to the cube that should be included in the time calculations will force a change in all of the time calculation formulas.
• The scope command is buried in each calculation rather than having one starting scope command that includes all of the time-series calculations in a global statement.
• The calculations default to “NA” rather than NULL. This means that end-user tools that suppress empty rows will not be able to suppress them, because the rows are loaded with “NA” values. This also slows down the SSAS calculation engine.
• A new “time calculation” dimension has to be created for each user-defined hierarchy. This also means a new attribute dimension and data source view calculation are created for each user-defined time hierarchy.
• Some very common and useful calculations are missing from the wizard, such as year-to-date variances.
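The “NA” versus NULL point can be illustrated with a small Python model of empty-row suppression. The function is a hypothetical stand-in for what a client tool does when it hides empty rows; it is not an SSAS or Excel API, and the data is made up.

```python
def non_empty_rows(rows: dict[str, list[object]]) -> dict[str, list[object]]:
    """Keep only rows that contain at least one non-empty (non-None) cell."""
    return {label: cells for label, cells in rows.items()
            if any(cell is not None for cell in cells)}

# Same result set, once with empty cells and once with "NA" placeholders.
with_null = {"2007": [None, None], "2008": [5.0, None]}
with_na = {"2007": ["NA", "NA"], "2008": [5.0, "NA"]}

print(list(non_empty_rows(with_null)))  # ['2008']
print(list(non_empty_rows(with_na)))    # ['2007', '2008']
```

Because "NA" is a real value, the 2007 row can no longer be recognized as empty and stays in the output.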
New Approach
• An enhanced time utility dimension
• Based on concepts from:
  - David Shroyer's white paper "A Different Approach to Implementing Time Calculations in SSAS"
  - Mosha Pasumansky's blog entry "Time Calculations in UDM: Parallel Period" (Note: Mosha points out a few limitations and performance problems with David's approach, some of which David has since addressed)
• Use David's framework for creating a time utility dimension
• Integrate performance optimizations from Mosha's blog entry
• Use an explicit (attribute-based) method of scoping the calculations across the entire time dimension
• Account for role-playing dimensions (based on the time dimension) via script duplication
New Approach
Setting it up
• Add a new dimension attribute to the existing time/date dimension
  - Add a named calculation (e.g., TimeCalcs) to the Date dimension table (in the DSV); assign a static value (Current)
• Create a new dimension
  - Create a shared dimension (DimTimeCalcs); the dimension contains no hierarchies (just a single attribute, TimeCalcs)
  - Set the TimeCalcs attribute's IsAggregatable property to False, and change the DefaultMember to [Current]
• Add the new dimension to the cube
  - Set the relationship type (for each measure group) to Referenced
  - Set the HierarchyUniqueNameStyle property to ExcludeDimensionName (to simplify later MDX references)
New Approach
Setting it up (continued)
• Add the time calculations to the cube
  - Define PriorPeriod and YearAgo members with a default Null value
  - Scope the calculation to each desired dimension attribute, e.g.:

    CREATE MEMBER CURRENTCUBE.[TimeCalcs].[PriorPeriod] AS Null;
    CREATE MEMBER CURRENTCUBE.[TimeCalcs].[YearAgo] AS Null;

    /* Define a Prior Period value for the Century attribute.
       Note that we do not define a YearAgo value. */
    Scope( [DimDate].[Century].[Century].Members );
        -- Prior Period calculation
        ( [TimeCalcs].[PriorPeriod] =
            ( [DimDate].[Century].CurrentMember.PrevMember,
              [TimeCalcs].[Current] ) );
    End Scope;
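New Approach
Setting it up (continued)

    Scope( [DimDate].[Calendar Qtr].[Calendar Qtr].Members );
        -- Prior Period
        ( ( [TimeCalcs].[PriorPeriod],
            { [Measures].[CPI_Base2005], [Measures].[UnemploymentAvg] } ) =
          ( [DimDate].[Calendar Qtr].CurrentMember.PrevMember,
            [TimeCalcs].[Current] ) );
        -- Year Ago (four quarters back); the measure set and right-hand tuple
        -- are assumed to mirror the Prior Period assignment
        ( ( [TimeCalcs].[YearAgo],
            { [Measures].[CPI_Base2005], [Measures].[UnemploymentAvg] } ) =
          ( [DimDate].[Calendar Qtr].CurrentMember.Lag(4),
            [TimeCalcs].[Current] ) );
    End Scope;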
New Approach
Handling role-playing dimensions
• Tried a few different ways of working with role-playing dimensions, but the approach that worked was copy and paste
• Specifically, copy and paste all Scope statements, replacing the dimension name with the role-playing dimension name

    /* Define a Prior Period value for the Century attribute.
       Note that we do not define a YearAgo value. */
    Scope( [ShipDate].[Century].[Century].Members );
        -- Prior Period calculation
        ( [TimeCalcs].[PriorPeriod] =
            ( [ShipDate].[Century].CurrentMember.PrevMember,
              [TimeCalcs].[Current] ) );
    End Scope;
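Since the role-playing scripts differ only in the dimension name, the copy-and-paste step could be automated. The Python sketch below is one hypothetical way to do that with a string template; it only produces script text to paste into the cube's calculation script, and the template is a balanced rendering of the Century Scope statement with the dimension name as the single placeholder.

```python
# Template for one Scope block; {dim} is the only part that varies.
SCOPE_TEMPLATE = """Scope( [{dim}].[Century].[Century].Members );
    -- Prior Period calculation
    ( [TimeCalcs].[PriorPeriod] =
        ( [{dim}].[Century].CurrentMember.PrevMember,
          [TimeCalcs].[Current] ) );
End Scope;"""

def scopes_for(dimensions: list[str]) -> str:
    """Render one Scope block per (role-playing) date dimension name."""
    return "\n\n".join(SCOPE_TEMPLATE.format(dim=name) for name in dimensions)

script = scopes_for(["DimDate", "ShipDate"])
print(script)
```

Generating the blocks from one template keeps the duplicated scripts in step when a calculation changes.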
Demo: “New” Time Utility Dimension
New Approach
Scoping on [all of] the dimension attributes
• Scoping on all dimension attributes may seem like overkill
  - E.g., when attribute relationships (and hierarchies?) are properly configured in the Date dimension, I can scope on the Date attribute and rely on coordinate overwrite rules to propagate calculations up to month, quarter, semester, and year
• However, building, understanding, and relying on attribute relationships (and understanding coordinate overwrite rules) is difficult
  - Try taking the default AdventureWorks 2008 DW implementation. Notice how the attribute relationships are configured for Fiscal Quarter and Semester: calculations don't work without explicit attribute definitions on these (and other) attributes
• The goal is an approach that works well with new and existing cubes
  - E.g., AdventureWorks 2008 DW's Date dimension starting July 1
Other Considerations
De-selection of Calculated Members
• Analysis Services 2005 SP2 (and 2008) breaks Excel 2007/2010 calculated member selection
  - Calculated members cannot be deselected
  - Multi-member utility dimensions can be unwieldy
Other Considerations
De-selection Workarounds
• Marco Russo's DateTool Dimension Project
  - Convert calculated members into “real” members (via a Union Join in the DSV view defining the dimension)
  - Then override in the calculation script of the cube
  - Interesting implementation of the time utility dimension: no relationship to any of the measure groups
  - Worked well for my sample cube, but not with the AdventureWorks 2008 sample
• Named Sets
  - Excel 2010 lets a user define them; otherwise you can define them in the cube
  - Not a great solution, but it can help
• What else?
http://www.sqlbi.eu/Default.aspx?tabid=87
Related Content
• March 2010 edition of SQL Server Magazine
  www.sqlmag.com/article/sql-server-analysis-services/optimizing-time-based-calculations-in-ssas-.aspx
  - Additional web-only article provides an introduction to time-series analysis
• David Shroyer's whitepaper
  http://www.obs3.com/pdf/A%20Different%20Approach%20to%20Time%20Calculations%20in%20SSAS.pdf
• Mosha Pasumansky's optimizations for David's ParallelPeriod calculations
  http://sqlblog.com/blogs/mosha/archive/2006/10/25/time-calculations-in-udm-parallel-period.aspx
• PTPower utility for creating calculated members in Excel 2007/2010
  www.sqlserverpower.com/UtilityDetail/PTPower.aspx