Report Design for SSAS Cubes and MDX Paul Turley Mentor, SQL Server MVP
Introduction Paul Turley SqlServerBiBlog.com Mentor, SQL Server MVP BID 302 | MDX Essentials for Report Design
What Can You Do with a Cube?
- Aggregate very large volumes of data
- Destroy anything in its path
- Present browsable business information for self-service reporting
- Assimilate entire civilizations
- Create high-value business reports that render in a fraction of the time needed against a relational data source
- Create a mega race of neo-humanoid androids with a single collective consciousness
- Encapsulate complex business rules into predefined hierarchies, calculations, business measures and KPIs
HC-850 BI Solutions 2008
The Business Data Continuum
Operational Databases > Data Consolidation & Transformation (ETL) > Relational Data Warehouse > OLAP Cubes > Reports, Charts, Dashboards & Scorecards

The business data continuum consists of multiple phases. Not all of these components are required for every business intelligence solution, and the process may vary with the size, scope and needs of the business.

Generally, data must be consolidated from a variety of sources. These operational data sources may include various production databases as well as nontraditional sources such as flat text files, delimited export data, Excel spreadsheets and other desktop documents. An automated data transformation tool, such as Microsoft SQL Server 2008 Integration Services, may be used to merge, cleanse and integrate data from these sources into a central, reliable data repository for reporting. Cyclical data loads may be automated so that reporting data is updated and available at regular intervals.

A relational data warehouse, or a set of data marts, houses business reporting data in simplified structures. The purpose of this database is to store business information, optimized for retrieval, reporting and analytics.

OLAP cubes contain predefined relationships, navigational patterns and dimensional hierarchies. Business facts are strategically aggregated, and calculations may be used to define business rules and complex algorithms. Users can browse cubes without special skills or complex reporting tools.

Reports, charts, dashboards, spreadsheets, pivot tables and KPI scorecards are based on OLAP cubes or data warehouse data.
Dimensional Data Warehouse Design
- Date Dimension
- Employee Dimension
- Geography Dimension
- Customer Dimension
- Product Dimension
- Vendor Dimension
- Sales Fact
Contrasting Data Source Performance
Using a transactional data source
500,000 records… 20 minutes to run…

Here’s a common scenario to help reinforce the value of an effective business intelligence solution. As business systems evolve, databases become larger and more complex. Reports that were once simple and fast become slow and inefficient. Because transactional databases are often very complex, queries against them involve many tables, relationships and business logic. Such queries are often slow and error-prone. As the systems evolve, development becomes cumbersome and expensive. Report designers and developers spend more of their time writing and debugging queries than developing reports. In this example, 500,000 records are returned in 20 minutes and our user is not having a good experience.
Contrasting Data Source Performance
Using an OLAP cube
100,000,000 source records… 2 seconds to run query…

By contrast, let’s look at a report running against an OLAP cube. The cube is based on a data warehouse that contains 100,000,000 source fact records. The dimensions of the cube are predefined with hierarchies and relationships to support our business reporting requirements. Many of the business facts are pre-aggregated and optimized for reporting. The cube is processed overnight with fresh data. Because the report designer spent little time writing queries, there was more time to design an effective report with charts, graphs and other specialized visualizations. When the user runs the report, it takes only seconds to run. This user has a much better reporting experience with a report that provides more business value. As a result, he is able to take action, make informed decisions and run an efficient and profitable business. Our user thinks that rocks!
Cube Design Process
1. Create Data Source View
2. Design Dimensions & Hierarchies
3. Create Cube & Dimension Usage
4. Format Measures
5. Create Calculations
6. Design Attribute Relationships
7. Design Partitions
8. Design Aggregations

Basic cube design ends after the third step, when the cube is created. To move into the more advanced cube functions, we need to look into the business needs. Partitions allow the cube to be stored in different files, potentially on different storage devices, and each partition can be aggregated at a different level. For example, legacy data from years past can be optimized for a higher (coarser) level of granularity.
Dimensions
Dimension > Hierarchy > Level > Member

In Microsoft SQL Server Analysis Services, a time dimension is a dimension type whose attributes represent time periods, such as years, semesters, quarters, months, and days. The periods in a time dimension provide time-based levels of granularity for analysis and reporting. The attributes are organized in hierarchies, and the granularity of the time dimension is determined largely by the business and reporting requirements for historical data. For example, most financial and sales data in business intelligence applications use a monthly or quarterly granularity. It is important to remember that all measures will be calculated across the lowest level of granularity.

There’s practically no such thing as an OLAP database without a Time dimension. Often, a Time dimension contains months as the lowest level of detail, aggregated into quarters and years. Sometimes, a Time dimension will contain days as the lowest level of detail. On occasion, particularly if you’re monitoring a manufacturing operation or Internet activity, you might create a dimension with minutes or even seconds as the lowest level of detail.

Whatever the level of detail, a Time dimension has certain unique qualities. For example, time typically occurs in regular intervals. Each hour contains 60 minutes, each day contains 24 hours, each quarter contains 3 months, and each year contains 4 quarters. This repetitive nature of time encourages certain questions, such as, “How does this month compare to the same month of last year?” The multidimensional expressions (MDX) language, which you’ll learn about in Chapter 8, “Using MDX,” has functions that make it easy to answer this type of question. By flagging certain dimensions as Time dimensions, and certain levels within a dimension as specific units of time, you can make those functions easy to use.
Of course, time isn’t completely uniform because the 365 days in a year aren’t evenly divisible by the 7 days in a week or the 12 months in a year. Some months have 30 days; some have 31, or 28, or occasionally 29. Months begin on different days of the week. Irregularities are a fact of life in Time dimensions, and when working with time, you need to be prepared for both the regularities and the irregularities.

One irregularity that frequently arises when dealing with time is that many organizations use a fiscal year, where the starting day of the year isn’t January 1. Analysis Services can build a Time dimension based on a specified date range and add a special calendar to this dimension for Fiscal Year. For greater flexibility, you should use a dimension table from your data warehouse, because you can then include special properties for a date, such as the season for a month, if this information is important to analysis over time in your organization.
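The fiscal-year irregularity described in these notes can be sketched with a small date-dimension generator. This is only an illustration in Python of the kind of rows a warehouse date table might hold, not how Analysis Services generates its own time dimension. The July fiscal-year start, the convention of naming a fiscal year after the calendar year in which it ends, and every column name are assumptions made for the example.

```python
from datetime import date, timedelta

FISCAL_START_MONTH = 7  # assumption: the fiscal year begins on July 1


def date_dimension_row(d: date, fiscal_start_month: int = FISCAL_START_MONTH) -> dict:
    """Return one row of a hypothetical date dimension table for day d."""
    # Number of months elapsed since the start of the fiscal year (0-11).
    fiscal_month_index = (d.month - fiscal_start_month) % 12
    # Assumed convention: a fiscal year is named after the calendar year in which it ends.
    if fiscal_start_month == 1 or d.month < fiscal_start_month:
        fiscal_year = d.year
    else:
        fiscal_year = d.year + 1
    return {
        "date_key": d.year * 10000 + d.month * 100 + d.day,  # e.g. 20080701
        "calendar_year": d.year,
        "calendar_quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
        "fiscal_year": fiscal_year,
        "fiscal_quarter": fiscal_month_index // 3 + 1,
    }


def build_date_dimension(start: date, end: date) -> list:
    """Build one row per day from start to end, inclusive."""
    days = (end - start).days + 1
    return [date_dimension_row(start + timedelta(days=i)) for i in range(days)]
```

Extra properties such as a season or a holiday flag would simply be further keys in each row, which is the flexibility a warehouse table offers over a generated dimension.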
Measures
- Organized in measure groups
- Derived from numeric fields or SQL calculations
- Calculated members based on MDX scripted functions
- KPIs based on MDX script for actual/goal, status & trend comparisons
Understanding Aggregate Functions
SSAS is optimized to manage pre-defined & strategically-derived aggregations
- Logical Aggregations
- Additive Measures
- Semi-Additive Measures
- Non-additive Measures
- Aggregating Financial Accounts

In this section, you’ll examine how values are aggregated when browsing a cube.
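The difference between additive, semi-additive and non-additive measures shows up as soon as values are rolled up over time. The following Python sketch uses made-up numbers to illustrate it. The sample data and function names are hypothetical, and "take the last period" stands in for just one possible semi-additive rule.

```python
# Hypothetical monthly facts for one product:
# (month, sales_amount, units_sold, closing_inventory)
facts = [
    ("Jan", 1000.0, 10, 50),
    ("Feb", 1500.0, 30, 40),
    ("Mar", 500.0, 10, 45),
]


def total_sales(rows):
    """Additive: sales amounts can be summed across any dimension, including time."""
    return sum(r[1] for r in rows)


def closing_inventory(rows):
    """Semi-additive: inventory is not summed over time; use the last period's value."""
    return rows[-1][3]


def average_unit_price(rows):
    """Non-additive: recompute the ratio from its additive parts at each level."""
    return sum(r[1] for r in rows) / sum(r[2] for r in rows)


def naive_average_unit_price(rows):
    """What goes wrong: averaging the monthly ratios instead of recomputing them."""
    monthly = [r[1] / r[2] for r in rows]
    return sum(monthly) / len(monthly)
```

With this data the quarter's unit price is 3000 / 50 = 60, while averaging the three monthly prices (100, 50 and 50) gives roughly 66.67, which is why ratios are better left to the cube's calculation engine than re-aggregated downstream.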
Basic Query Syntax

SELECT
  < member or set > on < Columns | Axis(0) | 0 >,
  < member or set > on < Rows | Axis(1) | 1 >
FROM < cube or subcube expression >
WHERE < member or set > ;

SELECT
  { [Sales Amount], [Order Quantity] } on Columns,
  [Category].Members on Rows
FROM [Adventure Works]
WHERE [CY 2001] ;
Filtering

Slicer
SELECT … on Columns, … on Rows
FROM < cube name >
WHERE { [Category].[Bikes], [Category].[Clothing] } ;

Subcube
SELECT … on Columns, … on Rows
FROM
  ( SELECT { [Category].[Bikes], [Category].[Clothing] } on 0
    FROM < cube name > ) ;
Sets & Tuples
Set: Combine members from same hierarchy using braces
  { [Year].[2005], [Year].[2006] }
Tuple: Combine members from different hierarchies using parentheses
  ( [Category].[Bikes], [Year].[2006] )
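The two rules on this slide can be expressed as a pair of small helpers that build MDX text and reject combinations that break the rule. This is a Python sketch for illustration only. The helper names are hypothetical, and `hierarchy_of` naively treats the first bracketed segment as the hierarchy, which suits the two-part names used on the slide but not fully qualified member names.

```python
def hierarchy_of(member: str) -> str:
    """Naive assumption: the first bracketed segment names the hierarchy."""
    return member.split("].[")[0] + "]"


def mdx_set(*members: str) -> str:
    """Format members of one hierarchy as an MDX set: { m1, m2 }."""
    if len({hierarchy_of(m) for m in members}) > 1:
        raise ValueError("a set combines members from the same hierarchy")
    return "{ " + ", ".join(members) + " }"


def mdx_tuple(*members: str) -> str:
    """Format members of different hierarchies as an MDX tuple: ( m1, m2 )."""
    hierarchies = [hierarchy_of(m) for m in members]
    if len(set(hierarchies)) != len(hierarchies):
        raise ValueError("a tuple takes at most one member per hierarchy")
    return "( " + ", ".join(members) + " )"


years = mdx_set("[Year].[2005]", "[Year].[2006]")
bikes_2006 = mdx_tuple("[Category].[Bikes]", "[Year].[2006]")
```

Here `years` holds the set text and `bikes_2006` the tuple text from the slide, while `mdx_tuple("[Year].[2005]", "[Year].[2006]")` raises a `ValueError`.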
Manual & Generated MDX The Graphical Query Designer Manual Changes Slicers based on sub cubes Multi-select Parameters Dataset-driven lists Levels Manual Changes Query formatting is ugly Can’t go back to the GQD Parameter support is limited
Demo <place holder>
Aggregation & Calculations
- Leverage the Analysis Services calculation & aggregation engine
- Reporting Services will perform aggregations out of the box
- Override the default SUM() and FIRST() functions
- Demo: Miscalculated & Fixed Calculation
Dynamic MDX Queries
- The business user / developer dichotomy
- Expressions
- Add parameters
- Custom code function
- Use calculated members
- Migrate calculated members to the cube for reuse
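The idea behind a dynamic query is that the MDX text is assembled at run time from the user's selections. In a report this would normally be done with a report expression or a custom code function. The sketch below shows the same string assembly in Python purely for illustration. The function names are hypothetical, and the measures, the [Category] hierarchy and the subcube form are borrowed from the earlier syntax and filtering slides.

```python
def bracket(name: str) -> str:
    """Wrap a name in MDX brackets, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"


def build_category_query(cube: str, categories: list) -> str:
    """Assemble a query restricted to the selected categories via a subcube."""
    if not categories:
        raise ValueError("at least one category must be selected")
    # One member reference per selected category, e.g. [Category].[Bikes]
    members = ", ".join("[Category]." + bracket(c) for c in categories)
    return (
        "SELECT { [Sales Amount], [Order Quantity] } ON COLUMNS, "
        "[Category].Members ON ROWS "
        "FROM ( SELECT { " + members + " } ON 0 FROM " + bracket(cube) + " )"
    )


query = build_category_query("Adventure Works", ["Bikes", "Clothing"])
```

Building the member list in one place keeps a multi-select parameter and a single-value parameter on the same code path, since a single selection is just a one-element list.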
Prompts & Parameters
- Use multi-select whenever possible
- Standard prompts are most often appropriate
- Custom prompts can use expressions & string concatenation
- Date ranges
- Date picker prompt is designed for day-level selection
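A date-range prompt built from two date pickers has to be turned into day-level member references before it can be used in a query. The following Python sketch shows one way that concatenation could look. The `[Date].[Date]` hierarchy name and the yyyymmdd member key format are assumptions that depend entirely on how the cube's date dimension is keyed, and the function names are hypothetical.

```python
from datetime import date


def date_member(d: date, hierarchy: str = "[Date].[Date]") -> str:
    """Reference a day-level member by key, assuming yyyymmdd keys."""
    return f"{hierarchy}.&[{d:%Y%m%d}]"


def date_range_set(start: date, end: date) -> str:
    """Build an MDX range set (start : end) from two date picker values."""
    if start > end:
        raise ValueError("start date must not be after end date")
    return "{ " + date_member(start) + " : " + date_member(end) + " }"


january = date_range_set(date(2008, 1, 1), date(2008, 1, 31))
```

Because the date picker only selects days, a prompt meant to pick months or quarters is usually better served by a dataset-driven list of members at that level.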
Demo <place holder>
Best Practices
- Use the graphical query designer to get started
- Generate fields, parameters & parameter list datasets
- Save queries to script files
Questions?
Thank You
Paul’s Blog: SqlServerBiBlog.com

Resources
- SQL Server 2008 MDX, Bryan C Smith, Ryan Clay, Microsoft Press
- SQL Server 2008 Analysis Services, Scott Cameron, Microsoft Press