Processing Analysis Services Tabular Models Brett Powell
About Me
  BI Consultant, Frontline Analytics
  Author: Power BI Cookbook, Mastering Power BI
  Blog: Insight Quest
  PUG Leader: Boston BI
  Contact Info: @BrettPowell76, Brett.Powell@FrontlineAnalytics.net
Session Agenda
  Tabular Model Objects and Processing Concepts
  Processing Patterns
  Orchestration Tools and Methods
  Tuning and Optimization
Tabular Model Objects and Processing Concepts
What is an Analysis Services Tabular Model?
  BI Semantic Layer: Intuitive; Self-Service; Performance; Version Control; Security; Analytics; Scalability
  Common Microsoft BI Model: Power BI, Azure Analysis Services, SQL Server Analysis Services
  [Figures: Tabular Project in Visual Studio; Power BI Fields List]
Why Process the Tabular Model?
  Maximum Performance: Columnar, In-Memory, Compression
  Maximum Analytical Flexibility: No Restrictions on DAX Functions
  Maximum Data Integration Flexibility: Power Query (M)
  Limited by Size (RAM) and Data Freshness Requirement
  Composite Models and/or Aggregations address common tradeoffs
Tabular Data Storage
  Data is fully memory resident
  Columns are stored separately
  Columns are dictionary encoded (Hash or Value)
  Columns are compressed; compression is driven by the cardinality (uniqueness) of the column
  [Example table. Headers: Delivery, Order ID, Date, Quantity, Price, Amount. Surviving values: 1, 10/17/2018, 3, $2.00, $6.00; 2, 10/18/2018, $2.50, $5.00; $3.00; 4, 10/20/2018]
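The link between cardinality and compression can be illustrated with a toy dictionary encoder. This is only a sketch of the general idea, not the engine's actual implementation; the function name and the sample column values are made up for illustration.

```python
def dictionary_encode(values):
    """Return (dictionary, indexes): the distinct values and, for each row,
    the position of its value in that dictionary."""
    dictionary = []   # distinct values in first-seen order
    positions = {}    # value -> index into dictionary
    indexes = []      # one small integer per row
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indexes.append(positions[value])
    return dictionary, indexes


# Hypothetical sample columns: Quantity repeats, Order ID is unique per row.
quantity = [3, 2, 3, 3, 2, 1]
order_id = [101, 102, 103, 104, 105, 106]

qty_dict, qty_idx = dictionary_encode(quantity)
oid_dict, oid_idx = dictionary_encode(order_id)

print(len(qty_dict), "distinct quantities;", len(oid_dict), "distinct order ids")
```

The low cardinality column needs a dictionary of three entries for six rows, while the unique column needs one dictionary entry per row, which is why columns such as Order Number tend to dominate model size.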
Processing Objectives and Constraints
  Objectives and constraints: Time Window; Availability for Queries; Tabular Server Resources; Query Performance; Source System Resources; Manageability
  Tradeoffs:
    Better Compression vs. More Time to Process
    Reduced Tabular Server Resources vs. Reduced Availability for Queries
    Faster Processing via Parallelization vs. Greater Tabular Server and Source Resources
Processing Phases
  Phase 1: Process Data & Dictionary
    Read Data: Query source system; Load via Power Query; Stream data to Analysis Services if possible
    Encode: Dictionary of distinct values; Value Encoding; Hash Encoding
    Compress: Run-length Encoding (RLE); Pointers to rows with value A (e.g. 17 to 25)
  Phase 2: Make Queryable
    Recalculate: Derived Structures (Relationships, Hierarchies, Calculated Columns and Tables)
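The run-length encoding step can be sketched as follows. This is a simplified illustration of the concept (a value plus the row range it covers), assuming a single column held in a Python list; the function name and data are hypothetical and the engine's real storage format is not shown here.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, start_row, row_count) runs."""
    runs = []
    start = 0
    for value, group in groupby(values):
        count = sum(1 for _ in group)
        runs.append((value, start, count))
        start += count
    return runs


column = ["A", "A", "A", "B", "B", "A"]

unsorted_runs = run_length_encode(column)
sorted_runs = run_length_encode(sorted(column))

print(unsorted_runs)  # [('A', 0, 3), ('B', 3, 2), ('A', 5, 1)]
print(sorted_runs)    # [('A', 0, 4), ('B', 4, 2)]
```

Sorting the rows yields fewer, longer runs, which is the intuition behind experimenting with sort order to improve compression.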
Tabular Model Processing Internals
  Two types of objects processed:
    Data: Source Data, Column Dictionaries
    Derived Structures: Relationships, Hierarchies, DAX Calculated Columns or Tables
  Column dictionaries
  Column segments (data)
  Calculated columns (DAX): Computed after compression
  Hierarchies: User and System; MDX Queries
  Relationships: Size drives query performance
  Partitions: 1+ segments each
  Example: Sales Fact Table with 100M rows
    5 Partitions (2014 – 2018), 20M rows per partition
    Internally: 15 segments (max of 8M rows per segment by default, so each 20M-row partition needs 3 segments)
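The segment arithmetic in the Sales example can be checked with a short calculation. This sketch assumes, as the slide does, a round 8M-row maximum per segment and that segments never span partitions; the function name is made up for illustration.

```python
import math


def segment_count(rows_per_partition, segment_size=8_000_000):
    """Total segments when each partition is split into segments of at most
    segment_size rows (segments do not span partitions)."""
    return sum(math.ceil(rows / segment_size) for rows in rows_per_partition)


five_partitions = [20_000_000] * 5   # 2014 to 2018, 20M rows each
one_partition = [100_000_000]        # same table, unpartitioned

print(segment_count(five_partitions))  # 15
print(segment_count(one_partition))    # 13
```

Each 20M-row partition ends with a partially filled third segment, which is why the partitioned table has 15 segments while the same rows in a single partition would need only 13.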
Processing Operations
  Transactions: Split processing to reduce resource usage
  Parallelization: Enabled by default; Tables and Partitions; Configurable via TMSL
  [Table: Process Operations by Tabular Object. Columns: Operation, Database, Table, Partition. Operations: Full, Data, Clear, Recalc, Default, Add, Defrag. Cells marked Available or Not Available.]
Table Partitions
  Defined Rows of a Table
  1+ partition per table; 1+ segment per partition
  8M rows per segment (default), compressed
  1 CPU core per segment in parallelized queries
  Avoid Over-Partitioning
  Primary Use Case: Reduce processing time and resources; Improve manageability
  Other Use Case: Consolidate source data
  [Figures: Partition Manager in SQL Server Data Tools (SSDT); Code View of Model.bim in SSDT]
Processing Patterns
Processing Patterns Overview
  Patterns are primarily driven by: Processing time window; Availability of model for queries; Available RAM during processing job
  Other Factors: Impact on source systems; Available skills (PowerShell, TMSL); Performance Optimizations; Manageability

  Tabular Processing Job Patterns
                  | Simple          | Intermediate             | Advanced
  Model Size      | Small           | Medium                   | Large
  Transactions    | 1-2             | 2-5+                     | Many
  Operation Types | Full Only       | Mixed: Data & Recalc     | Diverse, Dynamic
  Partitions      | None            | Years, Months, or Weeks  | Days, Multiple Grains
  Scripting       | None to Minimal | Cmdlets or Simple TMSL   | Custom
  Automation      | N/A             | Periodic Maintenance     | Fully Automated
Processing Approach Examples
  Single Process Full of Database: Maximum memory (2X+ of database) and time to process
  Multiple Process Full Transactions: Maintain availability for queries; reduce memory required for processing
  Multiple Process Data and Process Recalc Transactions: Eliminate unnecessary Recalc operations; include Recalc in the same transaction to maintain availability
  Process Clear Transaction followed by Process Full or Data + Recalc: Significantly reduce memory required for processing
Best Practice: Keep Model Queryable
  Eliminate unavailability, even if the refresh process fails
  Process Data and Process Clear transactions result in unavailability
  Two options to implement:
    Process Full operations exclusively
    Include Process Recalc in the same transaction as Process Data or Clear
  Secondary Option: Process Recalc in the transaction immediately following Process Data or Clear
    Unavailability period limited to the duration of Process Recalc
Tabular Model Scripting Language (TMSL)
  Command and Object Model Syntax for Tabular databases
  Compatibility Level 1200+ (SSAS 2016)
  Refresh Command: Type parameter ("full", "dataOnly", …)
  Sequence Command: Batch Mode, multiple operations in a single transaction
  Some model properties exclusive to TMSL
  Store TMSL Scripts in XMLA Files; Reference XMLA files from PowerShell
  Provides full control over processing: Multiple process operations in a single transaction; Control/limit parallelization
  TMSL Reference
  [Figures: Code View of Analysis Services Project in Visual Studio; TMSL in XMLA Query File]
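As a rough illustration of the sequence and refresh commands described above, the sketch below assembles a TMSL script as JSON: a "dataOnly" refresh per partition followed by a "calculate" (Recalc) refresh of the database, all inside one sequence so the model stays queryable. The database, table, and partition names are hypothetical, and the helper function is an assumption for this example rather than part of any library.

```python
import json


def build_refresh_sequence(database, table, partitions, max_parallelism=2):
    """Build a TMSL sequence: dataOnly refresh per partition, then one
    calculate (Recalc) refresh of the database, in a single transaction."""
    data_operations = [
        {
            "refresh": {
                "type": "dataOnly",
                "objects": [
                    {"database": database, "table": table, "partition": name}
                ],
            }
        }
        for name in partitions
    ]
    recalc_operation = {
        "refresh": {"type": "calculate", "objects": [{"database": database}]}
    }
    return {
        "sequence": {
            "maxParallelism": max_parallelism,  # limits parallel processing
            "operations": data_operations + [recalc_operation],
        }
    }


# Hypothetical object names, for illustration only.
script = build_refresh_sequence(
    "SalesModel", "Internet Sales", ["Sales 2017", "Sales 2018"]
)
print(json.dumps(script, indent=2))
```

The printed JSON could be saved to an XMLA file or passed to the server with Invoke-ASCmd; lowering maxParallelism trades processing speed for reduced server and source load.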
Analysis Services PowerShell Cmdlets
  Analysis Services cmdlets included in the SqlServer module
  Invoke-ASCmd to pass custom TMSL to the server
  Can blend PowerShell dynamic logic/variables with TMSL
  Analysis Services PowerShell Reference
  [Figures: Analysis Services Processing Cmdlets in SQL Server Module; In-line TMSL Command via Invoke-ASCmd Cmdlet]
Bonus Demo: Last Processed & Row Counts
  Query DMV to retrieve last refreshed time: Any table in model; Adjust for UTC time
  Query model to retrieve row counts: Pass DAX query to Tabular model
  [Figures: DMV Metadata; DAX Row Count Query; Power Query Editor]
Getting Started with Processing
  SSMS Demo: Queryable State; Process Full; Scripting Out example; Agent Jobs
Tabular Model Processing Examples
  Intermediate Examples: Weekly, Monthly, Yearly Partitions; PowerShell only; PowerShell and TMSL
  Dynamic Partition Processing: Azure AS or SSAS
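One way to think about dynamic partition processing is to derive the names of the partitions to refresh from the current date. The sketch below assumes monthly partitions named like "Sales 2018-10"; that naming convention, the table name, and the function itself are assumptions for illustration, not something defined by Analysis Services.

```python
from datetime import date


def monthly_partitions_to_process(as_of, months_back=1, table="Sales"):
    """Names of the current month's partition and the previous months_back
    partitions, newest first, under a '<table> YYYY-MM' naming convention."""
    names = []
    year, month = as_of.year, as_of.month
    for _ in range(months_back + 1):
        names.append(f"{table} {year}-{month:02d}")
        month -= 1
        if month == 0:          # roll back across a year boundary
            year, month = year - 1, 12
    return names


print(monthly_partitions_to_process(date(2019, 1, 15), months_back=2))
# ['Sales 2019-01', 'Sales 2018-12', 'Sales 2018-11']
```

The resulting names could feed a scripted refresh so that only recent partitions are reprocessed while older partitions are left untouched.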
Orchestration Tools and Methods
SQL Server Integration Services (SSIS)
  Align Processing with Data Warehouse ETL/ELT
  Leverage SSIS features (checkpoints, logging, email…)
  Analysis Services Tasks:
    Analysis Services Processing Task
    Analysis Services Execute DDL Task: Pass custom TMSL Command (inline or file)
  PowerShell script via Execute Process task
  Use with Azure AS or SSAS
  [Figure: SSIS Package with Processing Tasks]
Tuning and Optimization
Bonus Demo: Memory Cost Analysis
  Power BI report to analyze cost of: Power BI Premium; Azure Analysis Services
  Bookmarks and What-if slicers
  Memory sizes and pricing variables embedded in PBIX
  [Figure: Memory Cost Analysis Page: Power BI Premium]
Top Model Optimization Techniques
  Remove or split high cardinality columns
    Low cardinality columns compress better and use less memory
    Examples: Order Number, Timestamp columns
    See SQLBI blog for examples
  Remove or replace calculated columns on fact tables
    Calculated columns do not compress like standard columns
    Move logic to data source or Power Query if possible
  Size partitions to fill segments
    8M+ rows default
    1 CPU core per segment in queries
  Fixed Decimal Number (Currency) over Decimal Number if possible
  Experiment with custom sort order based on importance of columns
    Evaluate with VertiPaq Analyzer; ensure resources are available for sort
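The benefit of splitting a high cardinality timestamp column can be seen by counting distinct values before and after the split. The data below is synthetic (one made-up event every 90 minutes) and only illustrates the cardinality effect; actual memory savings depend on the model and should be measured.

```python
from datetime import datetime, timedelta

# Synthetic timestamps: one event every 90 minutes, starting at midnight.
start = datetime(2018, 10, 17)
timestamps = [start + timedelta(minutes=90 * i) for i in range(10_000)]

# Cardinality of the original column versus the two split columns.
distinct_timestamps = len(set(timestamps))
distinct_dates = len({t.date() for t in timestamps})
distinct_times = len({t.time() for t in timestamps})

print(distinct_timestamps)              # 10000
print(distinct_dates, distinct_times)   # 625 16
```

A single timestamp column needs 10,000 dictionary entries here, while separate date and time columns need 625 and 16, so both split columns have far smaller dictionaries.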
Tabular Optimization Features
  Value Encoding Hint
  IsAvailableInMDX
  Query Memory Limit
  Default Segment Size
  [Figures: Value Encoding Hint in Visual Studio; isAvailableInMdx Property in Visual Studio; Analysis Services Instance Settings in SSMS]
VertiPaq Analyzer: Diagram View in Excel
  Power Pivot for Excel tool for evaluating memory composition of Tabular models
  Retrieves DMV data: DISCOVER_STORAGE views
  Dedicated report pages: Tables, Compression Encoding, Data Types
Don’t forget to join your local PUG to enjoy year-round networking and learning.