Performance Monitoring for SQL Server Analysis Services


1 Performance Monitoring for SQL Server Analysis Services
Stan Geiger #492 | Phoenix 2016

2 Platinum Level Sponsors
Gold Level Sponsors | Venue Sponsor | Keynote Sponsor | Pre-Conference Sponsor

3 Silver Level Sponsors Bronze Level Sponsors

4 About Me
Sr. Product Manager with Idera
Performance Monitoring of the Microsoft BI stack
Backup and Recovery of Microsoft SQL Server
Geek Sync Presenter | Blog Contributor | HSSUG Presenter
Over 25 years of experience: BI/Data Architect, DBA, .Net Developer, Data Analyst

5 1. Architecture 2. Processing 3. Performance Metrics

6 Architecture: Semantic Model

7 Semantic Model: Tabular
Introduced in SQL Server 2012
An installation allows only one model type (OLAP/multidimensional vs. tabular)
Can integrate multiple data sources, not just relational databases
Users are able to work with data in multiple ways
The BI Semantic Model powers both the entire range of client tool experiences and the full spectrum of BI applications that business users and developers can build using the Microsoft BI stack. This spectrum includes the following scenarios:
Personal BI applications that business users create to meet their own specific needs
Team BI solutions that business users create and share with colleagues within small organizations or departments
Corporate BI applications that are developed, managed, and sanctioned by corporate IT and rolled out to a large user base
The model can be thought of conceptually in three layers: data model, business logic and queries, and data access.

8 Semantic Model: Tabular Model
Relational modeling constructs (tables, relationships)
In-memory analytics engine
Higher memory requirements
Greater compression
On-the-fly aggregation
Multiple data sources
The reasons Tabular is often preferred to Multidimensional are performance, maintenance, cost of ownership, and flexibility in data model design. One limit on Tabular adoption is licensing cost: it requires the Business Intelligence or Enterprise edition of SQL Server, while the cheaper Standard edition includes Multidimensional but not Tabular.
Tabular modeling organizes data into related tables. In tabular mode, the xVelocity (formerly VertiPaq) in-memory engine loads tabular data into memory for fast query response. Because of the in-memory engine, memory requirements are much higher. Data compression is greater, up to 10:1. Aggregation is done on the fly rather than pre-aggregated, and multiple data source types are supported. In short, Tabular can process large amounts of data quickly because of the xVelocity engine, but it can run into performance issues when aggregating data.

9 Semantic Model: Multidimensional
OLAP model constructs (cubes, dimensions)
Pre-aggregation
Relational data sources
Many-to-many relationships
Large data volumes
At its core, multidimensional modeling creates cubes composed of measures and dimensions based on data contained in a relational database. The OLAP engine uses the multidimensional model to pre-aggregate large volumes of data to support fast query response times, and it supports many-to-many relationships in the data. Because data is pre-aggregated, large volumes can be handled at query time; however, cube processing can run into issues when large amounts of data require aggregation.

10 Semantic Model: Resources
Comparing Tabular and Multidimensional Solutions (SSAS)
Choosing a Tabular or Multidimensional Modeling Experience in SQL Server 2012 Analysis Services

11 Processing Architecture
Processing is broken into two areas:
Data Refresh
Query Processing/Data Retrieval

12 Processing: Data Refresh
SSMS or scripted (XMLA)
Cube processing is memory intensive
Disk I/O
Changes to the cube can have a big impact on processing
Tabular processing primarily refreshes data
Multidimensional: By default, processing occurs when you deploy a solution to the server. You can also process all or part of a solution, either ad hoc using tools such as Management Studio or SQL Server Data Tools, or on a schedule using Integration Services and SQL Server Agent. When making a structural change to the model, such as removing a dimension or changing its compatibility level, you will need to process again to synchronize the physical and logical aspects of the model.
Tabular: More like a relational database. Similar in processing, but primarily refreshes data and recalculates calculated columns in the model database.

13 Processing: DAX Query Processing (Tabular)
Formula Engine: handles complex expressions; single threaded
Storage Engine: handles simple expressions; executes queries against the tabular database; multithreaded
When a DAX query is sent to a tabular model, a DAX query plan is generated and transformed into commands sent to the xVelocity engine. The DAX Formula Engine performs single-threaded operations, while xVelocity uses one core per segment and can run multithreaded operations. So you want to push requests down to the xVelocity engine rather than the Formula Engine, which favors simpler queries: as query complexity increases, the Formula Engine becomes busier than xVelocity.

14 Processing: MDX Query Processing
The Query Parser has an XMLA listener and parses the query.
The Query Processor prepares the execution plan and caches results in the FE cache for reuse. The FE is single threaded here as well, so each request is processed by one core. The FE requests data from the SE, referred to as subcube data.
The SE handles requests from the Query Processor as follows:
Checks for subcube data in the SE cache.
Checks whether aggregation data is requested; if so, retrieves it from the cube and caches it.
If no aggregation is available, calculates the result and caches it in the SE cache.

15 Processing: Query Processing Caches
Formula Engine: Flat cache, Calculation cache
Storage Engine: Dimension (cube), Measure Group (cube), VertiPaq (tabular)
The FE has two caches: the flat cache (limited to 10% of the TotalMemoryLimit property) and the calculation cache. The SE caches are the dimension cache and the measure group cache.
Query processing: a query is accepted, then parsed by the Formula Engine. The FE retrieves whatever data it can for the query from the FE cache, then requests any data it still needs from the Storage Engine. The SE retrieves whatever it can from the SE caches; if more data is needed, it reads from the file system.

16 Performance Metrics

17 Performance Metrics: Sources
Dynamic Management Views
Performance Counters (SSAS and system)
There are two areas of performance measures: Dynamic Management Views and performance counters. PowerShell can be used to get a list of SSAS counters:
PS C:\> (Get-Counter -ListSet "MSOLAP*").Paths

18 Performance Metrics: Categories
Network, Disk I/O, Memory, CPU, MDX/DAX
You need to monitor the general characteristics of the server itself.

19 Performance Metrics: Network
Network Interface: Bytes Received/sec
Network Interface: Bytes Sent/sec
Network Interface: Output Queue Length
These performance counters help identify whether a network issue exists at the local server level. They show the pressure being applied to local network resources and, for the most part, indicate whether there is a backup in getting data out to requestors.

20 Performance Metrics: Network (SSAS)
Processing: Rows read/sec
Storage Engine Query: Rows sent/sec
For SSAS specifically, we can break the network traffic out to determine what is being sent:
Processing: Rows read/sec – rows read from SQL Server
Storage Engine Query: Rows sent/sec – rate of rows sent to the requestor
In this chart we can see considerably more rows being read than being sent by the SE to the client requestor. At this point we could start to look at the performance of disk I/O, which we will look at next.
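The read-versus-sent comparison above can be sketched as a simple ratio; there is no documented cutoff, so treat any threshold as a local baseline you calibrate yourself:

```python
def read_to_sent_ratio(rows_read_per_sec, rows_sent_per_sec):
    """Ratio of rows read from the relational source to rows the
    Storage Engine sends to clients; a large ratio means the server
    is churning through far more data than it returns."""
    if rows_sent_per_sec == 0:
        return float("inf")
    return rows_read_per_sec / rows_sent_per_sec

# e.g. 50,000 rows read vs. 2,000 sent
print(read_to_sent_ratio(50_000, 2_000))
```

A persistently high ratio is the cue in the slide to shift attention from the network to disk I/O.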

21 Performance Metrics: Disk I/O
Storage Engine Query: Queries from file/sec
Storage Engine Query: Data bytes/sec
Physical Disk: Disk Read Bytes/sec
Cache: Copy Reads/sec
Unlike the database engine, which has its own caching system, SSAS uses the file system cache. So we need to check whether SSAS is even using its Storage Engine cache as opposed to the disk subsystem. Queries from file/sec does not necessarily mean that physical disk I/O is occurring. To determine how much Storage Engine activity is reading from disk as opposed to the Windows file cache, also look at:
MSAS: Storage Engine Query: Queries from file/sec – rate of queries answered from files
MSAS: Storage Engine Query: Data bytes/sec – bytes read from the data file
Physical Disk: Disk Read Bytes/sec – total bytes read from disk
Cache: Copy Reads/sec – rate at which the file system attempts to find data in the cache without accessing the disk
Looking at the chart, file queries per second is high along with data reads (KB) at around 20 to 25 MB. Together with relatively high disk reads (KB), this shows a high degree of data being read from disk as opposed to cache. When memory becomes scarce and working sets are trimmed, the file system cache is trimmed as well; if the cache grows too small, cache-sensitive processes are slowed by disk operations.
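One rough way to relate these counters (an assumption for illustration only: it treats SE file reads as the dominant disk traffic on the volume, which a busy shared server would violate):

```python
def disk_read_fraction(physical_disk_read_bps, se_data_bps):
    """Approximate share of Storage Engine file reads actually
    hitting the physical disk rather than the Windows file cache."""
    if se_data_bps == 0:
        return 0.0
    return min(physical_disk_read_bps / se_data_bps, 1.0)

# 20 MB/s from physical disk against 25 MB/s of SE data reads
print(disk_read_fraction(20_000_000, 25_000_000))
```

A fraction near 1.0 matches the chart's pattern: most SE reads are being satisfied from disk, not from the file system cache.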

22 Performance Metrics: Disk I/O Latency
Physical Disk: Avg. Disk sec/Read
Physical Disk: Avg. Disk sec/Write
Physical Disk: Avg. Disk sec/Read – average time, in seconds, that disk reads took to complete
Physical Disk: Avg. Disk sec/Write – average time, in seconds, that disk writes took to complete
These counters are used to determine the health of the disk subsystem. If the values are consistently above 10 to 30 ms, there will be performance degradation in the disk subsystem. A high value might mean that the system is retrying requests due to lengthy queuing or, less commonly, disk failures.
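The 10-30 ms rule of thumb from the slide can be turned into a small classifier (the band labels are my own, not standard terminology):

```python
def disk_latency_status(avg_sec_per_read, warn_ms=10, bad_ms=30):
    """Classify Physical Disk: Avg. Disk sec/Read (the counter
    reports seconds) against the 10-30 ms rule of thumb."""
    ms = avg_sec_per_read * 1000
    if ms < warn_ms:
        return "healthy"
    if ms <= bad_ms:
        return "watch"
    return "degraded"

print(disk_latency_status(0.004))   # 4 ms read latency
print(disk_latency_status(0.045))   # 45 ms read latency
```

Note the unit trap: the counter is in seconds, so a reading of 0.045 is 45 ms, well past the degradation band.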

23 Performance Metrics: Memory
Overall memory used by SSAS
SSAS memory efficiency
SSAS memory activity
These are the three key areas to look at concerning memory pressure on SSAS.

24 Performance Metrics: Memory Usage
Memory: Cleaner Memory KB
Memory: Cleaner Memory shrinkable KB
Memory: Cleaner Memory nonshrinkable KB
When it comes to memory management, SSAS is entirely self-governing; unlike SQL Server, it doesn't consider external low physical memory conditions (which can be signaled by Windows) or low VAS memory. This may be partly because SSAS is already much more subject to Windows' own memory management than SQL Server is, since Analysis Services databases are a collection of files on the file system and can make heavy use of the file system cache, whereas SQL Server does not.
SSAS has a special memory "cleaner" background thread that constantly determines whether it needs to clean up memory, based on the amount of memory used. SSAS has two general categories of memory, shrinkable and nonshrinkable, and they work pretty much like they sound: shrinkable memory can easily be reduced and returned to the OS, while nonshrinkable memory is generally used for more essential system-related structures, such as memory allocators and metadata objects, and is not easily reduced.
Memory: Cleaner Memory KB – amount of memory, in KB, known to the background cleaner
Memory: Cleaner Memory shrinkable KB – amount of memory, in KB, subject to purging by the background cleaner
Memory: Cleaner Memory nonshrinkable KB – amount of memory, in KB, not subject to purging by the background cleaner
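Since shrinkable plus nonshrinkable make up the cleaner-tracked total, the purgeable share falls out directly; a sketch:

```python
def shrinkable_share(shrinkable_kb, nonshrinkable_kb):
    """Fraction of cleaner-tracked memory the background cleaner
    could actually purge and return to the OS."""
    total_kb = shrinkable_kb + nonshrinkable_kb
    return shrinkable_kb / total_kb if total_kb else 0.0

# 6 GB shrinkable vs. 2 GB nonshrinkable
print(shrinkable_share(6_000_000, 2_000_000))
```

A low share means most of the instance's memory is essential structures the cleaner cannot reclaim under pressure.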

25 Performance Metrics: Memory Limits
Memory: Memory Limit Low KB
Memory: Memory Limit High KB
SSAS uses memory limit settings to determine how it allocates and manages its internal memory. Memory\LowMemoryLimit defaults to 65% of total available physical memory on the machine (75% on AS2005), and Memory\TotalMemoryLimit (also sometimes called the high memory limit) defaults to 80%. Once memory usage hits the low limit, cleaner threads kick in and start moving data out of memory in a relatively non-aggressive fashion. If memory hits the total limit, the cleaner goes into crisis mode: it spawns additional threads and gets much more aggressive about memory cleanup, which can dramatically impact performance. The graph shows the point at which the cleaner process starts moving data out of memory.
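The two-threshold behavior described above can be sketched as follows, using the default 65%/80% limits (the state names are my own labels for the slide's "non-aggressive" and "crisis mode" phases):

```python
def cleaner_state(used_bytes, physical_bytes,
                  low_pct=0.65, total_pct=0.80):
    """Classify SSAS memory usage against the default
    LowMemoryLimit (65%) and TotalMemoryLimit (80%) thresholds."""
    low = physical_bytes * low_pct
    total = physical_bytes * total_pct
    if used_bytes < low:
        return "idle"
    if used_bytes < total:
        return "gradual cleanup"
    return "aggressive cleanup"

gb = 1024 ** 3
print(cleaner_state(50 * gb, 64 * gb))  # ~78% of RAM in use
```

At roughly 78% of physical memory, the instance sits between the two limits, so the cleaner is trimming gently; crossing 80% triggers the aggressive, performance-impacting mode.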

26 Performance Metrics: Memory Efficiency
Calculation cache lookups/sec, hits/sec
Flat cache lookups/sec, hits/sec
Dimension cache lookups/sec, hits/sec
Measure group cache lookups/sec, hits/sec
We talked about the Formula Engine (FE) and Storage Engine (SE) caches: the calculation cache and flat cache belong to the FE, and the dimension and measure group caches to the SE. To determine how efficient a cache is, look at the ratio between lookups and cache hits. Graph 1 shows that the ratio of lookups to hits for the calculation cache is near 1, which means data is coming from cache and not disk. Graph 2 shows that the ratio of lookups to hits for the dimension cache is fairly high, which we can also see from the lookup and hit curves. This means that SSAS is more than likely having to get data from disk because the SE cache is not being utilized.

27 Performance Metrics: Memory/Cache Improvements
Configure memory limits appropriately
Warm the cache
Improperly configured memory limits are the most common cause of performance issues. A cold cache could also be the culprit: the cache is flushed by cube processing, so the first user to hit the cube after processing completes gets degraded performance, with queries taking a long time as data is read from disk. The solution is to warm the cache after processing the cube.
Increase the size of the paging files on the Analysis Services server, or add memory, to prevent out-of-memory errors when the amount of virtual memory allocated exceeds the amount of physical memory on the server.
Reduce the Memory\LowMemoryLimit property below 75 percent when running multiple instances of Analysis Services or other applications on the same computer.
Reduce the Memory\TotalMemoryLimit property below 80 percent when running multiple instances of Analysis Services or other applications on the same computer.
Keep a gap between the Memory\LowMemoryLimit and Memory\TotalMemoryLimit properties – 20 percent is frequently used.

28 Performance Metrics: CPU
Processor: % Processor Time
System: Context Switches/sec
System: Processor Queue Length
If the CPU is overburdened, all processes on the server are impacted. There is no rule of thumb for what CPU utilization "should" be; you want to get the most out of the CPU without overloading the system, so look for extended periods where the CPU is at 100% utilization. We also need to look at the number of parallel processes being spawned on the server: when the number of threads remains high for a sustained period, it's an indication of degraded performance.
System: Context Switches/sec – the rate at which the CPU is switching between threads
System: Processor Queue Length – the number of threads waiting to be processed
The graph shows possible contention, as both context switches and queue length are high. This could mean the CPU is switching a lot of threads in and out while a large number of threads wait in the queue.

29 Performance Metrics: CPU (SSAS Thread Pools)
Threads: Query pool queue length
Threads: Query pool busy threads (FE)
Threads: Processing pool busy I/O job threads (SE)
A couple of SSAS performance counter categories give a key indication of processor utilization within SSAS; these counters deal with the formula and storage engines.
Threads: Query pool (FE) – number of busy threads in the query pool
Threads: Processing pool (SE) – number of threads running I/O jobs in the processing pool
Threads: Query pool job queue length – number of jobs in the queue of the query thread pool
The query pool reflects activity in the FE; the processing pool reflects activity within the SE. This gives us a view of thread activity in both engines. If the thread queue length is non-zero, SSAS does not have enough threads to satisfy the query requests.
You want CPU utilization to be as efficient as possible, and one measure of efficiency is queue length vs. % CPU utilization. The chart shows extended periods with a high number of threads in the queue while CPU utilization is below 30%. If this pattern occurs over time, you will want to adjust the MaxThreads and/or CoordinatorExecutionMode properties of the SSAS instance. MaxThreads controls the maximum number of threads included in the thread pool; CoordinatorExecutionMode controls the maximum number of parallel jobs per CPU.
Relief: MaxThreads, CoordinatorExecutionMode
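The "queued jobs with idle CPU" pattern the slide describes can be sketched as a check over paired samples; the 30% ceiling is the slide's example value, not a fixed limit:

```python
def thread_starved(queue_lengths, cpu_utils, cpu_ceiling=30):
    """Flag samples where jobs sit in the query pool queue while
    overall CPU utilization stays below `cpu_ceiling` percent --
    a hint that MaxThreads may be set too low."""
    return any(q > 0 and cpu < cpu_ceiling
               for q, cpu in zip(queue_lengths, cpu_utils))

print(thread_starved([0, 4, 6], [20, 25, 22]))  # queued + idle CPU
print(thread_starved([0, 0, 0], [90, 85, 95]))  # busy but not queued
```

When this fires consistently over time, the slide's remedy applies: raise MaxThreads and/or tune CoordinatorExecutionMode.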

30 Performance Metrics: Query Processing
MDX: Total cells calculated
MDX: Number of calculation covers
MDX: Total Sonar subcubes
"Cell by cell" vs. "block oriented"
MDX is the query language used to query a multidimensional cube, and just as with SQL, there are ways to write unoptimized code. In SQL we want set-based operations as much as possible, as opposed to Row By Agonizing Row (RBAR) operations; MDX has a similar distinction, referred to as "cell by cell" vs. "block oriented". To see whether MDX queries are doing a high number of cell-by-cell calculations, look at these counters:
MDX: Total cells calculated – total number of cell properties calculated. A high value for a single query indicates that the formula engine resolved cell values on a cell-by-cell basis.
MDX: Number of calculation covers – total number of evaluation nodes built by MDX execution plans, including active and cached.
MDX: Total Sonar subcubes – the number of Sonar subcubes is smaller when the formula engine can retrieve data from the caches rather than from disk.
High numbers for any of these metrics indicate that running queries are using cell-by-cell calculations.
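Since these are cumulative counters, the per-query signal is the delta across one query's run; a sketch, where the 100,000-cell threshold is purely an assumed illustration, not a documented limit:

```python
def cell_by_cell_suspect(cells_before, cells_after,
                         threshold=100_000):
    """Delta of MDX: Total cells calculated across a single query;
    a large jump suggests the formula engine fell back to
    cell-by-cell evaluation instead of block computation."""
    return (cells_after - cells_before) > threshold

print(cell_by_cell_suspect(1_000, 2_500_000))  # huge per-query delta
print(cell_by_cell_suspect(1_000, 5_000))      # modest delta
```

Calibrate the threshold against your own cube: compare deltas for known-fast queries against the ones users complain about.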

31 Performance Metrics: Query Processing
MDX: Total recomputes
MDX: Total NON EMPTY unoptimized
MDX: Total NON EMPTY for calculated members
These metrics show the number of calculation operations using a non-optimized algorithm; if the numbers are increasing, take a look at the queries for optimization.
MDX: Total recomputes – a non-zero value indicates errors in the query calculation. Errors cause recomputes, and if too many recomputes occur, the engine reverts to cell-by-cell calculation in an effort to overcome them.
MDX: Total NON EMPTY unoptimized – total number of times the unoptimized NON EMPTY algorithm is used.
MDX: Total NON EMPTY for calculated members – total number of times the NON EMPTY algorithm loops over calculated members.

32 Performance Metrics: Query Optimization
Use MDX constructs the optimizer resolves to block-oriented processing
Proper aggregations improve query performance
Use the NON EMPTY option in MDX queries where possible
An effective partition strategy improves Storage Engine performance
To improve MDX performance, the two key things to look at are proper aggregations in the cube and a good partition strategy. Partitioning improves Storage Engine performance; aggregations improve query performance because the data has been pre-calculated. Performance issues in the Formula Engine are usually the result of cell-by-cell calculation, so avoiding cell-by-cell calculations in queries, along with optimizing for non-empty cells, will increase FE performance.

33 Performance Metrics: Cube Processing
Processing: Rows written/sec
Proc Aggregations: Rows created/sec
Proc Indexes: Rows/sec
Track these over time to determine the effectiveness or impact of changes to the cube structure. Processing typically involves loading new data and re-aggregating it. As you can imagine, during processing we are looking primarily at the Storage Engine; the SE is multithreaded, so resources are consumed as threads are spawned. The key metrics to look at when processing a cube are:
Processing: Rows written/sec – rate of rows written during processing
Proc Aggregations: Rows created/sec – rate at which aggregation rows are created during processing
Proc Indexes: Rows/sec – rate of rows from MOLAP files used to create indexes
If these values are non-zero, the cube is being processed. Historical tracking of these values shows the performance of cube processing over time, which is especially helpful in determining the effect of changes to cube processing.
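The historical-tracking idea above can be sketched as a baseline comparison; the 80% tolerance is an assumed cutoff you would tune to your own processing windows:

```python
def processing_slowdown(baseline_rows_per_sec, samples,
                        tolerance=0.8):
    """Compare recent cube-processing throughput (e.g. Processing:
    Rows written/sec samples) against a historical baseline;
    True when the average falls below `tolerance` of baseline."""
    avg = sum(samples) / len(samples)
    return avg < baseline_rows_per_sec * tolerance

print(processing_slowdown(100_000, [60_000, 70_000, 65_000]))
print(processing_slowdown(100_000, [95_000, 100_000, 105_000]))
```

Run the same comparison after each cube structure change: a sustained drop in rows/sec isolates which change made processing more expensive.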

34 Email: stan.geiger@idera.com Twitter: @MSBI_Stan

