Performance Investigations with Analysis Services 2012


1 Performance Investigations with Analysis Services 2012
Akshai Mirchandani Principal Software Design Engineer

2 Session Overview
- Tabular versus Multidimensional: a very brief overview
- Performance investigation tools and techniques
- Formula engine performance
- Frequently asked questions/issues
- Common pitfalls for processing performance
- Tuning Analysis Services for high-end hardware
- Q&A

3 Tabular versus Multidimensional
Common to both:
- Configuring overall server memory
- Heap type
- Tuning query thread pools
- Data source performance
- Tuning MDX queries

4 Tabular versus Multidimensional
Multidimensional-specific:
- Optimizing the disk system
- Processing performance
- Parallel processing of partitions
- Performance of building aggregations
- Performance of calculated members, scope assignments, unary operators, and other advanced calculations
- Tuning I/O thread pools (SSAS 2012)
- Multi-user settings (CoordinatorQueryBalancingFactor and CoordinatorQueryBoostPriorityLevel)

5 Tabular versus Multidimensional
Tabular-specific:
- Memory, memory, memory…
- DAX query performance
- No parallel processing of partitions (yet)
- Query performance versus processing performance: adjusting compression levels lets you favor processing time over query time and vice versa
- Hardware: CPU and memory performance matter much more for an in-memory system – just as disk performance is more of an issue for MOLAP
- Often the smaller systems show significantly better performance: no NUMA issues, and the server-grade CPUs take longer to become available

6 Performance Investigation Tools
Profiler
XEvents
- Lighter weight than Profiler
- You can merge the events with a Windows ETW trace and look at CPU and I/O together with the SSAS events
- Currently the set of XEvents is the same as the Profiler events, but in the future newer XEvents should show up
MDX
- Use MDX queries and MDX calculation modifications to identify and isolate performance problems
DMVs
- System DMVs like DISCOVER_PARTITION_STAT, DISCOVER_MEMORYUSAGE, and DISCOVER_OBJECT_ACTIVITY
ETW (Event Tracing for Windows)
- A Windows capability for efficiently gathering fine-grained information about CPU, I/O, wait states, etc.
- Can show which partition/aggregation files are “hot”, so you can perhaps tune them to be on fast disks

7 Demo
- XEvents
- Profiler
- SSAS 2012 calculation events
- Sample tool

8 Identifying Query Problems
- Identify the queries that take a long time to run
- Test each query in isolation
- Determine whether the issue is primarily in the Formula Engine (FE) or the Storage Engine (SE)
- Test the parts of the query
- Be sure to test both cold and warm cache, clearing the cache before each cold-cache run
- After identifying individual query performance, also run concurrent query tests
These are the basic steps for identifying query problems.

9 Identifying the Long Duration Queries
- Run a trace that saves to a database table
- Select the TOP X rows, filtered to the Query End event (EventClass = 10), ordered by Duration descending
- Include the TextData and CurrentTime fields
- To identify shorter queries that run often and burn CPU in aggregate: GROUP BY the TextData field and order by SUM(Duration) descending
In many situations, a problematic query is identified by a user request – a particular report does not run well, or browsing the cube in a particular way is very slow. You can also run a trace to help you identify the longest-running queries.
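The steps above can be sketched in T-SQL. This assumes the trace was saved to a table named TraceTable with the standard trace columns (EventClass, Duration, TextData, CurrentTime); the table name is illustrative, and Duration is in milliseconds:

```sql
-- Longest individual queries:
SELECT TOP 10 TextData, Duration, CurrentTime
FROM TraceTable
WHERE EventClass = 10          -- Query End
ORDER BY Duration DESC;

-- Short queries that run often and burn CPU in aggregate
-- (TextData is ntext in trace tables, so cast before grouping):
SELECT TOP 10
       CAST(TextData AS NVARCHAR(MAX)) AS QueryText,
       COUNT(*)                        AS Executions,
       SUM(Duration)                   AS TotalDurationMs
FROM TraceTable
WHERE EventClass = 10
GROUP BY CAST(TextData AS NVARCHAR(MAX))
ORDER BY TotalDurationMs DESC;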

10 Determining Storage Engine or Formula Engine
- Run a trace while testing the query
- Sum the Duration field of the Query Subcube events (EventClass = 11): this is the time the query spends in the Storage Engine
- Subtract that sum from the Duration of the Query End event (EventClass = 10): the remainder is the time spent in the Formula Engine
- Also look at the count of subcube events: many events usually indicate an FE issue
- Do this with both a cold cache and a warm cache
- Focus your effort accordingly: if less than 30% of the query duration is in the SE, focus on the FE; if 30% or more is in the SE, rule out the SE first before focusing on the FE
This gives you the basic information – is more time being spent in the Storage Engine or the Formula Engine, and which one is causing the slowdown?
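The arithmetic above is simple enough to sketch. This is an illustrative helper, not part of any SSAS tooling; it takes (event_class, duration_ms) pairs from a trace for a single query:

```python
def split_fe_se(events):
    """Split one query's time into Storage Engine and Formula Engine portions.

    events: list of (event_class, duration_ms) pairs from the trace.
    Event class 11 = Query Subcube (one per SE request),
    event class 10 = Query End (total query duration).
    """
    se_ms = sum(d for ec, d in events if ec == 11)       # total SE time
    total_ms = next(d for ec, d in events if ec == 10)   # whole query
    fe_ms = max(total_ms - se_ms, 0)                     # remainder is FE
    subcubes = sum(1 for ec, _ in events if ec == 11)    # many => FE issue
    focus = "SE" if se_ms >= 0.30 * total_ms else "FE"   # the 30% rule
    return {"total_ms": total_ms, "se_ms": se_ms, "fe_ms": fe_ms,
            "subcube_events": subcubes, "focus": focus}

# Example: three subcube scans totalling 100 ms inside a 1000 ms query –
# only 10% SE, so the investigation should focus on the Formula Engine.
print(split_fe_se([(11, 40), (11, 35), (11, 25), (10, 1000)]))
```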

11 Testing the Parts of the Query
- Comment out all script commands in the MDX script
- If the query is slow with the full MDX script but fast with an empty one, use the MDX Script Performance Analyzer to identify the problem assignment
- It may be necessary to start from a stripped-down cube, with all of the following removed:
  - The MDX script, except for the Calculate command
  - All unary operators
  - All measure expressions
  - All custom member formulas
  - All semi-additive measures
  - All custom rollup properties
- On SQL Server 2008 R2, run Profiler: it has new trace events for tracking resource usage and locks
- On SQL Server 2012, look at the new advanced calculation trace events; note that they are expensive, so avoid turning them on in production!
There are many steps in isolating the problem. It may be necessary to remove all types of calculations from the cube to isolate exactly what is causing the slow performance.

12 Checking on Partitions and Aggregations
- If the query spends the majority of its time in the Storage Engine, examine the partitions and aggregations
- Look in the trace for the partitions the query is hitting, to ensure that no unnecessary partitions are being queried
  - For example, if you are querying only 2011 data, the query should not also hit the 2009 and 2010 partitions
  - If it does, is there something wrong with the MDX? Or with the slice or the data within the partition?
- Look in the trace to see whether any aggregations are being used by the query
  - Consider using the Query Subcube Verbose event to understand more details
  - If no aggregations are being used, consider adding appropriate aggregations for the query
These are the situations where one should work on optimizing the partitioning and/or aggregation strategy.

13 Common Formula Engine Slowdowns
- Block mode – won’t go into this
  - Lots of discussion in the performance guide
  - Stop by the SQL Clinic if you have questions
- Rewriting crossjoins to put hierarchies from the same dimension together
- Non Empty
- Removing arbitrary sets
- Not overwriting an empty cell with a non-null value

14 Putting Hierarchies From the Same Dimension Together
Sometimes you see: [Product Category] * [Year] * [Product Subcategory] * [Product]
Why not do this instead? [Product Category] * [Product Subcategory] * [Product] * [Year]
Advantages:
- Auto-Exists can optimize the latter pattern to query each dimension independently
- If the server has to do “cross-dimensional auto-exists”, there is a huge performance hit – particularly if there is an arbitrary shape in the subselect, which requires a different algorithm that can be much more expensive
- There were performance optimizations in 2008 R2 and 2012 to improve this, but the rewrite still helps
On occasion it’s unavoidable. Usually it’s a matter of educating Excel users to “first drag all the product attributes, and then drag Years” – or better, hide the attribute hierarchies and only give them a user hierarchy.
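In MDX, the two orderings look like this (the Adventure Works-style names are illustrative):

```mdx
-- Slower: Year interleaved between the Product attributes forces
-- cross-dimensional auto-exists
SELECT [Measures].[Sales Amount] ON COLUMNS,
       [Product].[Category].[Category].MEMBERS
     * [Date].[Calendar Year].[Calendar Year].MEMBERS
     * [Product].[Subcategory].[Subcategory].MEMBERS
     * [Product].[Product].[Product].MEMBERS ON ROWS
FROM [Adventure Works]

-- Faster: the Product attributes are grouped together, so auto-exists
-- can be applied to each dimension independently
SELECT [Measures].[Sales Amount] ON COLUMNS,
       [Product].[Category].[Category].MEMBERS
     * [Product].[Subcategory].[Subcategory].MEMBERS
     * [Product].[Product].[Product].MEMBERS
     * [Date].[Calendar Year].[Calendar Year].MEMBERS ON ROWS
FROM [Adventure Works]
```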

15 Non Empty
- Related to auto-exists
- Two algorithms: naïve and express
  - Trace events show which algorithm is used
- Issues:
  - Non Empty can be expensive, but not using Non Empty is usually not feasible
- Watch out for this query pattern: Non Empty + arbitrary shapes in the subselect + calculations
  - This causes the server to first apply auto-exists, and then apply Non Empty
  - Applying auto-exists can be *very* expensive
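A sketch of the expensive pattern, with illustrative names ([YoY Growth] is a hypothetical calculated measure):

```mdx
-- NON EMPTY over a large crossjoin: with a calculated measure on the
-- columns axis, emptiness cannot be read straight from fact data,
-- which pushes the server toward the slower Non Empty path
SELECT { [Measures].[YoY Growth] } ON COLUMNS,
NON EMPTY [Customer].[Customer].[Customer].MEMBERS
        * [Product].[Product].[Product].MEMBERS ON ROWS
FROM [Adventure Works]
```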

16 Arbitrary Shapes
- An arbitrary shape is any set that cannot be expressed as a crossjoin of members from the same level of an attribute hierarchy
- Examples of arbitrary sets:
  - {(Food, USA), (Drink, Canada)}
  - {customer.geography.USA, customer.geography.[British Columbia]}
- Arbitrary shapes can only use the query cache if they are used in a subselect, in the WHERE clause, or in a calculated member
- Note that an arbitrary shape on the query axis does not limit the use of any cache
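The slide’s first example written as a subselect (hierarchy and cube names are illustrative): the two tuples do not form a crossjoin of same-level members, so the set is an arbitrary shape.

```mdx
SELECT [Measures].[Sales Amount] ON COLUMNS
FROM ( SELECT { ([Product].[Category].&[Food],  [Geography].[Country].&[USA]),
                ([Product].[Category].&[Drink], [Geography].[Country].&[Canada])
              } ON COLUMNS
       FROM [Sales] )
```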

17 Processing Pitfalls (Dimensions)
- SELECT DISTINCT …
  - Do you have appropriate indexes in the relational database?
- Are you using the fastest provider? Wherever possible, use a native provider (except…)
- Have you optimized data access, e.g. the network packet size?
- Attribute relationships? The “key” attribute is the most expensive part of dimension processing – can you optimize it?
- Consider/test processing with ByTable (versus ByAttribute)
  - Note that I don’t actually recommend it, but it can make a positive difference in some scenarios
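During ByAttribute dimension processing, the server issues queries of roughly the shape below for each attribute; the table and column names here are illustrative. A covering index lets the relational engine answer the DISTINCT from the index alone:

```sql
-- Roughly the shape of the per-attribute processing query:
SELECT DISTINCT ProductSubcategoryKey, ProductSubcategoryName
FROM dbo.DimProduct;

-- A covering index for that query:
CREATE NONCLUSTERED INDEX IX_DimProduct_Subcategory
ON dbo.DimProduct (ProductSubcategoryKey, ProductSubcategoryName);
```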

18 Processing Pitfalls (Dimensions)
Why is ProcessUpdate so slow?
- It has to fetch *all* the data in the dimension
  - Check which members already exist, and update them (tracking whether they changed)
  - If a new member shows up, add it
- It has to invalidate indexes/aggregations in partitions that reference the dimension
- It then has to rebuild those indexes/aggregations
Optimizations? Rigid attribute relationships help here, but hurt flexibility. Discussion…
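For reference, a ProcessUpdate is issued with an XMLA command like the following (database and dimension IDs are illustrative); the flexible aggregations and indexes it invalidates must then be rebuilt, e.g. with a follow-up ProcessIndexes on the affected measure groups:

```xml
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW</DatabaseID>
    <DimensionID>Dim Product</DimensionID>
  </Object>
  <Type>ProcessUpdate</Type>
</Process>
```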

19 Processing Pitfalls (Partitions)
- Processing in parallel
  - MaxActiveConnections – enough threads?
- Speed of reading data
  - How fast is your data source?
  - Speed of looking up dimension keys – too many dimensions?
- Building indexes and aggregations
  - Quota limitations
  - DataStorePageSize? NUMA?
  - Reduce aggregations? Number and size…
  - Reduce indexes?

20 Tuning AS for High End Hardware (1)
- What is your scenario? Throughput? Processing window? Query response time? All of the above? Sigh…
- Lots of effort went into this area for SSAS 2012; the fruits of it also show up in the 2008 R2 Operations/Performance guides
- Windows heap instead of AS heap
  - For machines with lots of cores and active threads, the Windows heap gives better performance
  - The AS heap is faster in lower-concurrency scenarios
  - The Windows heap is the new default in SSAS 2012
- DataStorePageSize/DataStoreHashPageSize
  - Pages allocated from the 8 KB page pool show contention
  - For large results (e.g. processing aggregations/indexes), this can hurt significantly
  - Using 64 KB pages avoids this contention
  - In SSAS 2012, this size is auto-selected – it picks 64 KB for storage engine caches
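On pre-2012 builds these can be set in msmdsrv.ini. The excerpt below is a hypothetical sketch – verify the exact setting names and their location in the file against the Operations Guide for your build, and back up the file before editing:

```xml
<MemoryHeapType>2</MemoryHeapType>            <!-- 2 = Windows heap, 1 = AS heap -->
<HeapTypeForObjects>0</HeapTypeForObjects>    <!-- object allocations follow MemoryHeapType -->
<DataStorePageSize>65536</DataStorePageSize>          <!-- 64 KB pages instead of 8 KB -->
<DataStoreHashPageSize>65536</DataStoreHashPageSize>
```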

21 Tuning AS for High End Hardware (2)
- NUMA (Non-Uniform Memory Access)
  - High-end hardware with lots of cores is usually NUMA, which causes slowdowns: “I upgraded from 16 cores to 32 cores and my performance is about the same (or worse)”
  - Some work was already done in 2005 – an attempt was made to return pages from “nearby” memory – and more tuning was done in SSAS 2012
- The Windows file system cache is affected by the same problem
  - If a partition file is first scanned on NUMA node 0, the cache pages are allocated from NUMA node 0 memory
  - Later accesses from any other NUMA node pay a greater price to access this “distant” memory
- SSAS 2012 added a new I/O Process thread pool
  - The I/O Process thread pool handles all segment scans
  - Under the covers, multiple I/O Process thread pools are created – one per NUMA node (**)
  - A partition chooses one of those thread pools and tries to stick with it, thus attempting to avoid the problem
** Caveats exist – by default, the thread pools are split only if there are >= 4 NUMA nodes

22 Tuning AS for High End Hardware (3)
Other bottlenecks:
- Resource monitoring can be turned off with a configuration setting
  - The results of some DMVs will become invalid (no error is raised)
- Analysis Services can be configured to open files in RANDOM mode
  - This changes the way Windows handles the pages in the file system cache
  - It can produce good performance gains on high-end hardware
  - Risk: only do this if the machine has enough memory – the setting causes pages to stay in memory longer, and the system can start choking
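A hypothetical msmdsrv.ini sketch for the two settings above – confirm the names and defaults against the documentation for your build before changing anything:

```xml
<ResourceMonitoringEnabled>0</ResourceMonitoringEnabled>  <!-- some DMV results become invalid -->
<RandomFileAccessMode>1</RandomFileAccessMode>            <!-- open files in RANDOM mode -->
```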

23 Additional Resources
- Analysis Services Maestros training
- Analysis Services Performance Guide
- Analysis Services Operations Guide
- SQL CAT team

24 Q&A


