Indexing Strategy
Steve Hood, SimpleSQLServer.com
Target Audience: Comfortable creating indexes, but know you could do better
- Clustered and Nonclustered Indexes ONLY
- No In-Memory OLTP
- No Columnstore

Entry-Level Blogging Challenge
Tim Ford – Entry-Level Contest
- Write one entry-level post per month
- Link to Tim’s Post
- Mention it on Twitter #sqlpass and #entrylevel
- Absolutely perfect for a #NewSQLBlogger!!!
If you can’t explain it simply, you don’t understand it well enough. -Albert Einstein
Indexing Strategy – @SteveHoodSQL

SQL Server Capacities
Database Engine Object            Maximum Size / Number
Clustered indexes per table       1
Nonclustered indexes per table    999
Columns per index key             16
Bytes per index key               900
Nested trigger levels             32
Parameters per stored proc        2,100
Database Size                     524,272 TB

SQL Server Prod OLTP Capacities
Database Engine Object            Maximum Size / Number
Clustered indexes per table       1
Nonclustered indexes per table    <5 without peer review
Columns per index key             <4 without peer review
Bytes per index key               Narrow / useful balance
Nested trigger levels             Triggers are how you shoot yourself
Parameters per stored proc        Be kind
Database Size                     Don’t make me say Petabyte

Indexing Goal
Have as few indexes as possible efficiently referenced by as many pertinent, well-tuned, consistently written queries as is reasonable.

Single Query Performance?
- Index for your database, not your query
- Slow Queries
  - Excessive amount of work
  - Waits on resources (I/O)
  - Query is not tuned
  - Query is not written consistently
  - Query is not an OLTP query
  - Not because the index isn’t “perfect”
The strategy is a lot like civil engineers designing a highway system. You’re not worried about a specific point A to point B trip; you’re worried about how many different trips can benefit the most from your design.

How?
- Key column order
  - Based on reuse, not cardinality
- Intelligent use of key lookups
  - Eliminate them when excessive
  - Accept them when elimination is excessive
- Compression
  - Must-use feature, just not everywhere
  - Enterprise Edition

Key Column Order
- What predicates do many queries share?
- Look in the plan cache for similarities
  - Not everything goes into or stays in plan cache

Key Column Order
- The index did a decent job
- Slow Queries
  - Excessive amount of work
  - Waits on resources (I/O)
    - Already in memory because other queries used it, too

Key Lookups
- Can be good
- Can kill you
- Nested Loop getting rows from Clustered Index
  - A couple rows is fine
  - A million rows is noticed by users and the buffer pool

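A rough back-of-the-envelope estimate shows why a couple of rows is fine and a million is not. The sketch below assumes each key lookup walks the clustered index from root to leaf, costing about one logical read per index level. The depth of 3 and the function name are illustrative assumptions, not measurements from any real system.

```python
def estimated_lookup_reads(rows: int, clustered_index_depth: int = 3) -> int:
    """Rough logical-read estimate for key lookups.

    Assumes every row returned by the nonclustered index triggers one
    root-to-leaf traversal of the clustered index (hypothetical depth of 3).
    """
    return rows * clustered_index_depth


for rows in (2, 1_000, 1_000_000):
    print(f"{rows:>9,} rows -> ~{estimated_lookup_reads(rows):,} logical reads")
```

The real cost depends on the actual index depth and on how many of those pages are already in the buffer pool, so treat this only as a way to reason about scale.
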
Compression
- Costs CPU
  - Makes data modifications longer, more expensive
  - Makes data access longer, more expensive
- Saves CPU, I/O, disk space
  - Makes data modifications faster, cheaper
  - Makes data access faster, cheaper
- About 25% savings is always worth testing
  - 25% smaller than uncompressed for row-level
  - 25% smaller than row-level for page-level

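The 25% rule of thumb above can be written down as a small check. This is a minimal sketch: the function name, the return labels, and the sizes in the usage lines are all made up for illustration, and the size estimates are assumed to come from whatever estimation method you already use.

```python
def compression_worth_testing(uncompressed_mb: float, row_mb: float,
                              page_mb: float, threshold: float = 0.25) -> str:
    """Apply the ~25% rule of thumb to estimated index sizes.

    Returns "PAGE", "ROW", or "NONE" as the level worth testing.
    """
    row_saving = 1 - row_mb / uncompressed_mb   # row-level vs. uncompressed
    page_saving = 1 - page_mb / row_mb          # page-level vs. row-level
    if page_saving >= threshold:
        return "PAGE"
    if row_saving >= threshold:
        return "ROW"
    return "NONE"


# Hypothetical size estimates in MB: (uncompressed, row-level, page-level)
print(compression_worth_testing(1000, 700, 500))  # page saves ~29% over row
print(compression_worth_testing(1000, 700, 650))  # only row clears 25%
print(compression_worth_testing(1000, 900, 850))  # neither clears 25%
```

The result only says which level is worth testing. The CPU trade-off still has to be measured under a real workload.
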
Where Do I Start?
Pick a Table
- Most Expensive Queries or Extended Events
  - What tables are read the most by the biggest queries
- Query the Buffer Pool
  - Largest tables / indexes in memory
- Index Usage Stats
It doesn’t matter how or where you start, as long as you start. When you do, it’s important to change only a couple of indexes at a time so you can find and roll back issues more easily.

Most Expensive Queries or Extended Events
- Queries that need the most help
- Still indexing for the database, not the query
- Run with SET STATISTICS IO ON
  - Which table is read the most?
  - What filters are on that table?
    - WHERE clause
    - JOINs
  - What columns are being returned?

Query the Buffer Pool
- What is using a lot of memory?
  - Often because it’s not using great indexes
- Points you towards tables and indexes

Index Usage Stats
- What large indexes have scans against them?
- Where are there many key lookups?

I Found a Table or Index, Now What?
- Query the Plan Cache
  - Not perfect, not everything is in cache
  - Look at execution plans from large queries not in the cache if you know of any
- What patterns do you see?

Patterns in Cache for Key Columns
- Pick an important or large query
- Look at equality predicates
  - WHERE X = Y
  - JOIN ON X = Y
- What else uses some or all of the same equality predicates?
- Can finish off with a single inequality
  - <>, BETWEEN, >, <, NOT IN

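One way to spot shared equality predicates is simply to count them. The sketch below assumes you have already collected, by hand or by script, the equality columns each cached query uses against one table. The query names and column names are hypothetical.

```python
from collections import Counter

# Hypothetical equality predicates gathered from cached plans for one table.
equality_predicates = {
    "GetOpenOrders": {"CustomerID", "OrderStatus"},
    "GetOrdersByCustomer": {"CustomerID"},
    "GetOrdersByRegion": {"CustomerID", "RegionID"},
    "NightlyReport": {"RegionID"},
}


def rank_columns_by_reuse(predicates):
    """Return (column, query_count) pairs, most widely reused column first."""
    counts = Counter(col for cols in predicates.values() for col in cols)
    # Sort by reuse (descending), then by name so the order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


print(rank_columns_by_reuse(equality_predicates))
```

The ranking is only a starting point for key column order. It counts queries equally, so weight important or frequently run queries more heavily if that matters for your workload.
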
Index Columns
- What key column order would help the greatest number of queries?
- What other predicates (seek or scan) do the queries filter by?
  - More important to have on the index, even if not a key column
  - Still reducing number of rows

Covering Index Columns
- What other columns are commonly used?
- How many queries use them?
  - Important queries carry more weight
- How many rows do those queries return?
- How often are those queries run?
- How wide are the columns?
- Is the clustered index in memory anyway?

Clustered Index
- “Includes” all columns
- Key field(s) part of every nonclustered index
  - Implied key fields in non-unique NC indexes
  - Implied included columns in unique NC indexes
- Typically the Primary Key
- Typically very narrow key fields
- There are times to make it something else

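The implied columns can be illustrated with a small model of what a nonclustered index effectively carries. This is a simplified sketch, not a description of storage internals. The function name and the column names in the usage lines are hypothetical.

```python
def effective_columns(nc_keys, nc_includes, clustered_keys, unique):
    """Return (key_columns, included_columns) a nonclustered index effectively has.

    Clustered key columns not already in the nonclustered key are added as
    key columns for a non-unique index and as included columns for a unique one.
    """
    keys = list(nc_keys)
    includes = list(nc_includes)
    for col in clustered_keys:
        if col in keys:
            continue
        if unique:
            if col not in includes:
                includes.append(col)
        else:
            keys.append(col)
    # A column promoted to the key does not also need to be listed as included.
    includes = [col for col in includes if col not in keys]
    return keys, includes


# Hypothetical table clustered on OrderID, with an index on CustomerID.
print(effective_columns(["CustomerID"], ["OrderDate"], ["OrderID"], unique=False))
print(effective_columns(["CustomerID"], ["OrderDate"], ["OrderID"], unique=True))
```

This is why a wide clustered key makes every nonclustered index on the table wider as well.
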
Check Compression
- Does the index compress well?
  - Enterprise Edition
- Compression can help performance
  - Reduction of Logical I/O
  - Reduction of Physical I/O
  - Possible net CPU benefit

Example Discussion
Query   Equality Columns   First   Second
1       A, B, D
2
3
4
5       A, B, C, D
6       B, D, E
7       C, D, E
8       F
9       B
10      B, E

Index – First Option
- B, A
- E, D
Index – Second Option
- B, A, D

I doubt I’ll have time to get to this in an hour-long session. The fewer indexes you can get by with, the better it is for you. If the cardinality of B is good enough, you may be able to use a single index to take care of this. You’ll know that queries 7 and 8 will cause scans, but these are also outliers that you may be able to eliminate through tuning, timing (run them off hours), or moving them to a reporting database.
Once you get the key columns down, you can start evaluating the queries that will hit that index. See what other columns they filter by, as these will be more important to have in the index as included columns. Just because a query filters by something doesn’t mean the filter actually eliminates rows in practice, so check for that. Then evaluate what other columns are being used and see if they should be included columns to avoid key lookups. Remember that you don’t always want to eliminate key lookups; they aren’t necessarily bad.

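The comparison between the two options can be checked mechanically. The sketch below uses the equality columns listed for the example queries and treats a query as able to seek on an index only if it has an equality predicate on the leading key column. Queries 2 through 4 have no columns listed here, so they are left out. The function names are illustrative, and the model ignores inequality predicates and residual filters.

```python
# Equality columns per query from the example (queries 2-4 are not listed).
queries = {
    1: {"A", "B", "D"},
    5: {"A", "B", "C", "D"},
    6: {"B", "D", "E"},
    7: {"C", "D", "E"},
    8: {"F"},
    9: {"B"},
    10: {"B", "E"},
}


def seek_depth(index_keys, equality_cols):
    """Count leading key columns matched by equality predicates (0 = no seek)."""
    depth = 0
    for col in index_keys:
        if col not in equality_cols:
            break
        depth += 1
    return depth


def scans(indexes):
    """Queries that cannot seek on any of the given indexes."""
    return [q for q, cols in queries.items()
            if all(seek_depth(keys, cols) == 0 for keys in indexes)]


first_option = [("B", "A"), ("E", "D")]
second_option = [("B", "A", "D")]

print("First option scans: ", scans(first_option))
print("Second option scans:", scans(second_option))
```

Under this simplified model the single index in the second option leaves queries 7 and 8 scanning, while the two indexes in the first option leave only query 8. That is the trade-off between fewer indexes and fewer scans.
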
Indexed Views / Filtered Indexes
Both require all connections to have:
Set Option                 Value
ANSI_NULLS                 ON
ANSI_PADDING               ON
ANSI_WARNINGS              ON
ARITHABORT                 ON
CONCAT_NULL_YIELDS_NULL    ON
NUMERIC_ROUNDABORT         OFF
QUOTED_IDENTIFIER          ON

Indexed Views
- Additional costs
  - Changes to all tables involved can cause updates
- Additional benefits
  - Can preaggregate data
- Typically only used on static tables
  - Great in DW where tables are updated daily
    - Drop index, run load, create index
- Need to specify indexed view name in Standard Edition

Filtered Index
- Index with a WHERE clause
- Query’s WHERE clause must match
  - Need to test each query to make sure it works
  - Useless on queries that aren’t filtered
  - Consistency in querying is needed
- Useful if many queries filter out most records
  - WHERE OrderStatus <> 'Closed'

Indexing Strategy
Steve Hood
SimpleSQLServer.com
Steve@SimpleSQLServer.com
@SteveHoodSQL