Module 5 Planning for SQL Server® 2008 R2 Indexing
Module Overview Core Indexing Concepts Data Types and Indexes Single Column and Composite Indexes
Lesson 1: Core Indexing Concepts How SQL Server Accesses Data The Need for Indexes Index Structures Selectivity, Density and Index Depth Index Fragmentation Demonstration 1A: Viewing Index Fragmentation
How SQL Server Accesses Data Index SQL Server reads all table pages SQL Server uses index pages to find rows Table Scan
The Need for Indexes ANSI SQL does not mention indexes Generally considered to be external to the logical data model All queries can be executed without indexes Primary reason for indexes is performance Some constraints are implemented via indexes Indexes are used to make constraints efficient but in theory could be implemented in other ways Analogy: Physical library holding books Index by author is useful Additional indexes would also be useful
Index Structures Indexes are commonly based on tree structures Not just a binary tree as nodes can have more than two children Top node is called the root node Bottom level nodes are called leaf nodes
Selectivity, Density and Index Depth Three core concepts when working with indexes Selectivity A measure of how many rows are returned compared to the total number of rows High selectivity means a small number of rows when related to the total number of rows Density A measure of the lack of uniqueness of data in the table High density indicates a large number of duplicates Index Depth Number of levels within the index Common misconception that indexes are deep
Index Fragmentation How does fragmentation occur? SQL Server reorganizes index pages when data modifications cause index pages to split SQL Server Management Studio – Index Properties System function - sys.dm_db_index_physical_stats Detecting fragmentation Internal – pages are not full External – pages are out of logical sequence Types of fragmentation:
Demonstration 1A: Viewing Index Fragmentation In this demonstration, you will see how to identify fragmented indexes
Lesson 2: Data Types and Indexes Numeric Index Data Character Index Data Date-Related Index Data GUID Index Data BIT Index Data Indexing Computed Columns
Numeric Index Data Indexes with numeric keys work efficiently Many values fit in a small number of index pages Sorts and comparisons are quick Exact numeric types are most efficient Integer types are the most efficient INT and BIGINT commonly used Approximate data types (float and real) much less efficient
Character Index Data Character data types are much less efficient when used in index keys Character values tend to be much larger than numeric values Even short character values are slow to compare unless binary comparisons are being made Most SQL Server applications use collations other than binary Rules for collations need to be applied whenever comparisons are made
Date-Related Index Data Date data types are generally good candidates for index keys Very commonly used in business applications Only slightly less efficient than integer data Size of the data will be important date more efficient than datetime
GUID Index Data Becoming very common in new business applications Moderate efficiency Size is 128 bits or 16 bytes Comparison performance is reasonable Problems arise Randomness of generation causes fragmentation problems Very common problem in many current applications
BIT Index Data BIT columns have only two possible values BIT columns are efficient as index keys Common misconception that BIT columns are not useful in indexes Many valid usage scenarios exist Filtered indexes often useful with BIT column indexes
Indexing Computed Columns Computed columns are based on expressions Values are typically derived from other columns Indexing the computed value rather than any underlying values can be useful Can assist with improving performance on poorly-designed databases Example: a column that is used to hold values that should have been held in separate columns Persisted computed columns Avoid the overhead of computing values each time they are selected Are computed during INSERT and UPDATE statements
Lesson 3: Single Column and Composite Indexes Single Column vs. Composite Indexes Ascending vs. Descending Indexes Index Statistics Demonstration 3A: Viewing Index Statistics
Single Column vs. Composite Indexes Indexes are not always constructed on a single column Multi-column indexes are called "composite" indexes Composite indexes are often useful Tend to be more useful than single column indexes in most typical business applications Having an index sorted first by customer then by order date makes it very easy to find orders for a particular customer on a particular date. A query might involve multiple search predicates. Two columns together might be selective while neither is selective on its own. Index on A,B is not the same as an index on B,A Typically index the most restrictive column first
Ascending vs. Descending Indexes Indexes could be constructed in ascending or descending order In general, for single column indexes, both are equally useful Each layer of a SQL Server index is double-linked (ie: linked in both directions) SQL Server can start at either end and work towards the other end Each component of a composite index can be ascending or descending Might be very useful for avoiding sort operations
Index Statistics SQL Server needs to have knowledge of the layout of the data in a table or index before it optimizes and executes queries Needs to create a reasonable plan for executing the query Important to know the usefulness of each index Selectivity is the most important metric By default, SQL Server automatically creates statistics on indexes Can be disabled Recommendation is to leave auto-creation and auto-update enabled
Demonstration 3A: Viewing Index Statistics In this demonstration, you will see how to work with index statistics
Exercise 1: Explore existing index statistics Challenge Exercise 2: Design column orders for indexes (Only if time permits) Lab 5: Planning for SQL Server Indexing Logon information Estimated time: 45 minutes
Lab Scenario You have been asked to explain the concept of index statistics and selectivity to a new developer. You will explore the statistics available on an existing index and determine how selective some sample queries would be. One of the company developers has provided you with a list of the most important queries that will be executed by the new marketing management system. Depending upon how much time you have available, you need to determine the best column orders for indexes to support each query. Complete as many as possible within the allocated time. In later modules, you will consider how these indexes would be implemented. Each query is to be considered in isolation in this exercise.
Lab Review Which types of queries would most likely lead to widely- differing query plans? If you have an equality predicate and a LIKE predicate in your most important query, which predicate would you try to satisfy as the first column of a composite index?
Module Review and Takeaways Review Questions Best Practices