Introduction to Databases Chapter 8: Improving Data Access
Performance Roadblocks Improving database access: –Goals: Easier access of data Improve performance during data access Performance bottlenecks: –Hardware platform Because DBMSs are resource-intensive, hardware issues are common source of performance problems Database server issues Network issues
Performance Roadblocks Performance bottlenecks (cont.): –Database Performance during data reads and writes directly impacts application performance Normalization can impair read performance Indexes can impair write performance –Application Poorly written, inefficient application can result in poor performance even if hardware and database server are working at optimum levels
Hardware Performance: Disk Drive Issues Disk access –How quickly hard disk can process read and write requests –Time required to locate and transfer data Disk queue Disk read and write statistics
Hardware Performance: Disk Drive Issues Solutions: –Upgrade hardware More drives Faster drives RAID –Reduce load Split load between multiple drives Increase memory and cache Use dedicated server
Hardware Performance: Memory Issues Most DBMSs set aside memory to cache data; some may cache queries and executables To solve memory issues: –Reduce load on system memory –Increase system memory Virtual memory paging file: –Used by modern operating systems to increase available memory –Paging increases with insufficient system memory and may interfere with other disk operations
Cache Memory
Virtual Memory
Hardware Performance: Processor Issues Programs, queries, modules all put loads on processor To determine if processor is issue, check processor free and busy time Solutions: –Reduce processor load –Install faster processor –Install multiple processors Processor affinity: –Configurable option for many DBMSs (e.g. SQL Server) –Controls how, or if, database server makes use of multiple processors
Performance Monitoring Tools DBMS specific utilities include: –SQL Server Profiler: Captures database activity for analysis –SQL Server Database Tuning Engine: Analyzes system activity and reports on index use –Execution plan: Identifies how query processor resolves query, specific steps involved and resources required; showing which indexes are used and how they are used by query
Performance Monitoring Tools Operating system tools: –Windows System/Performance Monitor: Provides real-time performance information –Hardware Properties dialog boxes Configure hardware devices
Windows System Monitor System Monitor (previously called Performance Monitor): –Used to collect and log performance data Performance data logs: –Used to give performance baseline –Allow detailed analysis Performance counters: Monitor specific system activity Alerts: Monitor one or more performance counters and send notification when counter reaches specified threshold value SQL Server: Installs large set of SQL Server- specific performance objects and counters
Windows System Monitor
Performance Counters
Working with Indexes Clustered indexes: –Impose physical organization on table –Sorting table rows in index order DBMSs that support clustered indexes have limit of one clustered index
Working with Indexes Nonclustered indexes: –Provide logical organization –Organizing data according to key columns –Do not affect physical table order Number of nonclustered indexes allowed is DBMS-specific
Designing Indexes Considerations in designing indexes include: – Total number of indexes – Defining key columns (columns that set index sort order) – Unnecessary indexing Small tables Columns with low selectivity (few unique values) – Disk space and write performance
Designing Indexes Columns that benefit from indexing include: – Primary key columns – Foreign key columns – Columns commonly referenced in queries WHERE clause search logic GROUP BY or ORDER BY clauses SELECT column lists
Creating Indexes CREATE INDEX syntax (SQL Server Transact-SQL): CREATE [CLUSTERED] | [NONCLUSTERED] [UNIQUE] INDEX name ON object (column_list) INCLUDE (column_list) Object: Typically table, but in some DBMSs, may be view
Creating Indexes Additional options are available for managing free space, user access, etc. ALTER INDEX: –To modify index –You cannot use ALTER INDEX to change index key and nonkey columns DROP INDEX: –To delete index
How Queries Use Indexes When SQL query optimizer is presented with new SELECT statement to evaluate, it seeks out information in database metadata about tables named in FROM clause: –Which columns are used as index keys –Which columns have unique values –How many rows each table has
How Queries Use Indexes Query optimizer uses information about tables, together with various components of SELECT statement itself, to find most efficient way to retrieve data required by query and satisfy any joins When extensive join operation can’t be avoided, query optimizer selects one of several join algorithms, such as: –Nested loop join –Merge-scan join
Working with Views Views: –Provide access to selected data –Used to: Restrict access to underlying base objects Simplify security management Join tables to provide easy access to denormalized data –Can have one or more base tables –May be based on other views –Syntax: CREATE VIEW name AS select_statement
Working with Views To manage views: –ALTER VIEW: Modifies view definition –DROP VIEW: Deletes view Can be specified as destination when running INSERT or UPDATE, with restrictions Indexed view: –View with clustered index –View result is persisted through index structure
Working with Views Example: Creating view for New York customers CREATE VIEW v_ny_cust AS SELECT * FROM CUSTOMERS WHERE HQCITY = ‘New York’ To retrieve data from view: SELECT * FROM v_ny_cust
Programmable Objects Custom database objects programmed by developer Based on: –SQL language commands and variables –Control statements (if supported by DBMS) Procedures and functions: Most common programmable objects
Programmable Objects Procedure: –Set of executable statements that supports input and output parameters Function: –Similar to system functions –Accepts input parameters and returns specific type of value, either scalar or table value Trigger: –Specialized procedure associated with specific objects and events
Procedures Simplify and speed performance of periodic or complicated activities Ensures activities are performed consistently Aid system security: –Placing level of isolation between users and database May be: –DBMS defined system procedures (system stored procedures) E.g. sp_helpdb (SQL Server) returns information about specified database or all system databases –Custom, user stored
Procedures: sp_helpdb
Creating Procedures Syntax: CREATE PROC[EDURE] procedure name [parameter_list] AS Sql_statements Defining parameters: –Must supply parameter name and data type –Can specify default values –Must specify output parameter for procedure to return value
Creating Procedures Example: CHAR(3) INT = ‘137’ OUTPUT SELECT ‘The count for salesperson’ + ‘is’ + AS CHAR(2)
Functions User defined functions (UDFs): –Return value (unlike procedures) –Used instead of procedures when returned value needed Three types of functions (SQL Server 2005) –Scalar function: Returns scalar value of specified type –In-line table-valued function: Returns table generated as result of single SELECT statement (similar to view) –Multistatement table-valued function: Returns table generated from multiple Transact-SQL statements
Creating Functions Transact-SQL CREATE FUNCTION command: Slightly different syntax versions for function type For scalar function: CREATE FUNCTION name ([parameter_list]) RETURNS data_type [AS] BEGIN Sql_statements RETURN scalar_value END
Creating Functions Example scalar function: CREATE FUNCTION fn_CountCust CHAR(3)) RETURNS INT AS BEGIN INT = (SELECT COUNT(*) FROM CUSTOMER WHERE END To run: SELECT dbo.fn_CountCust(‘137’)
Creating Functions Syntax for in-line table-valued function CREATE FUNCTION name ([parameter_list]) RETURNS TABLE [AS] RETURN (select_statement)
Creating Functions Example in-line table-valued function : CREATE FUNCTION fn_GetCust () RETURNS TABLE RETURN (SELECT SPNAME, CUSTNAME, HQCITY FROM SALESPERSON JOIN CUSTOMER ON (CUSTOMER.SPNUM = SALESPERSON.SPNUM)) To run: SELECT * FROM dbo.fn_GetCust
Creating Functions Syntax for multi-statement table-valued function CREATE FUNCTION name ([parameter_list]) TABLE (table_definition) [AS] BEGIN Sql_statements RETURN END
Summary Performance bottlenecks may be caused by hardware platform issues, database issues, and application issues. Hardware issues (disk drives, memory, processors) may be resolved by upgrading component or reducing load on component Performance monitoring tools include DBMS specific utilities (SQL Server Profiler and Database Tuning engine) and operating system utilities (Windows Performance Monitor, hardware Properties dialog boxes) Index performance considerations: Effects on disk space and write performance. Columns that benefit from indexing: Primary key columns, foreign key columns, and columns commonly referenced in queries Views provide restricted access to underlying base objects Procedures and functions: Speed and simplify periodic or complicated activities; aid system security by placing level of isolation between users and database.
Key Terms Alert Bottleneck Cache Dedicated server Disk queue Function Indexed view Input parameter Key column Merged scan join Nested loop join Nested procedures Output parameter Paging file Performance baseline Performance counter Performance object Procedure Processor affinity Schema binding System stored procedures Threshold value
Copyright Notice Copyright 2008 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in section 117 of the 1976 United States Copyright Act without express permission of the copyright owner is unlawful. Requests for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages caused by the use of these programs or from the use of the information herein.