MAKE YOUR QUERIES FASTER Optimization Optimization MAKE YOUR QUERIES FASTER MATT MARTIN SQL Saturday Atlanta June 2017
About Enjoy Big Data projects and query tuning 9 years experience MATT MARTIN | CHIEF TECHNOLOGY OFFICER | SQLETL Twitter: @sqletl_matt Website: www.sqletl.com Email: matt@sqletl.com 9 years experience SQL Server, SSIS, EDIS, and Big Data Enjoy Big Data projects and query tuning
Agenda Mega Chainsaws Corp. Cool Stuff this course will teach you Company Sales Database Overview Tables involved in Sales Query Identify bottlenecks Restructure query to run faster Q & A Cool Stuff this course will teach you How to generate random data of any size/type How to size all tables in a database How to read an execution plan and identify opportunities for performance gains
1. Mega Chainsaws Corp Time to Generate some data USE MEGA_CHAINSAWS GO exec USP_LOAD_DUMMY_DATA_TO_DB
1. Mega Chainsaws Corp Tip 1: Know your data size USE MEGA_CHAINSAWS Table Name Data Description CUST_ID Customer information (names) DPT Departments INVC_DTL Invoice line detail INVC_HDR Invoice header INVC_STAT_CD Invoice status PO_DTL Purchase order detail PO_HDR Purchase order header PO_STAT_CD PO status code PO_SUB_STAT_CD PO sub status code SKU Product sku/description/retail SKU_STAT_CD SKU status (active/clearance/inactive) SLS_REP_ID Sales Rep name SLS_REP_STAT_CD Employee Status Code (Active/Term) Tip 1: Know your data size USE MEGA_CHAINSAWS GO exec usp_get_tbl_sizes
2. Examine Slow Query Over a dozen table joins Clustered Indexes on larger tables do not align (PO_DTL and INVC_DTL) LIKE statement in the WHERE clause multiple times, combined with OR
2. Bottlenecks to look for SELECTS in the WHERE clause Range Joins Like Operators Non-covering indexes Clustered Indexes not aligning and forcing a hash join on large tables AND IHDR.INVC_DT = (SELECT MAX(INVC_DT) FROM INVC_HDR WHERE INVC_ID = IHDR.INVC_ID) AND IHDR.INVC_DT BETWEEN POH.ORD_DT AND DATEADD(DAY,90, POH.ORD_DT) OR lower(C.CUST_NM) LIKE '%alvarez%'
3. Current Execution Plan Biggest Performance Killers Hash Matches Full Table Scans (no index)
3. Optimize this Query Break your query into smaller chunks Leverage SELECT INTO statements (They are fast) Use temp tables to pre-process data
Walk through new Stored Proc 3. Optimize this Query Walk through new Stored Proc
4. Take-aways Do not fall in the trap of optimization to death. Is it really worth an extra 3-4 hours of work to shave off another 5 seconds on a query? Know your data size. Small tables are not worth your time to optimize queries against. Odds are it will run a full table scan anyways due to diminished returns on optimization Check your clustered indexes for header/detail relational tables. If you can leverage an Identity seed as the clustered index for the header and detail, almost always do it. Why? - The optimizer has to compare 1 column to merge the two tables against and its an integer so it will be fast
Optimization Optimization MATT MARTIN CTO | SQLETL Twitter: @sqletl_matt Website: www.sqletl.com Email: matt@sqletl.com