Numbers, We don’t need no stinkin’ numbers Adam Backman Vice President DBAppraise, LLC
About the Presenter Progress user from when dinosaurs roamed the earth (nearly) President - White Star Software – Consulting: performance, coding, problem solving – Training: Programming, System and Database Administration Vice President – DBAppraise – Managed database services
Agenda Why is performance important? Components of performance Perception vs. reality Who is most important?
Why is performance important? Time is literally money Many idle hands cost real money A delayed customer is a lost customer Delayed support equals lost confidence
Put a value on performance Users wait 10 seconds Does not sound bad Users do the operation many times per day 10 seconds per transaction, 10 per hour, 8 hours a day. 800 seconds wasted per user per day. About 13 minutes wasted per user per day times the number of users (500 users) That is over 100 hours of wasted time per day
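The arithmetic on this slide can be checked in a few lines. The figures are the slide's own; the function name is only for illustration.

```python
def wasted_hours_per_day(wait_seconds, per_hour, hours_per_day, users):
    """Return total hours lost per day across all users."""
    seconds_per_user = wait_seconds * per_hour * hours_per_day
    return seconds_per_user * users / 3600


# Slide figures: 10 s wait, 10 times an hour, 8-hour day, 500 users.
per_user_seconds = 10 * 10 * 8            # 800 seconds per user per day
per_user_minutes = per_user_seconds / 60  # a little over 13 minutes
total_hours = wasted_hours_per_day(10, 10, 8, 500)
print(f"{per_user_minutes:.1f} min per user, {total_hours:.0f} hours in total")
```

Changing any one input (a 5-second wait, 200 users) shows quickly how the cost scales.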
Components of performance Network Disk Memory CPU Goal: Push the bottleneck to the fastest resource
Network Slowest resource Temp files going to network drive Need to minimize traffic – -Mm (Remember to increase frame size everywhere when increasing -Mm) – -Mn, -Mpb, -Mi, -Ma
Disk Most frequent offender People focus on wrong metrics Queue depth and service time are generally good indicators of congestion
Memory Move things off disk into memory -B (DB to shared memory) -Bt (temp disk files to temp buffers) OS and Disk array caches
CPU The “right” type of CPU activity – User – what you paid for – System – System overhead – Wait – Waiting on I/O (What type of I/O) – Idle – You need idle but having zero does not mean there is an issue
Numbers are good but … Performance stinks Performance is perception User experience is king
First, look for record locks or other application issues ProTop has a screen for blocked sessions Record locks can completely stop activity The user sees “record in use by …”. The administrator does not Additionally, I look for very high db requests from a single connection
ProTop: Blocked Sessions Blocked Sessions Usr Name Note tom REC XQH 102 [Order] Adam
Promon: Block Access Block Access: Type Usr Name DB Requests DB Reads BI Reads \Writes \Writes Acc 999 TOTAL Acc 0 adam Acc 5 adam Acc 6 adam Acc 7 dbapprai
Buffer Hit Percentage Generally a good metric But … – A single small-table scan can vastly skew results – Low volume buffer hit percentage is nearly meaningless
How to Make Buffer Hit Rate Useful Know which tables are being read – Large tables – Small tables Know what is “normal”
How to Make Buffer Hit Rate Less Useful Bring up promon activity screen and only use the first sample Use really small sample sizes (seconds vs. minutes) Use really large sample sizes (hours vs. minutes)
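A minimal sketch of why a single cumulative sample misleads, assuming the first sample reflects totals since startup. The formula (logical reads minus OS reads, over logical reads) is the usual definition of a hit percentage; the counter values and function names are made up for illustration.

```python
def hit_pct(logical_reads, os_reads):
    """Buffer hit percentage: share of logical reads served without an OS read."""
    if logical_reads == 0:
        return None  # no activity, so no meaningful percentage
    return 100.0 * (logical_reads - os_reads) / logical_reads


def interval_hit_pct(earlier, later):
    """Hit percentage for the interval between two cumulative samples."""
    return hit_pct(later[0] - earlier[0], later[1] - earlier[1])


# Hypothetical cumulative counters: (logical reads, OS reads).
at_0900 = (50_000_000, 500_000)
at_0910 = (50_400_000, 620_000)

since_startup = hit_pct(*at_0910)
last_10_min = interval_hit_pct(at_0900, at_0910)
print(f"since startup: {since_startup:.1f}%  last 10 minutes: {last_10_min:.1f}%")
```

With these made-up numbers the cumulative figure looks healthy while the last ten minutes do not, which is why the sample window matters.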
Benchmarks Lie Do not test real-world – All read (Readprobe) – All write (ATM) – Wrong mix of read and write Time slicing can make results more attractive
CPU – Wait The CPU always blames everyone else If you have wait and idle it is generally no issue If you have wait and no idle you likely have an issue. Look at disk first
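The rule of thumb above can be written as a small check. This is only a sketch: the 5% cutoff for what counts as "no" wait or "no" idle is an assumption, not something from the slide, and the function name is hypothetical.

```python
def assess_cpu(wait_pct, idle_pct, threshold=5.0):
    """Apply the slide's rule of thumb to one CPU sample.

    threshold is an assumed cutoff (in percent) below which a value
    counts as "none"; tune it for your own systems.
    """
    has_wait = wait_pct >= threshold
    has_idle = idle_pct >= threshold
    if has_wait and not has_idle:
        return "likely an issue: look at disk first"
    if has_wait and has_idle:
        return "generally no issue"
    return "no significant wait"


print(assess_cpu(wait_pct=30, idle_pct=40))
print(assess_cpu(wait_pct=30, idle_pct=0))
```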
CPU - Idle If you have a single core then a single program can use 100% of the CPU This is a good thing. The process will use its CPU and complete
The network is never more than 10% busy Every network admin in the world uses this line They get this from the manufacturers They sample and provide a single sample for a large time frame. How about 100% busy 10% of the time
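A short illustration of how one averaged number hides saturation. The per-minute utilization samples are invented, and the 90% cutoff for "saturated" is an assumption.

```python
def utilization_summary(samples, saturated_at=90):
    """Return (average, peak, share of samples at or above saturated_at)."""
    average = sum(samples) / len(samples)
    peak = max(samples)
    saturated = sum(1 for s in samples if s >= saturated_at) / len(samples)
    return average, peak, saturated


# Hypothetical per-minute utilization (%): flat out 1 minute in 10, idle otherwise.
per_minute = [100, 0, 0, 0, 0, 0, 0, 0, 0, 0] * 6

average, peak, saturated = utilization_summary(per_minute)
print(f"average {average:.0f}%, peak {peak}%, saturated {saturated:.0%} of the time")
```

The hourly average comes out at 10% even though users were blocked for six full minutes.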
Setting -spin based on a calculation Gus said that it should be … Unless Gus is at your site any calculation is wrong Gus said this some time ago and was misquoted at that time Generally stated as # * CPUs This is nearly always wrong (you could get lucky by accident)
Percentage full on extents Is it 99% full or 327% full? Important to look at allocated (actual growth) versus percentage fill of the last extent Hint: it never shows 100% as it preallocates space for future extends of the area
Now we know how people lie but how do I determine if our performance is acceptable?
Ask the users
Method: Measuring Performance Determine your 5-10 most time-critical portions of the application Time them in isolation Time them during the day when everyone says performance is OK. They will never say it’s good. These timings should be close if not exactly the same
Method: Determine importance Customer visible Done many (thousands+) times a day Users “wait” for screen/output
Timings Need not be exact Wrist watch or cell phone timer is fine Keep track of these timings When people complain about performance redo the timings
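One way to keep track of the timings and redo the comparison when complaints come in. This is a sketch only: the operation names, the timings, and the 25% allowance for stopwatch error are all assumptions.

```python
def compare_timings(baseline, current, tolerance=0.25):
    """Return operations whose current time exceeds baseline by more than tolerance.

    tolerance is an assumed allowance (25%) for hand-timing error.
    """
    slow = {}
    for name, base_seconds in baseline.items():
        now_seconds = current.get(name)
        if now_seconds is not None and now_seconds > base_seconds * (1 + tolerance):
            slow[name] = (base_seconds, now_seconds)
    return slow


# Hypothetical operations and stopwatch timings in seconds.
baseline = {"order entry": 2.0, "customer lookup": 1.0, "invoice print": 6.0}
during_complaint = {"order entry": 2.2, "customer lookup": 4.5, "invoice print": 6.5}

for name, (before, after) in compare_timings(baseline, during_complaint).items():
    print(f"{name}: {before}s normally, {after}s now")
```

An empty result means the timings still match what was measured when performance was "OK".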
If the timings are bad Look for bottlenecks – Network – Disk – Memory – CPU It will likely be one of the first two, solved by using more of the second two
If the timings are good Smack the users around for wasting your time or Reevaluate timings, no really just smack the users
Conclusion Performance is perception – Reason for “working …” Focus on user experience Know what is normal – In stored statistics – In response times
Still more Conclusion Know what is important – Customer facing Benchmarks lie Buffer Hit Rate – You can make it whatever you want – Need to understand how to make it useful
Questions? Adam Backman
Thank you for your time!