
Performance.


1 Performance

2 What Is Performance?
Fast!? “Performance” is meaningless in isolation.
It only makes sense to talk about the performance of a particular workload, measured against a particular set of metrics:
- Queries per second for a web server
- Response latency for a DNS server
- Deliveries per second for a mail server

3 Computer Components

4 Factors That Affect Performance
Four major resources:
- CPU
- Memory
- Hard disk I/O bandwidth
- Network I/O bandwidth
Where is the real bottleneck? Usually not the CPU; hard disk bandwidth it is!
Memory size has a major influence on performance: when memory runs short, the system swaps, so memory and disk bandwidth are the major suspects.

5 What is your system doing?
- CPU use
- Disk I/O
- Network I/O
- Application (mis-)configuration
- Hardware limitations
- System calls and interaction with the kernel
- Multithreaded lock contention
Typically one or more of these elements will be the limiting factor in the performance of your workload.
The monitoring interval should not be too small.

6 Analyzing CPU usage – (1)
Three kinds of CPU information:
- Load average
- Overall utilization
  - Helps identify whether the CPU is the system bottleneck
- Per-process consumption
  - Identifies a specific process's CPU utilization

7 Analyzing CPU usage – (2)
Load average
- The average number of runnable processes
- Includes processes waiting for disk or network I/O
uptime(1)
- Shows how long the system has been running and the load average over the last 1, 5, and 15 minutes

$ uptime
2:34PM up 500 days, 57 mins, 1 user, load averages: 0.05, 0.14, 0.16

8 Analyzing CPU usage – (3)
vmstat(8) – overall utilization
procs:
- r: processes in the run queue
- b: processes blocked waiting for resources
- w: runnable processes (swapped out)
cpu:
- us (user time): high when doing computation
- sy (system time): high when processes make many system calls or perform I/O
- id: CPU idle
faults (averaged per second over the last 5 seconds):
- in: device interrupts per interval
- sy: system calls per interval
- cs: CPU context-switch rate

$ vmstat -c 2 -w 5
procs   memory   page                disks   faults      cpu
r b w   avm fre  flt re pi po fr sr  ad6 ad  in sy cs    us sy id
(numeric sample rows lost in transcription)

9 Analyzing CPU usage – (4)
Examples
Idle server:
$ vmstat -c2 -w5
(sample output lost in transcription)
Busy server:
$ vmstat -c5 -w5
(sample output lost in transcription)

10 Analyzing CPU usage – (5)
Per-process consumption
- top(1): display and periodically update information about the top CPU-consuming processes
- ps(1): show process status

11 Analyzing CPU usage – (6)
top(1) – a realtime overview of what the CPU is doing
- Paging to/from swap: the performance kiss of death
- Spending lots of time in the kernel, or processing interrupts
- Which processes/threads are using CPU, and what they are doing inside the kernel, e.g.:
  - biord/biowr/wdrain: disk I/O
  - sbwait: waiting for socket input
  - ucond/umtx: waiting on an application thread lock
  - Many more, documented only in the source code
- Context switches
  - Voluntary: process blocks waiting for a resource
  - Involuntary: kernel decides the process should stop running for now

12 Analyzing CPU usage – (7)
last pid: 68356; load averages: 0.15, 0.11, up 42+03:58:29 14:48:49
75 processes: 2 running, 73 sleeping
CPU: 5.5% user, 0.0% nice, 0.2% system, 0.2% interrupt, 94.1% idle
Mem: 435M Active, 852M Inact, 549M Wired, 18M Cache, 210M Buf, 126M Free
Swap: 2048M Total, 45M Used, 2003M Free, 2% Inuse
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
(per-process rows lost in transcription)
-H: show threads
-S: show kernel (system) processes
-a: show process titles
Top summary area: processes, CPU, Mem, Swap
THR: number of threads
STATE: process state

13 Analyzing CPU usage – (8)
last pid: 68553; load averages: 0.39, 0.23, up 42+04:08:33 14:58:53
74 processes: 1 running, 73 sleeping
CPU: 4.9% user, 0.0% nice, 0.5% system, 0.1% interrupt, 94.6% idle
Mem: 417M Active, 851M Inact, 553M Wired, 18M Cache, 210M Buf, 142M Free
Swap: 2048M Total, 45M Used, 2003M Free, 2% Inuse
PID USERNAME VCSW IVCSW READ WRITE FAULT TOTAL PERCENT COMMAND
(per-process rows lost in transcription)
-m: switch between CPU and I/O display modes

14 Analyzing memory usage – (1)
When memory is not enough…
- Memory pages have to be swapped out to disk blocks
  - LRU (Least Recently Used) algorithm
- Bad situation – “desperation swapping”
  - The kernel forcibly swaps out runnable processes
  - Extreme memory shortage
Two numbers that quantify memory activity:
- Total amount of active virtual memory: the total demand for memory
- Paging rate: suggests the proportion of actively used memory

15 Analyzing memory usage – (2)
To see the amount of swap space in use: pstat -s or swapinfo (FreeBSD)
pstat(8):
$ pstat -s
Device          K-blocks   Used   Avail   Capacity
(numeric sample rows lost in transcription)

16 Analyzing memory usage – (3)
vmstat memory fields:
- avm: active virtual pages
- fre: size of the free list
page fields (averaged over each 5-second interval, given per second):
- flt: total number of page faults
- re: page reclaims
- pi: pages paged in
- po: pages paged out
  - 50 page-outs cause roughly 1 second of latency
- fr: pages freed per second
- sr: pages scanned by the clock algorithm, per second

$ vmstat -c 2 -w 5
(sample output lost in transcription)

17 Analyzing memory usage – (4)
last pid: 68356; load averages: 0.15, 0.11, up 42+03:58:29 14:48:49
75 processes: 2 running, 73 sleeping
CPU: 5.5% user, 0.0% nice, 0.2% system, 0.2% interrupt, 94.1% idle
Mem: 435M Active, 852M Inact, 549M Wired, 18M Cache, 210M Buf, 126M Free
Swap: 2048M Total, 45M Used, 2003M Free, 2% Inuse
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
(per-process rows lost in transcription)
SIZE: total address-space size of the process
RES: resident memory (actually in RAM)

18 Analyzing disk I/O – (1)
For disk-intensive workloads, performance is limited by bandwidth or by latency (the response time of an I/O operation).
- Random reads/writes require the disk to seek constantly, limiting throughput
- Sequential I/O is limited by the transfer rate of the disk and controller

19 Analyzing disk I/O – (2) iostat(8) Report I/O statistics
tty fields:
- tin/tout: characters read from / written to the terminal
disk fields:
- KB/t: kilobytes per transfer
- tps: transfers per second
- MB/s: megabytes per second
cpu fields:
- ni: % of CPU time in user mode running niced processes
- in: % of CPU time in interrupt mode

$ iostat -c 5 -w 5
      tty            da             da        cpu
tin tout   KB/t tps MB/s   KB/t tps MB/s   us ni sy in id
(numeric sample rows lost in transcription)

20 Analyzing disk I/O – (3) gstat(8)
High latency is the most obvious sign of an overloaded disk.

dT: 0.017s w: 1.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
(per-device rows lost in transcription; devices included acd0, da0, da1, their slices and partitions, and mirror/gm0s1a)

- L(q): queued operations
- ops/s: operations per second
- r/s: reads per second
- kBps: throughput
- ms/r: read latency

21 Analyzing disk I/O – (4)
Per-process I/O statistics in top (not currently supported by ZFS):

last pid: 68553; load averages: 0.39, 0.23, up 42+04:08:33 14:58:53
74 processes: 1 running, 73 sleeping
CPU: 4.9% user, 0.0% nice, 0.5% system, 0.1% interrupt, 94.6% idle
Mem: 417M Active, 851M Inact, 553M Wired, 18M Cache, 210M Buf, 142M Free
Swap: 2048M Total, 45M Used, 2003M Free, 2% Inuse
PID USERNAME VCSW IVCSW READ WRITE FAULT TOTAL PERCENT COMMAND
(per-process rows lost in transcription)

22 Tuning disk performance – (1)
Reduce disk contention:
- Move competing I/O jobs onto independent disks
- Stripe multiple disks with gstripe
For filesystems striped across multiple disks, make sure the filesystem boundary is stripe-aligned.
- E.g. for a 64 KB stripe size, the start of the filesystem should be 64 KB-aligned to avoid splitting I/Os between multiple stripes
Add more/better hardware.

23 Tuning disk performance – (2)
Try to restructure the workload to separate “critical” data from “scratch” data.
- Scratch data can be reconstructed or discarded after a crash, so it can afford fast but less reliable storage options
- mount -o async is fast but unsafe
Go one step further: store temporary data in memory.
- mdconfig -a -t swap -s 4g; mount -o async ...
- Creates a “swap-backed” memory device
- Swap is only used when memory is low; otherwise the data stays in RAM

24 Analyzing network – (1) netstat(1) – show network status
netstat -a
- Also lists listening sockets
netstat -i
- Shows the state of all interfaces
netstat -rn
- Examines the routing table

$ netstat -i
Name Mtu Network Address Ipkts Ierrs Opkts Oerrs Coll
(numeric sample rows lost in transcription; interfaces listed were bge0, bge1*, and lo0)

$ netstat -rn
Routing tables
Internet:
Destination Gateway Flags Refs Use Netif Expire
(route entries lost in transcription)

25 Analyzing network – (2) netstat(1) Show network traffic
netstat -w 3
- Show network traffic, sampled every 3 seconds
netstat -s
- Display system-wide statistics for each network protocol

$ netstat -w 3
        input (Total)          output
packets errs bytes   packets errs bytes colls
(numeric sample rows lost in transcription)

26 Analyzing network – (3) sockstat(1) – list open sockets
USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS
(sample rows lost in transcription; listed sockets belonged to postfix smtp/smtpd/master/qmgr/anvil, vscan perl workers, postgrey, sshd, and others)

27 Tuning network performance
Check packet loss and protocol negotiation.
Socket buffer too small?
- kern.ipc.maxsockbuf – maximum socket buffer size
- setsockopt(…, SO_{RCV,SND}BUF, …)
- net.inet.udp.recvspace – UDP will drop packets if the receive buffer fills
- TCP is largely self-tuning
Check for hardware problems.

28 Display system statistics
systat(1)
systat -display refresh-interval
(sample systat screen lost in transcription)
Available displays: icmp, icmp6, ifstat, iostat, ip, ip6, mbufs, netstat, pigs, swap, tcp, or vmstat.

29 When to throw hardware at the problem
Only once you have determined that a particular hardware resource is your limiting factor.
- More CPU cores will not solve a slow disk
- Adding RAM can reduce the need for some disk I/O: more cached data, less paging from disk
- Adding more CPU cores is not a magic bullet even for CPU-limited jobs
  - Some applications do not scale well
  - High CPU usage can be caused by resource contention, and resource contention will make performance worse!

30 *stat commands $ ls -la {,/usr}{/bin,/sbin}/*stat
-r-xr-xr-x 1 root wheel Sep 1 18:06 /sbin/ipfstat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /sbin/kldstat*
-r-xr-sr-x 1 root kmem  Sep 1 18:06 /usr/bin/btsockstat*
-r-xr-sr-x 1 root kmem  Sep 1 18:06 /usr/bin/fstat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /usr/bin/ministat*
-r-xr-sr-x 1 root kmem  Sep 1 18:06 /usr/bin/netstat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /usr/bin/nfsstat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /usr/bin/procstat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /usr/bin/sockstat*
-r-xr-xr-x 2 root wheel Sep 1 18:06 /usr/bin/stat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /usr/bin/systat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /usr/bin/vmstat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /usr/sbin/gstat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /usr/sbin/ifmcstat*
-r-xr-xr-x 1 root wheel Sep 1 18:06 /usr/sbin/iostat*
-r-xr-xr-x 1 root wheel Sep 1 18:05 /usr/sbin/lockstat*
-r-xr-xr-x 1 root wheel Sep 1 18:07 /usr/sbin/pmcstat*
-r-xr-xr-x 2 root wheel Sep 1 18:07 /usr/sbin/pstat*
(file sizes and two symlink entries lost in transcription)

31 Further Reading
Help! My system is slow!
Further topics:
- Device I/O
- Tracing system calls used by an individual process
- Activity inside the kernel
- Lock profiling
- Sleepqueue profiling
- Hardware performance counters
- Kernel tuning
- Benchmarking techniques
- ministat(1)

