Perf with the Linux Kernel

perf commands
  annotate – annotate source code with profile info
  kmem – kernel memory profiling
  kvm – profile guests
  list – list kinds of events
  lock – analyze lock events
  record – save profile data to a file
  report – display profile report
  stat – gather data while running a command
  timechart – visualize system behavior
  top – system profiling

~/perf-stuff$ perf list | grep Hardware
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  cache-references                                   [Hardware event]
  cache-misses                                       [Hardware event]
  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
  ref-cycles                                         [Hardware event]
Many more events are available when run as root.

User space example:
~/perf-stuff$ perf stat -e instructions date
Wed Feb 18 21:49:03 PST 2015

 Performance counter stats for 'date':

           669,408 instructions
       0.001060121 seconds time elapsed

Subsequent runs gave:
  668,858 instructions, 0.001034210 seconds time elapsed
  667,037 instructions, 0.000995727 seconds time elapsed
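The three instruction counts above are close but not identical, so a quick mean and spread helps judge run-to-run noise. The following is a minimal sketch; the helper names mean_of and spread_pct are hypothetical and are not part of perf. The numbers fed to them are the three counts reported on the slide.

```shell
# Mean of the counts passed as arguments, printed with one decimal place.
# (mean_of is a hypothetical helper, not a perf feature.)
mean_of() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { if (NR > 0) printf "%.1f\n", sum / NR }'
}

# Spread between the largest and smallest count, as a percentage of the mean.
# (spread_pct is likewise hypothetical.)
spread_pct() {
    printf '%s\n' "$@" | awk '
        NR == 1 { min = $1; max = $1 }
        { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
        END { if (NR > 0) printf "%.2f\n", 100 * (max - min) / (sum / NR) }'
}

# The three instruction counts from the runs of 'date' shown above.
mean_of 669408 668858 667037
spread_pct 669408 668858 667037
```

For these three runs the spread is well under one percent of the mean, which suggests the instruction count for 'date' is fairly stable from run to run.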

perf list/stat kernel
sudo perf list
perf calls perf_event_open(2) – kernel/events/core.c

sudo perf stat -e net:netif_rx wget http://tahoot.com/cs172a-lecture7.mp4   (24MB)

 Performance counter stats for 'wget http://tahoot.com/cs172a-lecture7.mp4':

                 2 net:netif_rx
       9.536512491 seconds time elapsed

For stat -e net:netif_receive_skb:

             3,828 net:netif_receive_skb
       3.821882698 seconds time elapsed

int netif_receive_skb(struct sk_buff *skb)
{
        trace_netif_receive_skb_entry(skb);
        return netif_receive_skb_internal(skb);
}

top
Run perf top with sudo.
perf_event_paranoid is not mentioned in ./Documentation
  defined in kernel/sysctl.c
  depends on CONFIG_PERF_EVENTS
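Since perf_event_paranoid governs what perf may do without sudo, it can be useful to check the current value before running perf top. This is a minimal sketch, assuming the sysctl is exposed at /proc/sys/kernel/perf_event_paranoid. The function name paranoid_level and its optional path argument are hypothetical, added only so the function can be pointed at another file.

```shell
# Print the current perf_event_paranoid value, or "unknown" when the file
# cannot be read (for example, if the kernel lacks CONFIG_PERF_EVENTS).
# paranoid_level and its optional path argument are hypothetical.
paranoid_level() {
    file="${1:-/proc/sys/kernel/perf_event_paranoid}"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo unknown
    fi
}

echo "perf_event_paranoid: $(paranoid_level)"
```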

timechart
sudo perf timechart record find /usr -name core >&/dev/null
sudo perf timechart display output.svg

report
sudo perf record wget http://tahoot.com/cs172a-lecture7.mp4
sudo perf report

sudo perf lock record wget http://tahoot.com/cs172a-lecture7.mp4
tracepoint lock:lock_acquire is not enabled. Are CONFIG_LOCKDEP and CONFIG_LOCK_STAT enabled?
Both options are found under kernel hacking in the kernel configuration.
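The error above asks whether two kernel options are enabled, which can be checked against the running kernel's config file. This is a minimal sketch, assuming the distribution installs that file as /boot/config-$(uname -r), which not every system does. The function name check_lock_config is hypothetical.

```shell
# Report whether each option named in the perf error message is set to "y".
# The default config path is an assumption; pass another path as $1 if needed.
# check_lock_config is a hypothetical helper, not part of perf.
check_lock_config() {
    cfg="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        echo "config file not readable: $cfg"
        return 0
    fi
    for opt in CONFIG_LOCKDEP CONFIG_LOCK_STAT; do
        if grep -q "^${opt}=y" "$cfg"; then
            echo "$opt=y"
        else
            echo "$opt is not set"
        fi
    done
}

check_lock_config
```

If either option is reported as not set, the kernel would need to be rebuilt with it enabled before perf lock record can use the lock tracepoints.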

perf kvm
Perf uses CPU performance counters.
These need to be virtualized to properly account for performance in a guest VM.

perf kvm [--host] [--guest] [--guestmount=<path> [--guestkallsyms=<path> --guestmodules=<path> | --guestvmlinux=<path>]] {top|record|report|diff|buildid-list}
perf kvm [--host] [--guest] [--guestkallsyms=<path> --guestmodules=<path> | --guestvmlinux=<path>] {top|record|report|diff|buildid-list}

http://infoscience.epfl.ch/record/162329/files/VEE11_performance_profiling_of_virtual_machines.pdf
http://www.linux-kvm.org/page/Perf_events

Virtualization Overhead
VM layer overhead, Linux guest:
  CPU and Memory: 14.36%
  Network I/O: 24.46%
  Disk I/O: 8.84%
  Disk latency for reading: 2.41 times slower
  Micro-operations execution time: 10.84 times slower
Red Hat reports 85% efficiency in VMs (RHEL6 Virtualization Getting Started Guide)
Reported by: http://petersenna.com/files/peters-top4-virtualization-benchmark-1.28.pdf

sudo perf kmem record wget http://tahoot.com/cs172a-lecture7.mp4
You may lose events...
  Warning: Processed 6080223 events and lost 10 chunks!
  Check IO/CPU overload!

kmem events
sudo perf record -g -e kmem:mm_page_alloc -c 1 wget http://tahoot.com/cs172a-lecture7.mp4
sudo perf annotate -l -P
annotate in kernel
report of looper with dup2()