Where'd all my memory go?
Joshua Miller
SCALE 12x – 22 FEB 2014
The Incomplete Story: Computers have memory, which they use to run applications.
Cruel Reality: swap, caches, buffers, shared, virtual, resident, and more...
Topics
- Memory basics
- Paging, swapping, caches, buffers
- Overcommit
- Filesystem cache
- Kernel caches and buffers
- Shared memory
top is awesome

top - 15:57:33 up 131 days, 8:02, 3 users, load average: 0.00, 0.00, 0.00
Tasks: 129 total, 1 running, 128 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 0.3%sy, 0.3%ni, 99.0%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 3149296k used, 709396k free, 261556k buffers
Swap: 0k total, 0k used, 0k free, 1081832k cached

  PID USER PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 8131 root 30 10  243m  50m 3748 S  0.0  1.3  0:51.97 chef-client
 8153 root 30 10  238m  19m 7840 S  0.0  0.5  1:35.48 sssd_be
 8154 root 30 10  208m  15m  14m S  0.0  0.4  0:08.03 sssd_nss
 7767 root 30 10 50704 8748 1328 S  1.0  0.2  1559:39 munin-asyncd
 7511 root 30 10  140m 7344  580 S  0.0  0.2 13:06.29 munin-node
 3379 root 20  0  192m 4116  652 S  0.0  0.1 48:20.28 snmpd
 7026 root 20  0  113m 3992 3032 S  0.0  0.1  0:00.02 sshd

What it shows:
- Physical memory used and free (the Mem: row)
- Swap used and free (the Swap: row)
- %MEM: each process's RES as a percentage of total memory
- Per-process breakdown of virtual (VIRT), resident (RES), and shared (SHR) memory
- Kernel buffers and caches (no association with swap, despite being on the same row)
/proc/meminfo

[jmiller@meminfo]$ cat /proc/meminfo
MemTotal:       3858692 kB
MemFree:        3445624 kB
Buffers:          19092 kB
Cached:          128288 kB
SwapCached:           0 kB
...

Many useful values, which we'll refer to throughout the presentation.
Overcommit

top - 14:57:44 up 137 days, 7:02, 6 users, load average: 0.03, 0.02, 0.00
Tasks: 141 total, 1 running, 140 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.2%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 3075728k used, 782964k free, 283648k buffers
Swap: 0k total, 0k used, 0k free, 1073320k cached

  PID USER    PR NI  VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
22385 jmiller 20  0 18.6g 572 308 S  0.0  0.0 0:00.00 bloat

4G of physical memory and no swap, so how can "bloat" have 18.6g virtual? Virtual memory is not "physical memory plus swap". A process can request huge amounts of memory, but it isn't mapped to "real memory" until actually referenced.
Linux filesystem caching

Free memory is used to cache filesystem contents. Over time, systems can appear to be out of memory because all of the free memory is used for cache.
top is awesome

Mem: 3858692k total, 3149296k used, 709396k free, 261556k buffers
Swap: 0k total, 0k used, 0k free, 1081832k cached

About 25% of this system's memory (the 1081832k "cached") is page cache.
Linux filesystem caching
- Additions to and removals from the cache are transparent to applications
- Tunable through vm.swappiness
- Can be dropped: echo 1 > /proc/sys/vm/drop_caches
- Under memory pressure, memory is freed automatically* (*usually)
Where'd my memory go?

top - 16:40:53 up 137 days, 8:45, 5 users, load average: 0.88, 0.82, 0.46
Tasks: 138 total, 1 running, 137 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 1549480k used, 2309212k free, 25804k buffers
Swap: 0k total, 0k used, 0k free, 344280k cached

  PID USER PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
28285 root 30 10  238m  17m 6128 S  0.0  0.5  1:39.42 sssd_be
 7767 root 30 10 50704 8732 1312 S  0.0  0.2  1659:37 munin-asyncd
 7511 root 30 10  140m 7344  580 S  0.0  0.2 13:56.68 munin-node
 3379 root 20  0  192m 4116  652 S  0.0  0.1 50:31.44 snmpd

1.5G used - 106MB RSS - 345MB cache - 25MB buffers = ~1GB mystery. What is consuming a GB of memory?
kernel slab cache

The kernel uses free memory for its own caches. Some include:
- dentries (directory cache)
- inodes
- buffers
kernel slab cache

[jmiller@mem-mystery ~]$ slabtop -o -s c
 Active / Total Objects (% used)    : 2461101 / 2468646 (99.7%)
 Active / Total Slabs (% used)      : 259584 / 259586 (100.0%)
 Active / Total Caches (% used)     : 104 / 187 (55.6%)
 Active / Total Size (% used)       : 835570.40K / 836494.74K (99.9%)
 Minimum / Average / Maximum Object : 0.02K / 0.34K / 4096.00K

  OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
624114  624112  99%    1.02K 208038        3    832152K nfs_inode_cache
631680  631656  99%    0.19K  31584       20    126336K dentry
649826  649744  99%    0.06K  11014       59     44056K size-64
494816  494803  99%    0.03K   4418      112     17672K size-32
   186     186 100%   32.12K    186        1     11904K kmem_cache
  4206    4193  99%    0.58K    701        6      2804K inode_cache
  6707    6163  91%    0.20K    353       19      1412K vm_area_struct
  2296    2290  99%    0.55K    328        7      1312K radix_tree_node

1057MB of kernel slab cache.
Where'd my memory go?

1.5G used - 106MB RSS - 345MB cache - 25MB buffers = ~1GB mystery. What is consuming a GB of memory? Answer: the kernel slab cache, at 1057MB.
kernel slab cache
- Additions to and removals from the cache are transparent to applications
- Tunable through /proc/sys/vm/vfs_cache_pressure
- Under memory pressure, memory is freed automatically* (*usually)
kernel slab cache: network buffers example

[jmiller@mem-mystery2 ~]$ slabtop -s c -o
 Active / Total Objects (% used)    : 2953761 / 2971022 (99.4%)
 Active / Total Slabs (% used)      : 413496 / 413496 (100.0%)
 Active / Total Caches (% used)     : 106 / 188 (56.4%)
 Active / Total Size (% used)       : 1633033.85K / 1635633.87K (99.8%)
 Minimum / Average / Maximum Object : 0.02K / 0.55K / 4096.00K

   OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
1270200 1270170  99%    1.00K 317550        4   1270200K size-1024
1269480 1269406  99%    0.25K  84632       15    338528K skbuff_head_cache
 325857  325746  99%    0.06K   5523       59     22092K size-64

~1.5G used, this time for in-use network buffers (SO_RCVBUF).
Unreclaimable slab

[jmiller@mem-mystery2 ~]$ grep -A 2 ^Slab /proc/meminfo
Slab:            1663820 kB
SReclaimable:       9900 kB
SUnreclaim:      1653920 kB

Some slab objects can't be reclaimed, and memory pressure won't automatically free those resources.
Nitpick Accounting

Now we can account for all memory utilization:

[jmiller@postgres ~]$ ./memory_explain.sh
"free" buffers (MB)               : 277
"free" caches (MB)                : 4650
"slabtop" memory (MB)             : 109.699
"ps" resident process memory (MB) : 366.508
"free" used memory (MB)           : 5291
buffers+caches+slab+rss (MB)      : 5403.207
difference (MB)                   : -112.207

But sometimes we're using more memory than we're using?!
And a cache complication...

top - 12:37:01 up 66 days, 23:38, 3 users, load average: 0.08, 0.02, 0.01
Tasks: 188 total, 1 running, 187 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 0.6%sy, 0.0%ni, 98.9%id, 0.1%wa, 0.0%hi, 0.1%si, 0.0%st
Mem: 7673860k total, 6895008k used, 778852k free, 300388k buffers
Swap: 0k total, 0k used, 0k free, 6179780k cached

  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
 2189 postgres 20  0 5313m 2.8g 2.8g S  0.0 38.5 7:09.20 postgres

~7G used, ~6G cached, so how can postgres have 2.8G resident?
Shared memory
- Pages that multiple processes can access
- Resident, shared, and in the page cache
- Not subject to cache flush
- Created via shmget() or mmap()
Shared memory: shmget() example
Shared memory: shmget()

top - 21:08:20 up 147 days, 13:12, 9 users, load average: 0.03, 0.04, 0.00
Tasks: 150 total, 1 running, 149 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.3%us, 1.5%sy, 0.4%ni, 96.7%id, 1.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 1114512k used, 2744180k free, 412k buffers
Swap: 0k total, 0k used, 0k free, 931652k cached

  PID USER    PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
20599 jmiller 20  0 884m 881m 881m S  0.0 23.4 0:06.52 share

Shared memory is in the page cache!
Shared memory: shmget()

top - 21:21:29 up 147 days, 13:25, 9 users, load average: 0.34, 0.18, 0.06
Tasks: 151 total, 1 running, 150 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.6%sy, 0.4%ni, 98.9%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 1099756k used, 2758936k free, 844k buffers
Swap: 0k total, 0k used, 0k free, 914408k cached

  PID USER    PR NI VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
22058 jmiller 20  0 884m 881m 881m S  0.0 23.4 0:05.00 share
22059 jmiller 20  0 884m 881m 881m S  0.0 23.4 0:03.35 share
22060 jmiller 20  0 884m 881m 881m S  0.0 23.4 0:03.40 share

3x processes, but the same resource utilization: about 1GB.

From /proc/meminfo:
Mapped:  912156 kB
Shmem:   902068 kB
Shared memory: mmap() example
Shared memory: mmap()

top - 21:48:06 up 147 days, 13:52, 10 users, load average: 0.21, 0.18, 0.10
Tasks: 154 total, 1 running, 153 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.2%us, 0.7%sy, 0.2%ni, 98.8%id, 0.0%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 3858692k total, 1659936k used, 2198756k free, 3248k buffers
Swap: 0k total, 0k used, 0k free, 1385732k cached

  PID USER    PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
24592 jmiller 20  0 2674m 1.3g 1.3g S  0.0 35.4 0:01.26 mapped
24586 jmiller 20  0 2674m 1.3g 1.3g S  0.0 35.4 0:01.28 mapped
24599 jmiller 20  0 2674m 1.3g 1.3g S  0.0 35.4 0:01.29 mapped

From /proc/meminfo:
Mapped:  1380664 kB
Shmem:       212 kB

Not counted as Shmem, but as Mapped. And the three processes' %MEM adds up to 105%!
A subtle difference between shmget() and mmap()...
Locked shared memory
- Memory from shmget() must be explicitly released by a shmctl(..., IPC_RMID, ...) call
- Process termination doesn't free the memory
- Not the case for mmap()
Locked shared memory: shmget()

top - 11:36:35 up 151 days, 3:41, 3 users, load average: 0.09, 0.10, 0.03
Tasks: 129 total, 1 running, 128 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.4%sy, 0.4%ni, 99.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 3858692k total, 1142248k used, 2716444k free, 3248k buffers
Swap: 0k total, 0k used, 0k free, 934360k cached

  PID USER PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
24376 root 30 10  253m  60m 3724 S  0.0  1.6 0:35.84 chef-client
24399 root 30 10  208m  15m  14m S  0.0  0.4 0:03.22 sssd_nss
 7767 root 30 10 50704 8736 1312 S  1.0  0.2 1886:38 munin-asyncd

- ~900M of cache
- 'echo 3 > /proc/sys/vm/drop_caches' has no impact on the cached value, so it's not filesystem caching
- Processes are consuming far less than ~900M
- From /proc/meminfo: Mapped: 27796 kB, Shmem: 902044 kB
- The culprit: un-attached shared memory segment(s), observable through 'ipcs -a'
Accounting for shared memory is difficult
- top reports memory that can be shared, but might not be
- ps doesn't account for shared memory
- pmap splits mapped vs shared, and reports allocated vs used
- mmap'd files are shared until modified, at which point those pages become private
Linux filesystem cache: what's inside? Do you need it? Detritus? /etc/motd? Important app data?
Linux filesystem cache

We know shared memory is in the page cache, which we can largely understand through /proc:

From /proc/meminfo:
Cached:  367924 kB
...
Mapped:   31752 kB
Shmem:      196 kB

But what about the rest of what's in the cache?
Linux filesystem cache Bad news: We can't just ask “What's in the cache?” Good news: We can ask “Is this file in the cache?”
linux-ftools
https://code.google.com/p/linux-ftools/

[jmiller@cache ~]$ linux-fincore /tmp/big
filename  size       cached_pages  cached_size  cached_perc
--------  ----       ------------  -----------  -----------
/tmp/big  4,194,304  0             0            0.00
---
total cached size: 0

Zero percent cached. Now read ~5% of the file:

[jmiller@cache ~]$ dd if=/tmp/big of=/dev/null bs=1k count=50
[jmiller@cache ~]$ linux-fincore /tmp/big
/tmp/big  4,194,304  60            245,760      5.86
---
total cached size: 245,760

~5% cached.
system tap – cache hits
https://sourceware.org/systemtap/wiki/WSCacheHitRate

[jmiller@stap ~]$ sudo stap /tmp/cachehit.stap
Cache Reads (KB)  Disk Reads (KB)  Miss Rate  Hit Rate
          508236            24056      4.51%    95.48%
               0            43600    100.00%     0.00%
               0            59512    100.00%     0.00%
          686012            30624      4.27%    95.72%
          468788                0      0.00%   100.00%
           17000            63256     78.81%    21.18%
               0            67232    100.00%     0.00%
               0            19992    100.00%     0.00%

Track reads against the VFS and reads against disk, then infer cache hits. But: you have to account for LVM, device mapper, and remote disk devices (NFS, iSCSI), ...
Easy mode - drop_caches
- echo 1 | sudo tee /proc/sys/vm/drop_caches
- frees clean cache pages immediately
- frequently accessed files should be re-cached quickly
- performance impact while the caches repopulate
Filesystem cache contents
- No ability to easily see the full contents of the cache
- mincore(): works, but you have to check every file
- Hard: system tap / dtrace inference
- Easy: drop_caches and observe the impact
Memory: The Big Picture

Virtual memory is backed by swap and physical memory. Physical memory breaks down as:
- Free
- Used
  - Private application memory
  - Kernel caches (SLAB)
  - Buffer cache (block IO)
  - Page cache, which holds both the filesystem cache and shared memory
Thanks! Send feedback to me: joshuamiller01 on gmail