8

We were testing our Java application and it was killed by the OOM killer; logs are below.

A couple of points: this is a Java web application. It runs with a 190 GB heap on an EC2 instance with 240 GB of total memory. We were trying the new, experimental ZGC in JDK 11. At the time the application was not under load; it was running in a prod-like test environment. The application does have some schedule-triggered processing that can be memory- and CPU-intensive, but those jobs run often and have not failed.

The application code has not changed significantly from what we run in production. We have also load-tested this build successfully, including a 12-hour soak test, without problems.

Now to the OOM-killer logs. The full logs are below, but one thing I find very odd is this line:

Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.571217] Killed process 4973 (java) total-vm:17181703924kB, anon-rss:9740kB, file-rss:0kB, shmem-rss:0kB

The total VM size is roughly 17 terabytes. That is bizarre to me; there is no way our app would normally even try to ask for that much. I suspect we may have hit a bug in ZGC, or in the interaction between ZGC and our code, but I want to make sure I'm reading this line correctly.

I greatly appreciate any help, full logs for reference:

Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.158902] java invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null),  order=0, oom_score_adj=0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.166556] java cpuset=/ mems_allowed=0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.168975] CPU: 15 PID: 4974 Comm: java Not tainted 4.14.55-62.37.amzn1.x86_64 #1
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.173353] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.177219] Call Trace:
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.178806]  dump_stack+0x5c/0x82
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.180848]  dump_header+0x94/0x21c
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.183087]  ? get_page_from_freelist+0x525/0xba0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.185945]  oom_kill_process+0x213/0x410
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.188518]  out_of_memory+0x296/0x4c0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.190748]  __alloc_pages_slowpath+0x9ef/0xdd0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.193653]  __alloc_pages_nodemask+0x207/0x220
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.196455]  alloc_pages_vma+0x7c/0x1e0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.198924]  shmem_alloc_page+0x65/0xa0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.201277]  ? __radix_tree_insert+0x35/0x1f0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.204022]  ? __vm_enough_memory+0x23/0x130
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.206639]  shmem_alloc_and_acct_page+0x72/0x1b0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.209513]  shmem_getpage_gfp+0x14b/0xb40
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.212102]  ? __set_page_dirty+0xc0/0xc0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.214705]  shmem_fallocate+0x32d/0x440
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.217182]  vfs_fallocate+0x13f/0x260
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.219525]  SyS_fallocate+0x43/0x80
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.221657]  do_syscall_64+0x67/0x100
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.223944]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.227002] RIP: 0033:0x7fcc064ab1f9
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.229295] RSP: 002b:00007fcc071df800 EFLAGS: 00000246 ORIG_RAX: 000000000000011d
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.233706] RAX: ffffffffffffffda RBX: 0000002f80000000 RCX: 00007fcc064ab1f9
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.237896] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.242203] RBP: 00007fcc071df860 R08: 0000000000000005 R09: 0000000000000000
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.246538] R10: 0000002f80000000 R11: 0000000000000246 R12: 0000000000001000
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.250675] R13: 0000002f80000000 R14: 00007fcc00030d18 R15: 0000002f80000000
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.254948] Mem-Info:
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.256313] active_anon:46337469 inactive_anon:15816944 isolated_anon:0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.256313]  active_file:362 inactive_file:1424 isolated_file:0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.256313]  unevictable:0 dirty:87 writeback:0 unstable:0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.256313]  slab_reclaimable:146831 slab_unreclaimable:12432
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.256313]  mapped:49788503 shmem:61099090 pagetables:297088 bounce:0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.256313]  free:264623 free_pcp:31 free_cma:0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.276739] Node 0 active_anon:185857908kB inactive_anon:62759744kB active_file:1448kB inactive_file:5696kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:199154012kB dirty:348kB writeback:0kB shmem:244396360kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.293799] Node 0 DMA free:15904kB min:4kB low:16kB high:28kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.308882] lowmem_reserve[]: 0 1928 245809 245809
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.311710] Node 0 DMA32 free:976000kB min:528kB low:2500kB high:4472kB active_anon:0kB inactive_anon:1017428kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:2080768kB managed:1995880kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.327796] lowmem_reserve[]: 0 0 243881 243881
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.330558] Node 0 Normal free:66588kB min:67048kB low:316780kB high:566512kB active_anon:187190988kB inactive_anon:60409332kB active_file:1448kB inactive_file:5696kB unevictable:0kB writepending:0kB present:253755392kB managed:249734404kB mlocked:0kB kernel_stack:7280kB pagetables:1188352kB bounce:0kB free_pcp:124kB local_pcp:120kB free_cma:0kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.348446] lowmem_reserve[]: 0 0 0 0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.350789] Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15904kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.358404] Node 0 DMA32: 1*4kB (M) 1*8kB (U) 2*16kB (UM) 2*32kB (UE) 1*64kB (U) 1*128kB (U) 2*256kB (UE) 3*512kB (UME) 3*1024kB (UME) 2*2048kB (UM) 236*4096kB (M) = 976172kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.367499] Node 0 Normal: 2527*4kB (UME) 778*8kB (UME) 293*16kB (UME) 104*32kB (UME) 51*64kB (UME) 24*128kB (UME) 5*256kB (UM) 2*512kB (UM) 1*1024kB (U) 3*2048kB (UME) 8*4096kB (M) = 72924kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.377787] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.382919] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.387847] 61100730 total pagecache pages
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.390368] 0 pages in swap cache
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.392264] Swap cache stats: add 0, delete 0, find 0/0
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.395429] Free swap  = 0kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.397195] Total swap = 0kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.398971] 63963037 pages RAM
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.400873] 0 pages HighMem/MovableOnly
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.403102] 1026490 pages reserved
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.405255] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.410245] [ 2730]     0  2730     2833      194      11       3        0         -1000 udevd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.415382] [ 2848]     0  2848     2812      165      10       3        0         -1000 udevd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.420563] [ 2849]     0  2849     2812      161      10       3        0         -1000 udevd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.425740] [ 3137]     0  3137    27285       57      21       3        0             0 lvmetad
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.430993] [ 3146]     0  3146     6799       48      18       3        0             0 lvmpolld
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.436187] [ 3354]     0  3354     2353      121      10       3        0             0 dhclient
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.441499] [ 3410]     0  3410    13251       98      27       3        0         -1000 auditd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.446792] [ 3431]     0  3431    61878      132      23       3        0             0 rsyslogd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.452140] [ 3451]     0  3451    23291      114      15       3        0             0 irqbalance
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.457538] [ 3461]     0  3461     1630       25       9       3        0             0 rngd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.462533] [ 3478]    81  3478     5461       58      15       3        0             0 dbus-daemon
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.468050] [ 3513]     0  3513     1099       35       8       3        0             0 acpid
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.473222] [ 3609]     0  3609    20134      207      42       3        0         -1000 sshd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.478262] [ 3622]     0  3622    28967      266      60       3        0             0 sshd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.483118] [ 3624]   500  3624    28967      265      59       3        0             0 sshd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.488272] [ 3625]   500  3625    28850      113      13       3        0             0 bash
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.493403] [ 3658]    38  3658    29129      204      29       3        0             0 ntpd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.498490] [ 3702]     0  3702    22395      432      45       3        0             0 sendmail
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.503845] [ 3712]    51  3712    20259      374      40       3        0             0 sendmail
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.509091] [ 3724]     0  3724    30411      149      14       3        0             0 crond
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.514018] [ 3738]     0  3738     4797       42      13       3        0             0 atd
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.518902] [ 3773]     0  3773     1090       24       8       3        0             0 mingetty
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.524491] [ 3775]     0  3775     1627       31       9       3        0             0 agetty
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.529721] [ 3778]     0  3778     1090       24       8       3        0             0 mingetty
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.535029] [ 3781]     0  3781     1090       24       8       3        0             0 mingetty
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.540245] [ 3783]     0  3783     1090       24       8       3        0             0 mingetty
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.545404] [ 3785]     0  3785     1090       24       8       3        0             0 mingetty
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.550610] [ 3787]     0  3787     1090       25       8       3        0             0 mingetty
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.556012] [ 4948]   500  4948    28285       60      12       3        0             0 start_server.sh
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.561798] [ 4973]   500  4973 4295425981     2435      71       5        0             0 java
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.566936] Out of memory: Kill process 4973 (java) score 0 or sacrifice child
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.571217] Killed process 4973 (java) total-vm:17181703924kB, anon-rss:9740kB, file-rss:0kB, shmem-rss:0kB
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.578364] oom_reaper: reaped process 4973 (java), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Update: it looks like the ~17 TB of virtual memory (also visible in top while the application is running) is not actually a problem.

We set up a simple Spring Boot hello-world application with default settings except for ZGC enabled, and the same thing happened. ZGC appears to reserve around 4 TB of address space regardless of heap size, and it also maps the same physical memory at multiple addresses, so this is, apparently, normal. I can find almost no one discussing it, which is surprising.
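
To see this without any of our code involved, a sketch along these lines reproduces it (this is the JDK 11 flag spelling for ZGC; the class name is just a stand-in for any trivial program):

# Run any trivial Java program with the experimental ZGC enabled and check its
# virtual size; the heap here is deliberately tiny.
java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC -Xmx512m HelloWorld &
JVM_PID=$!
sleep 5

# VmSize still shows the multi-terabyte reservation described above, because
# ZGC reserves a huge virtual address range and maps the same physical memory
# at several addresses.
grep VmSize /proc/$JVM_PID/status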

The OOM killer appears to get involved when we try to shut the application down. Given that we're shutting it down anyway, I'm not particularly concerned.

Again, appreciate everyone's assistance.

Taylor
  • Looks like a bug in the logging code? The source code at https://github.com/torvalds/linux/blob/master/mm/oom_kill.c#L862 is what generated the logging line, but the `total-vm` column for your Java process 4973 shows a value of 4295425981 and not the `17181703924kB` shown in the `Killed ...` line. – Andrew Henle Dec 20 '18 at 17:44
  • @Andrew from what I've read, that 4295425981 value is in 4kb pages, which works out to the 17 terabyte figure. – Taylor Dec 20 '18 at 17:46
  • OK, then I'd say you're interpreting the log entry correctly. Seems as if something in your Java process went a bit over-the-top in trying to allocate memory. Maybe turn off memory overcommit entirely and see what blows up? – Andrew Henle Dec 20 '18 at 17:51

3 Answers

10

Out of memory.

Dec 18 23:24:59 ip-10-0-3-36 kernel: [ 775.566936] Out of memory: Kill process 4973 (java) score 0 or sacrifice child

From the process table in the same log:

[ 775.561798] [ 4973] 500 4973 4295425981 2435 71 5 0 0 java

The total_vm column is counted in 4 kB pages, so 4295425981 pages works out to 17181703924 kB, which matches the roughly 17 TB shown in the total-vm field of the Killed line.
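
A quick sanity check of that unit conversion:

# total_vm in the process table is counted in 4 kB pages:
echo $((4295425981 * 4))   # prints 17181703924 (kB), i.e. roughly 17 TB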

Can you debug your memory allocation path? To me it looks like your application has a bad loop somewhere and is consuming every resource available, including any available swap.

yagmoth555
  • Can you elaborate on this: `Can you debug your memory allocation path? To me it looks like your application has a bad loop somewhere and is consuming every resource available, including any available swap.`? – Scott Skiles Oct 22 '20 at 19:49
2
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.214705]  shmem_fallocate+0x32d/0x440
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.217182]  vfs_fallocate+0x13f/0x260
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.219525]  SyS_fallocate+0x43/0x80
Dec 18 23:24:59 ip-10-0-3-36 kernel: [  775.221657]  do_syscall_64+0x67/0x100

Your application process is trying to invoke fallocate on a shmem filesystem. From some quick googling, it looks like ZGC uses fallocate to grab the initial heap memory from an shm filesystem and keeps using fallocate to expand the heap. Such use of the fallocate syscall is rather unusual, so either this is a ZGC bug (as you already suspected) or something else is leaking lots of memory and causing the heap expansion to fail.
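
If you want to confirm that it really is the JVM issuing those calls, tracing the relevant syscalls on the running process is one way to see it (a sketch; replace the pid placeholder with your java process id):

# Follow all JVM threads and log only the shared-memory/fallocate activity;
# the fd being fallocate()d should be the file backing the ZGC heap.
strace -f -p <java-pid> -e trace=memfd_create,ftruncate,fallocate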

I suggest that you configure ZGC to avoid additional runtime allocations by setting Xms and Xmx to the same value. This might not solve your problem if the leak comes from something unrelated, but it at least gives you a better chance of finding the real culprit.
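
Concretely, something like the following (the heap size and jar name are placeholders for your actual launch command):

# Pin the heap size so ZGC does its shmem allocation up front rather than
# growing the heap at runtime; -XX:+AlwaysPreTouch additionally touches the
# pages at startup so a shortage surfaces immediately instead of later.
java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC \
     -Xms190g -Xmx190g -XX:+AlwaysPreTouch \
     -jar your-app.jar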

Note that your overall setup is somewhat risky: ZGC apparently likes to have a lot of contiguous memory, and with a 190 GB heap on a 240 GB machine there may not be a sufficiently big contiguous region to fallocate from. In that case ZGC falls back to picking up smaller memory regions with further fallocate calls (see the description in the linked bug report), and the issue gets obscured again. Enable hugepage support in the JVM (normal hugepages, not transparent hugepages!) and preallocate the hugepages at boot with a kernel argument; using hugepages is advisable at your heap size anyway.
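
A rough sketch of that setup (the sizes are purely illustrative, and the full ZGC large-pages procedure has a few extra steps around hugetlbfs permissions, so check the ZGC documentation):

# Reserve explicit 2 MB hugepages, ideally on the kernel command line at boot
# (hugepagesz=2M hugepages=100000); reserving them after boot may fail once
# memory has fragmented.
echo 100000 > /proc/sys/vm/nr_hugepages

# Then have the JVM back the heap with those explicit hugepages
# (-XX:+UseLargePages, not transparent hugepages):
java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC \
     -Xms190g -Xmx190g -XX:+UseLargePages -jar your-app.jar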

user1643723
  • Thank you for the great advice. We are looking into hugepages. `Xms` and `Xmx` are already the same, and just for context, this is a bare-bones EC2 instance running nothing but Java and our app. Anyway, we made some progress; I'll update the question shortly with details. – Taylor Dec 21 '18 at 14:33
2

The long and short of this turned out to be a native (i.e. non-heap) memory leak. We patched the OS and the leak went from substantial (the process would die within a few days) to virtually nil (it would take a few months to die). The OS patch notes include some bug fixes around memory management, but they're a bit too low-level for me to grok the details.

There's still a slight native memory leak, but it's no longer urgent. We'll continue to monitor, and I'll update this answer if any future patches improve things further.
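
For anyone chasing something similar: one way to keep an eye on the JVM's own non-heap allocations is native memory tracking. It only covers memory the JVM itself allocates, so it won't see a leak inside an arbitrary native library, but it narrows things down (the jar name and pid are placeholders):

# Start the JVM with native memory tracking enabled (small overhead):
java -XX:NativeMemoryTracking=summary -jar your-app.jar

# Take a baseline, then diff against it later to see which JVM subsystem grows:
jcmd <java-pid> VM.native_memory baseline
jcmd <java-pid> VM.native_memory summary.diff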

Taylor