
A Java 11 application is crashing in a way that, from my understanding, should be impossible with the settings I have.

The application in question runs on Amazon Linux 2 with Java 11. The server is an Amazon EC2 instance with 4 GB of RAM and no swap space.

This server is dedicated purely to this application; nothing should be running on it other than the application itself, things it requires (such as nginx), and things that monitor it.

In addition, the application starts with the -Xmx and -Xms arguments set to the same value, 2136M, so the JVM should not be requesting more memory from the OS after startup.

The application tends to use about 250 MB of memory under standard load, and about 400–500 MB under unusually high load. This includes the overhead from the servlet container and such. The assignment of over 2 GB of heap is to provide a buffer in the event of a DDoS attempt.

The application crashes after having run for some time, usually at least 24 hours.
From my understanding, this sort of crash should be impossible: with -Xmx and -Xms set to the same value, the JVM shouldn't be requesting more memory from the OS once it has started up.
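
For reference, the JVM's own non-heap allocations (thread stacks, code cache, GC structures, and so on) can be inspected with Native Memory Tracking. A minimal sketch, assuming a standard OpenJDK 11 with jcmd on the PATH; <pid> stands in for the JVM's process id:

    # start the JVM with NMT enabled (adds a small overhead)
    java -XX:NativeMemoryTracking=summary -Xms2136M -Xmx2136M ...

    # while it runs, dump a summary of JVM-internal native allocations
    jcmd <pid> VM.native_memory summary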

Here are some extracts from hs_err_pid###.log:

#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 65536 bytes for committing reserved memory.
# Possible reasons:
#   The system is out of physical RAM or swap space
#   The process is running with CompressedOops enabled, and the Java Heap may be blocking the growth of the native heap
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
#   JVM is running with Zero Based Compressed Oops mode in which the Java heap is
#     placed in the first 32GB address space. The Java Heap base address is the
#     maximum limit for the native heap growth. Please use -XX:HeapBaseMinAddress
#     to set the Java Heap base and to place the Java Heap above 32GB virtual address.
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (os_linux.cpp:2709), pid=2100, tid=2113
#
# JRE version: OpenJDK Runtime Environment (11.0+28) (build 11+28)
# Java VM: OpenJDK 64-Bit Server VM (11+28, mixed mode, tiered, compressed oops, g1 gc, linux-amd64)
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#

---------------  S U M M A R Y ------------

Command Line: -Xmx2136M -Xms2136M -javaagent:/opt/jetty/newrelic/newrelic.jar -Djetty.home=/opt/jetty -Djetty.base=/opt/jetty-base -Djava.io.tmpdir=/tmp /opt/jetty/start.jar jetty.state=/opt/jetty-base/jetty.state jetty-started.xml start-log-file=/{REDACTED}.log

Host: Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz, 2 cores, 3G, Amazon Linux release 2 (Karoo)
Time: Tue Apr  9 16:22:38 2019 UTC elapsed time: 132340 seconds (1d 12h 45m 40s)

---------------  T H R E A D  ---------------

Current thread (0x00007f6f4884c800):  JavaThread "C2 CompilerThread0" daemon [_thread_in_vm, id=2113, stack(0x00007f6f01330000,0x00007f6f01431000)]


Current CompileTask:
C2:132341001 41817       4       com.newrelic.agent.deps.org.apache.http.impl.client.HttpClientBuilder::build (1754 bytes)
---------------  S Y S T E M  ---------------

OS:Amazon Linux release 2 (Karoo)
uname:Linux 4.14.104-95.84.amzn2.x86_64 #1 SMP Sat Mar 2 00:40:20 UTC 2019 x86_64
libc:glibc 2.26 NPTL 2.26
rlimit: STACK 8192k, CORE 0k, NPROC 4096, NOFILE 4096, AS infinity, DATA infinity, FSIZE infinity
load average:0.00 0.03 0.00

/proc/meminfo:
MemTotal:        3978224 kB
MemFree:          103460 kB
MemAvailable:          0 kB
Buffers:               0 kB
Cached:             3744 kB
SwapCached:            0 kB
Active:          3795996 kB
Inactive:           2028 kB
Active(anon):    3794792 kB
Inactive(anon):      224 kB
Active(file):       1204 kB
Inactive(file):     1804 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                36 kB
Writeback:             0 kB
AnonPages:       3794564 kB
Mapped:             2504 kB
Shmem:               452 kB
Slab:              32256 kB
SReclaimable:      13572 kB
SUnreclaim:        18684 kB
KernelStack:        3104 kB
PageTables:        12828 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1989112 kB
Committed_AS:    2910992 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      126952 kB
DirectMap2M:     4005888 kB
DirectMap1G:           0 kB


/proc/sys/kernel/threads-max (system-wide limit on the number of threads):
30799


/proc/sys/vm/max_map_count (maximum number of memory map areas a process may have):
65530


/proc/sys/kernel/pid_max (system-wide limit on number of process identifiers):
32768


container (cgroup) information:
container_type: cgroupv1
cpu_cpuset_cpus: 0-1
cpu_memory_nodes: 0
active_processor_count: 2
cpu_quota: -1
cpu_period: 100000
cpu_shares: -1
memory_limit_in_bytes: -1
memory_and_swap_limit_in_bytes: -1
memory_soft_limit_in_bytes: -1
memory_usage_in_bytes: 3889819648
memory_max_usage_in_bytes: 0


CPU:total 2 (initial active 2) (2 cores per cpu, 2 threads per core) family 6 model 85 stepping 4, cmov, cx8, fxsr, mmx, sse, sse2, sse3, ssse3, sse4.1, sse4.2, popcnt, avx, avx2, aes, clmul, erms, rtm, 3dnowpref, lzcnt, ht, tsc, tscinvbit, bmi1, bmi2, adx, fma
CPU Model and flags from /proc/cpuinfo:
model name      : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke

Memory: 4k page, physical 3978224k(103460k free), swap 0k(0k free)

vm_info: OpenJDK 64-Bit Server VM (11+28) for linux-amd64 JRE (11+28), built on Aug 22 2018 18:55:06 by "mach5one" with gcc 7.3.0

END.
Michael Long

2 Answers


After fully analyzing what happened:

The key here is -Xmx and -Xms being set to the same value, meaning the JVM will not grow the Java heap after startup. So the failure had to come from allocating something OTHER than standard heap memory.

It turned out that a Java library (Conscrypt) used native methods, and one of those native methods had a memory leak.

Because the leak was in native code, the leaked memory was not bounded by the -Xmx and -Xms settings, which only cap the Java heap.
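
A leak like this does not show up in heap dumps, but it can be confirmed from outside the JVM by watching the gap between the process's resident set size and the Java heap grow over time. A rough sketch, assuming jcmd from the same JDK is available and <pid> stands in for the JVM's process id:

    # resident set size of the whole process (heap + native + everything else)
    grep VmRSS /proc/<pid>/status

    # current Java heap usage, for comparison
    jcmd <pid> GC.heap_info

If RSS keeps climbing while heap usage stays flat, the growth is coming from native allocations.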

Michael Long

mmap failures mean the (Linux) kernel failed to allocate memory, often in an out-of-memory condition. It does not mean that the (JVM) process exceeded its own memory limits.

things required by the application (such as nginx), and things that monitor the application

How much memory are these using? You can measure precisely by isolating them in their own cgroups (a container, or a systemd slice as sketched below).
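
For example, a monitoring agent can be launched in its own transient systemd unit so the kernel accounts for its memory separately. A sketch, assuming systemd-run is available; /usr/bin/some-agent is a hypothetical placeholder for one of your monitoring processes:

    # run the agent in its own cgroup with memory accounting enabled
    systemd-run --unit=agent-probe -p MemoryAccounting=yes /usr/bin/some-agent

    # read back its current memory usage
    systemctl show agent-probe --property=MemoryCurrent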

The JVM error gave you a list of possible solutions; work through them.

  • "Reduce memory load" means look at everything on the host to evaluate memory consumption. /proc/meminfo there does show most of your nearly 4 GB as active anonymous pages. Definitely not 2 GB free.

  • Java heap options are listed earlier because they tend to be the largest and most commonly tuned. You've looked at them already, but consider reducing heap to make space for other things on this host.

  • The rest of the Java tuning is a bit more exotic. But it is worth remembering that not all Java memory usage is heap.

  • Maybe just throw more memory at it, if you do not want to spend time optimizing now.
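
As a starting point for that audit, the following (standard procps tools, nothing host-specific) lists the largest resident consumers:

    # processes sorted by resident set size, largest first
    ps -eo pid,rss,comm --sort=-rss | head -15

    # overall picture, same data as /proc/meminfo
    free -m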

John Mahowald
  • > mmap failures mean the (Linux) kernel failed to allocate memory, often in an out-of-memory condition. It does not mean that the (JVM) process exceeded its own memory limits. But how can this happen when Xmx and Xms are set to the same value? The JVM shouldn't be requesting memory be allocated except at startup time. – Michael Long Apr 10 '19 at 12:00
  • @MichaelLong Because JVM is **not** the process exceeding the memory limits, as the quote you just pasted says. – Joe Apr 10 '19 at 14:50
  • Review memory utilization for *everything on the host*. Start with a process list sorted by resident set size: `top -o RES` – John Mahowald Apr 10 '19 at 16:45