2

Running Ubuntu 12.04.3 LTS with 32 cores 244GB. Its the Amazon EC2 memory instance the big one and Java 1.7u25

My java process is running with -Xmx226g

I'm trying to create a really large local cache using CQEngine and so far its blazingly fast with 30,000,000 records. Of course I will add an eviction policy that will allow garbage collection to clean up old objects evicted, but really trying to push the limits here :)

When looking at jvisualvm, the total heap is at about 180GB which dies 40GB to soon. I should be able to squeeze out a bit more.

Not that I don't want the kernel to kill a process if it runs out of resources but I think it's killing it to early and want to squeeze the mem usage as much as possible.

The ulimit output is as follows...

ubuntu@ip-10-156-243-111:/var/log$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 1967992
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1967992
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

The kern.log output is...

340 total pagecache pages
0 pages in swap cache
Swap cache stats: add 0, delete 0, find 0/0
Free swap  = 0kB
Total swap = 0kB
63999984 pages RAM
1022168 pages reserved
649 pages shared
62830686 pages non-shared
[ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
[  505]     0   505     4342       93   9       0             0 upstart-udev-br
[  507]     0   507     5456      198   2     -17         -1000 udevd
[  642]     0   642     5402      155  28     -17         -1000 udevd
[  643]     0   643     5402      155  29     -17         -1000 udevd
[  739]     0   739     3798       49  10       0             0 upstart-socket-
[  775]     0   775     1817      124  25       0             0 dhclient3
[  897]     0   897    12509      152  10     -17         -1000 sshd
[  949]   101   949    63430       91   9       0             0 rsyslogd
[  990]   102   990     5985       90   8       0             0 dbus-daemon
[ 1017]     0  1017     3627       40   9       0             0 getty
[ 1024]     0  1024     3627       41  10       0             0 getty
[ 1029]     0  1029     3627       43   6       0             0 getty
[ 1030]     0  1030     3627       41   3       0             0 getty
[ 1032]     0  1032     3627       41   1       0             0 getty
[ 1035]     0  1035     1083       34   1       0             0 acpid
[ 1036]     0  1036     4779       49   5       0             0 cron
[ 1037]     0  1037     4228       40   8       0             0 atd
[ 1045]     0  1045     3996       57   3       0             0 irqbalance
[ 1084]     0  1084     3627       43   2       0             0 getty
[ 1085]     0  1085     3189       39  11       0             0 getty
[ 1087]   103  1087    46916      300   0       0             0 whoopsie
[ 1159]     0  1159    20490      215   0       0             0 sshd
[ 1162]     0  1162  1063575      263  15       0             0 console-kit-dae
[ 1229]     0  1229    46648      153   4       0             0 polkitd
[ 1318]  1000  1318    20490      211  10       0             0 sshd
[ 1319]  1000  1319     6240     1448   1       0             0 bash
[ 1816]  1000  1816 70102543 62010032   4       0             0 java
[ 1947]     0  1947    20490      214   6       0             0 sshd
[ 2035]  1000  2035    20490      210   0       0             0 sshd
[ 2036]  1000  2036     6238     1444  13       0             0 bash
[ 2179]  1000  2179    13262      463   2       0             0 vi
Out of memory: Kill process 1816 (java) score 987 or sacrifice child
Killed process 1816 (java) total-vm:280410172kB, anon-rss:248040128kB, file-rss:0kB

The kern.log clearly states that it killed my process because it ran out of memory. But like I said I think I can squeeze it a bit more. Is there any settings I need to do to allow me to use the 226GB allocated to JAVA.

user432024
  • 273
  • 3
  • 14
  • The amount of memory that process was utilizing was 248040128Kb at the time of being killed, irrespective of what other utilities say the kernel is accounting for that much of the memory being used by that process. Also it may be useful to post more of the OOM output so I can see what node usage was. – Matthew Ife Dec 28 '13 at 18:03
  • How do I get more of the oom output? – user432024 Dec 28 '13 at 20:27
  • There should be more logs from dmesg related to the out of memory event. – Matthew Ife Dec 28 '13 at 20:27
  • Ok I guess I need to reproduce the issue again! I'll do it next couple days! :) – user432024 Dec 28 '13 at 21:19
  • as @MIfe told memory used was 248040128Kb and if you look the output "Free swap = 0kB Total swap = 0kB" in few words, your java app was using all ram memory and swap – c4f4t0r Dec 28 '13 at 22:50
  • So in reality even if I have 244GB and I allocated to java -Xmx226g it's taking up more then required? How much overhead does java need? – user432024 Dec 29 '13 at 03:02

0 Answers0