We're in the middle of a project to move our infrastructure from a co-lo situation into Amazon EC2, and we've noticed some strange memory characteristics of the processes in our setup. Without going into too much detail about the specifics of our processes, we've noticed that on our EC2 instances "top" shows processes using a lot of swap space -- in fact, much more than the total swap available, or (if you add it all up) more than the available disk.
Here's a sample top output:
Mem: 7136868k total, 5272300k used, 1864568k free, 256876k buffers
Swap: 1048572k total, 0k used, 1048572k free, 2526504k cached
  PID USER   PR  NI  VIRT   RES   SHR S %CPU %MEM    TIME+  SWAP COMMAND
 4121 jboss  20   0 5913m  603m   14m S  0.7  8.7  3:59.90  5.2g java
22730 root   20   0 2394m  4012  1976 S  2.0  0.1  4:20.57  2.3g PassengerHelper
20564 rails  20   0 2539m  220m  9828 S  0.3  3.2  0:23.58  2.3g java
 1423 nscd   20   0  877m  1464   972 S  0.0  0.0  0:03.89  876m nscd
You can see, for instance, that jboss is reportedly using 5.2 GB of swap space, which is definitely impossible since there's only 1 GB of swap allocated and none of it is in use (probably because there's still 1.8 GB of RAM free).
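One thing that jumps out at me, and makes me suspect the column itself rather than the kernel, is that in every row the SWAP value is almost exactly VIRT minus RES (e.g. 5913m - 603m comes out to roughly 5.2g for the jboss process). To get what I'd assume are the real per-process swap numbers, I've been checking things like the following -- PID 4121 is just the jboss process from the output above, and I'm assuming the smaps/VmSwap fields behave normally on the Amazon kernel:

# Sum the per-process "Swap:" entries from smaps; as far as I know this
# reflects pages actually written out to swap, unlike top's SWAP column.
awk '/^Swap:/ {sum += $2} END {print sum " kB"}' /proc/4121/smaps

# On kernels >= 2.6.34 there is also a single VmSwap line in status:
grep VmSwap /proc/4121/status

If those report essentially nothing swapped, that would suggest the column is just misleading rather than the system actually over-committing swap.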
And here are the results of uname -a:
Linux xxx.yyy.zzz 2.6.35.14-106.53.amzn1.x86_64 #1 SMP Fri Jan 6 16:20:10 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
We're running an AMI based on the default Amazon Linux AMI (Amazon Linux AMI release 2011.09, which is partly RHEL 5 and partly RHEL 6) with not too many customizations and definitely no kernel-level customizations.
Something here tells me that on this particular kernel/distribution, the reporting of swap or maybe even total memory usage isn't what it appears to be...
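Just to rule out a genuine accounting problem rather than a display quirk, I was also planning to compare the kernel's system-wide swap counters against the sum of every process's smaps entries, along these lines (run as root so all of /proc is readable; the numbers won't match exactly since shared pages can be counted more than once, but I'd expect them to be in the same ballpark):

# System-wide swap accounting straight from the kernel:
grep -E '^Swap(Total|Free|Cached)' /proc/meminfo

# Sum of the per-process "Swap:" entries across all processes; cat swallows
# errors from processes that exit mid-scan or that we can't read.
cat /proc/[0-9]*/smaps 2>/dev/null | awk '/^Swap:/ {sum += $2} END {print sum " kB"}'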
Any help would be appreciated!