Linux: find out what process is using all the RAM?

131

43

Before actually asking, just to be clear: yes, I know about disk cache, and no, that is not my case :) Sorry for the preamble :)

I'm using CentOS 5. Every application in the system is swapping heavily, and the system is very slow. When I run free -m, here is what I get:

             total       used       free     shared    buffers     cached
Mem:          3952       3929         22          0          1         18
-/+ buffers/cache:       3909         42
Swap:        16383         46      16337

So I actually have only 42 MB to use! As far as I understand, the -/+ buffers/cache row excludes the disk cache, so I really do have only 42 MB, right? I thought I might be wrong, so I tried switching off disk caching, and it had no effect: the picture remained the same.

So I decided to find out what is using all my RAM, and I used top for that. But apparently it reports that no process is using my RAM. The only notable process in my top is MySQL, but it is using 0.1% of RAM and 400 MB of swap. I see the same picture when I run other services or applications: everything goes to swap, and top shows that memory is not used (0.1% maximum for any process).
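The -/+ buffers/cache row is derived from the Mem row: buffers and cached are subtracted from used and added to free. A minimal sketch of that arithmetic, assuming the six-column layout printed by this older version of free (the function name real_usage is made up for illustration):

```shell
# Hypothetical helper: reads `free -m` output in the old six-column layout
# on stdin and recomputes the "-/+ buffers/cache" figures from the Mem row.
real_usage() {
    awk '$1 == "Mem:" {
        used = $3; free = $4; buffers = $6; cached = $7
        printf "used by applications: %d MB\n", used - buffers - cached
        printf "available to applications: %d MB\n", free + buffers + cached
    }'
}

# Usage: free -m | real_usage
```

For the output above this gives 3910 MB and 41 MB. free itself prints 3909 and 42, presumably because it rounds after computing from finer-grained values.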

top - 15:09:00 up  2:09,  2 users,  load average: 0.02, 0.16, 0.11
Tasks: 112 total,   1 running, 111 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4046868k total,  4001368k used,    45500k free,      748k buffers
Swap: 16777208k total,    68840k used, 16708368k free,    16632k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  SWAP COMMAND
 3214 ntp       15   0 23412 5044 3916 S  0.0  0.1   0:00.00  17m ntpd
 2319 root       5 -10 12648 4460 3184 S  0.0  0.1   0:00.00 8188 iscsid
 2168 root      RT   0 22120 3692 2848 S  0.0  0.1   0:00.00  17m multipathd
 5113 mysql     18   0  474m 2356  856 S  0.0  0.1   0:00.11 472m mysqld
 4106 root      34  19  251m 1944 1360 S  0.0  0.0   0:00.11 249m yum-updatesd
 4109 root      15   0 90152 1904 1772 S  0.0  0.0   0:00.18  86m sshd
 5175 root      15   0 90156 1896 1772 S  0.0  0.0   0:00.02  86m sshd

Restarting doesn't help and, by the way, is very slow, which I wouldn't normally expect on this machine (4 cores, 4 GB RAM, RAID1).

So, with that, I'm pretty sure it is not the disk cache that is using the RAM, because normally the cache would shrink and let other processes use the RAM rather than push them to swap.

So, finally, the question is: does anyone have any idea how to find out which process is actually using the memory so heavily?

Timur

Posted 2012-03-09T14:13:35.847

Reputation: 1 415

Did you ever find the answer to this? – Hackeron – 2015-08-23T12:23:00.167

@Hackeron: OP accepted this answer. I know that answer doesn't address your question, though. I was able to reproduce your issue on one of my servers, and I'm currently researching if there is a way to troubleshoot it.

– Deltik – 2015-08-23T12:49:26.400

@Deltik Ah, ok. Thank you :) - I have 2 servers here that leak all available memory in the space of around 12 hours, let me know if there is anything I can do to help diagnose this. I'm reachable as the nickname "hackeron" on IRC (irc.freenode.org). – Hackeron – 2015-08-23T13:54:10.223

@Hackeron: I wasn't able to find you as "hackeron" on irc.freenode.org. I did create a chatroom for extended discussion here.

– Deltik – 2015-08-23T14:52:57.203

Worth noting that the ZFS in-memory ARC (and/or L2ARC) cache does not show in free -m, but the size of it can be queried on Linux with cat /proc/spl/kstat/zfs/arcstats | grep data_size. – kqr – 2019-02-19T12:28:15.487

In top hit "M" to sort by memory used. You want to look at RES and used for used memory and not at VIRT and free which can be deceptive. See the classic https://www.linuxatemyram.com/

– gaoithe – 2019-09-19T15:47:53.530

Answers

115

On Linux, in top you can press the < key to shift the sort column to the left. By default the display is sorted by %CPU, so pressing the key four times sorts it by VIRT, the virtual memory size, which gives you your answer.

Another way to do this is:

ps -e -o pid,vsz,comm= | sort -n -k 2

This should give you output sorted by each process's virtual size.

Here's the same pipeline with long options where they exist (ps has no long form of -e):

ps -e --format=pid,vsz,comm= | sort --numeric-sort --key=2
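To rank by resident rather than virtual size, the same kind of pipeline can sort on rss instead. A minimal sketch, assuming procps ps; the function name top_rss is hypothetical, and it reads "PID RSS COMMAND" lines from stdin so it can also be tried on saved output:

```shell
# Hypothetical helper: reads "PID RSS COMMAND" lines (RSS in KiB) on stdin
# and prints the N largest by RSS, converted to MB, biggest first.
# Only the first word of the command is printed.
top_rss() {
    n=${1:-10}
    sort -k2,2nr | head -n "$n" |
        awk '{ printf "%8.1f MB  %6s  %s\n", $2 / 1024, $1, $3 }'
}

# Usage (separate -o options suppress each column header unambiguously):
#   ps -e -o pid= -o rss= -o comm= | top_rss 5
```

Separate -o options are used in the usage line because the ps manual warns that a comma-joined list containing = can be parsed differently depending on personality.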

Karlson

Posted 2012-03-09T14:13:35.847

Reputation: 2 163

Slightly modified version to get the processes that occupy RAM, showing the full command: ps -e --format=pid,rss,args | sort --numeric-sort --key=2 – sengs – 2018-09-10T08:16:08.160

You do not want to sort by vsz but by rss as @sengs shows. rss shows actual USED memory not VIRTUAL memory which will be given back by the system. – gaoithe – 2019-09-19T15:45:57.757

That gives me Warning: bad ps syntax, perhaps a bogus '-'? See http://procps.sf.net/faq.html on Ubuntu server 11.10. – Der Hochstapler – 2012-03-09T14:36:58.053

@OliverSalzburg The issue is the -o options. On RHEL4 this works. On RHEL5, ps -e -o pid,vsz,comm= | sort -n -k 2 works. I'll try 11.10 later tonight, but if you find the right sort options before then, please let me know. ps -e -o pid,vsz,comm | sort -n -k 2 might work, but I don't have a place to verify at the moment. – Karlson – 2012-03-09T14:40:18.043

I'm not really familiar with the -ef option. But this seems to produce reasonable output: sudo ps axo pid,vsz,comm=|sort -n -k 2 – Der Hochstapler – 2012-03-09T14:43:49.463

@OliverSalzburg Sorry. Amended (thought I changed it already). It should be ps -e or ps -a – Karlson – 2012-03-09T14:54:26.760

Ty, I like the top suggestion of <, I didn't know that was possible, fedora – SSH This – 2012-06-06T00:29:04.757

78

Show each process's resident memory (RSS) in megabytes along with the process path, sorted so the largest consumers come last.

ps aux  | awk '{print $6/1024 " MB\t\t" $11}'  | sort -n

notnull

Posted 2012-03-09T14:13:35.847

Reputation: 791

8

Welcome to Super User. Can you expand your answer to explain what this code does and how it addresses the problem? Unexplained code is discouraged, because it doesn't teach the solution. Thanks.

– fixer1234 – 2016-02-09T22:03:09.773

I'm surprised this answer is downvoted and has a comment asking to explain it. It's short enough that it should be clear what it does (pipes ps aux into awk and then sort), and in the context of the question, it shows which processes are using the most RAM. I think it's a fine answer. – John – 2016-05-10T22:51:34.837

14

Just a side note on a server showing the same symptoms: memory exhaustion with no process accounting for it. What I ended up finding was a sysctl.conf from a box with 32 GB of RAM, set up for a DB with huge pages configured to 12000. This box only has 2 GB of RAM, so it was assigning all free RAM to the huge pages (only 960 of them). Setting huge pages to 10, as none were used anyway, freed up all of the memory.

A quick check of /proc/meminfo to look for the HugePages_ settings can be a good start to troubleshooting at least one unexpected memory hog.
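That check can be sketched as a small helper that multiplies the number of reserved huge pages by the page size. The function name hugepage_reserved is made up for illustration; it reads /proc/meminfo-style text on stdin and relies on the HugePages_Total and Hugepagesize fields:

```shell
# Hypothetical helper: reads /proc/meminfo-style text on stdin and reports
# how much RAM is reserved for huge pages (page count x page size).
hugepage_reserved() {
    awk '$1 == "HugePages_Total:" { pages = $2 }
         $1 == "Hugepagesize:"    { size_kb = $2 }
         END { printf "%d huge pages x %d kB = %d MB reserved\n",
                      pages, size_kb, pages * size_kb / 1024 }'
}

# Usage: hugepage_reserved < /proc/meminfo
```

Assuming a 2048 kB page size (common on x86_64, but not stated above), 960 pages would come to 1920 MB, which would account for nearly all of a 2 GB machine.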

Death Rider

Posted 2012-03-09T14:13:35.847

Reputation: 141

I recently had another server where this was the problem. If your organization has ex-Oracle employees in it, this setting may be your culprit. – fields – 2014-07-15T14:20:42.397

5

In my case the issue was that the server was a VMware virtual server with vmw_balloon module enabled:

$ lsmod | grep vmw_balloon
vmw_balloon            20480  0
vmw_vmci               65536  2 vmw_vsock_vmci_transport,vmw_balloon

Running:

$ vmware-toolbox-cmd stat balloon
5189 MB

So around 5 GB of memory had in fact been reclaimed by the host. Despite my VM "officially" having 8 GB, in practice it had much less:

$ free
              total        used        free      shared  buff/cache   available
Mem:        8174716     5609592       53200       27480     2511924     2458432
Swap:       8386556        6740     8379816

Mitar

Posted 2012-03-09T14:13:35.847

Reputation: 239

2

I referred to this question and to "Total memory used by Python process?" on Stack Overflow; this is my answer. I now have a tool that totals the memory of a specific process (python).

# Total in megabytes (the grep process itself is included in the sum).
$ ps aux | grep python | awk '{sum=sum+$6}; END {print sum/1024 " MB"}'
87.9492 MB

# Total in kilobytes.
$ ps aux | grep python | awk '{sum=sum+$6}; END {print sum " KB"}'
90064 KB

Here is my process list:

$ ps aux  | grep python
root       943  0.0  0.1  53252  9524 ?        Ss   Aug19  52:01 /usr/bin/python /usr/local/bin/beaver -c /etc/beaver/beaver.conf -l /var/log/beaver.log -P /var/run/beaver.pid
root       950  0.6  0.4 299680 34220 ?        Sl   Aug19 568:52 /usr/bin/python /usr/local/bin/beaver -c /etc/beaver/beaver.conf -l /var/log/beaver.log -P /var/run/beaver.pid
root      3803  0.2  0.4 315692 36576 ?        S    12:43   0:54 /usr/bin/python /usr/local/bin/beaver -c /etc/beaver/beaver.conf -l /var/log/beaver.log -P /var/run/beaver.pid
jonny    23325  0.0  0.1  47460  9076 pts/0    S+   17:40   0:00 python
jonny    24651  0.0  0.0  13076   924 pts/4    S+   18:06   0:00 grep python
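The grep process shows up in its own listing, so it is counted in the totals above. A minimal sketch that avoids this by matching only the command column of ps aux; the function name sum_rss is hypothetical:

```shell
# Hypothetical helper: reads `ps aux` output on stdin and sums the RSS column
# (field 6, in KiB) for rows whose command (field 11) matches the pattern.
# Matching on field 11 only means a "grep python" row is not counted.
sum_rss() {
    awk -v pat="$1" '$11 ~ pat { sum += $6 }
                     END { printf "%.1f MB\n", sum / 1024 }'
}

# Usage: ps aux | sum_rss python
```

Fed the process list above, this would add up the four python rows (89396 KB in total) and skip the grep row.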

Chu-Saing Lai

Posted 2012-03-09T14:13:35.847

Reputation: 171

2

You can also use the ps command to get more information about the running processes.

ps aux | less

Atul

Posted 2012-03-09T14:13:35.847

Reputation: 146

Out of curiosity, what is the correct way to exit this command? It shows END once I reach the last line, and it does not kill the process when I press Ctrl+C. – KingsInnerSoul – 2015-11-13T16:09:48.767

@KingsInnerSoul press 'q' – enobayram – 2015-11-16T13:41:55.423

1

Make a script called show-memory-usage.sh with content:

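#!/bin/sh
# List the ten processes with the largest resident set size (RSS).
# ps reports RSS in KiB, so 1024 KiB is shown as MB and 1024^2 KiB as GB.
# ^ is the POSIX awk exponent operator; ** is not accepted by every awk.
ps -eo rss,pid,user,command | sort -rn | head -10 | awk '{ hr[1024^2]="GB"; hr[1024]="MB";
 for (x=1024^2; x>=1024; x/=1024) {
 if ($1>=x) { printf ("%-6.2f %s ", $1/x, hr[x]); break }
 } } { printf ("%-6s %-10s ", $2, $3) }
 { for ( x=4 ; x<=NF ; x++ ) { printf ("%s ",$x) } print ("\n") }
 '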

Felipe

Posted 2012-03-09T14:13:35.847

Reputation: 921

Why? What does this do? How does it work? Don't tell people to run random code; explain its purpose and how it works. – a CVn – 2016-10-21T11:01:20.620

2

Figure I'll explain the code for those that don't understand as it appears to be safe to run, but the downvote may ward off those it would be useful towards. It's running the same command as is in above answers, but it's adding formatting with AWK. I've not personally run the script as I have no use for it, but explaining it helps those in need of some formatting.

– Dooley_labs – 2017-01-04T05:05:39.803

I've read the code and run it. It aligns fields like a table, and formats consumed resident memory with prefixes (like 1.12 GB, 582.79 MB). – Stéphane Gourichon – 2019-03-26T11:17:32.113

0

This also prints the process ID, sorts by MB used, and shows the command that created the process:

ps aux | awk '{print $6/1024 " MB\t\t" $2 "\t" $11}' | sort -n

prosti

Posted 2012-03-09T14:13:35.847

Reputation: 99

0

My Ubuntu server (DISTRIB_RELEASE=18.04) on Hyper-V had most of its memory used, although all processes looked fine. (Admittedly, I had removed the snapd and unattended-upgr packages, but 95% of memory was still used.)

The answer is that Hyper-V has dynamic memory, so the host took the memory back for its own use and Ubuntu flagged it as used.

Hope it helps someone.

Vodyanikov Andrew Anatolevich

Posted 2012-03-09T14:13:35.847

Reputation: 1