163

Before actually asking, just to be clear: yes, I know about disk cache, and no, it is not my case :) Sorry for the preamble :)

I'm using CentOS 5. Every application in the system is swapping heavily, and the system is very slow. When I do free -m, here is what I get:

             total       used       free     shared    buffers     cached
Mem:          3952       3929         22          0          1         18
-/+ buffers/cache:       3909         42
Swap:        16383         46      16337

So I actually have only 42 MB to use! As far as I understand, the -/+ buffers/cache line already excludes the disk cache, so I really do have only 42 MB, right? I thought I might be wrong, so I tried switching off the disk caching, and it had no effect: the picture remained the same.
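
For reference, by "switch off the disk caching" I mean dropping the page cache, which is typically done with something like:

sync
echo 3 > /proc/sys/vm/drop_caches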

So I decided to find out what is using all my RAM, and I used top for that. But, apparently, it reports that no process is using my RAM. The only notable process in my top is MySQL, and it is using 0.1% of RAM and 400 MB of swap. It's the same picture when I try to run other services or applications: they all go into swap, and top shows that almost no memory is used (0.1% at most for any process).

top - 15:09:00 up  2:09,  2 users,  load average: 0.02, 0.16, 0.11
Tasks: 112 total,   1 running, 111 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4046868k total,  4001368k used,    45500k free,      748k buffers
Swap: 16777208k total,    68840k used, 16708368k free,    16632k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  SWAP COMMAND
 3214 ntp       15   0 23412 5044 3916 S  0.0  0.1   0:00.00  17m ntpd
 2319 root       5 -10 12648 4460 3184 S  0.0  0.1   0:00.00 8188 iscsid
 2168 root      RT   0 22120 3692 2848 S  0.0  0.1   0:00.00  17m multipathd
 5113 mysql     18   0  474m 2356  856 S  0.0  0.1   0:00.11 472m mysqld
 4106 root      34  19  251m 1944 1360 S  0.0  0.0   0:00.11 249m yum-updatesd
 4109 root      15   0 90152 1904 1772 S  0.0  0.0   0:00.18  86m sshd
 5175 root      15   0 90156 1896 1772 S  0.0  0.0   0:00.02  86m sshd

Restarting doesn't help, and, by the way, is very slow, which I wouldn't normally expect on this machine (4 cores, 4 GB RAM, RAID1).

So, with that, I'm pretty sure it is not the disk cache that is using the RAM, because normally the cache would shrink and let other processes use RAM rather than pushing them into swap.

So, finally, the question: does anyone have any ideas on how to find out which process is actually using the memory so heavily?

Tim
  • 1,777

9 Answers

141

On Linux, inside top you can press the < key to shift the sort column to the left. By default the output is sorted by %CPU, so if you press the key four times you will sort by VIRT, which is the virtual memory size, giving you your answer.

Another way to do this is:

ps -e -o pid,vsz,comm= | sort -n -k 2

This should give you output sorted by the processes' virtual size.

Here's the long version:

ps --everyone --format=pid,vsz,comm= | sort --numeric-sort --key=2
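
If you would rather rank by resident memory instead of virtual size (VIRT also counts pages that are not in RAM at all), the same pattern works with the rss field, for example:

ps -e -o pid,rss,comm= | sort -n -k 2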
Karlson
  • 2,493
123

Show each process's memory in megabytes and the process path:

ps aux  | awk '{print $6/1024 " MB\t\t" $11}'  | sort -n
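
To see only the largest consumers, you can pipe the same command through tail, e.g.:

ps aux  | awk '{print $6/1024 " MB\t\t" $11}'  | sort -n | tail -n 10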
Weslor
  • 103
  • 3
notnull
  • 1,241
14

Just a side note on a server that showed the same symptoms but was still running out of memory. What I ended up finding was a sysctl.conf copied from a box with 32 GB of RAM, set up for a DB, with huge pages configured to 12000. This box only has 2 GB of RAM, so it was assigning all free RAM to the huge pages (it could only create 960 of them). Setting huge pages to 10, since none were used anyway, freed up all of the memory.

A quick check of /proc/meminfo to look for the HugePages_ settings can be a good start to troubleshooting at least one unexpected memory hog.
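
For example, something along these lines shows both the huge page accounting and the configured pool size (vm.nr_hugepages is the standard sysctl key for it):

grep -i huge /proc/meminfo
sysctl vm.nr_hugepages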

10

Make a script called show-memory-usage.sh with content:

#!/bin/sh
ps -eo rss,pid,user,command | sort -rn | head -$1 | awk '{ hr[1024**2]="GB"; hr[1024]="MB";
 for (x=1024**3; x>=1024; x/=1024) {
 if ($1>=x) { printf ("%-6.2f %s ", $1/x, hr[x]); break }
 } } { printf ("%-6s %-10s ", $2, $3) }
 { for ( x=4 ; x<=NF ; x++ ) { printf ("%s ",$x) } print ("\n") }
 '

Make it executable with chmod +x show-memory-usage.sh and call it like this: ./show-memory-usage.sh 10 (10 => show at most 10 lines)

More "human-readable" version:

#!/bin/sh

# This script displays the top memory-consuming processes in a human-readable format.
# It accepts one argument: the number of top processes to display.

# List processes with their memory usage, PID, user, and command line,
# then sort them by memory usage in descending order and pick the top ones
# as specified by the script's argument.
ps -eo rss,pid,user,command | sort -rn | head -$1 | awk '
BEGIN {
    # Define human-readable memory size units.
    hr[1024**2]="GB"; hr[1024]="MB";
}
{
    # Convert the memory usage to a human-readable format.
    for (x=1024**3; x>=1024; x/=1024) {
        if ($1>=x) { printf ("%-6.2f %s ", $1/x, hr[x]); break; }
    }
}
{
    # Print the process ID and user.
    printf ("%-6s %-10s ", $2, $3);
}
{
    # Print the command line, handling commands with spaces.
    for (x=4; x<=NF; x++) { printf ("%s ", $x); }
    print ("\n");  # Ensure each process info is on a new line.
}'

Output Example:

5.54   GB 12783  mysql      /usr/sbin/mysqld
1.02 GB 27582 root /usr/local/cpanel/3rdparty/bin/clamd
131.82 MB 1128 company+ /opt/cpanel/ea-php73/root/usr/bin/php /home/companyde/redesign.company.de/bin/magento queue:consumers:start inventory.mass.update --single-thread --max-messages=10000
131.21 MB 1095 company+ /opt/cpanel/ea-php73/root/usr/bin/php /home/companyde/redesign.company.de/bin/magento queue:consumers:start product_action_attribute.update --single-thread --max-messages=10000
131.19 MB 1102 company+ /opt/cpanel/ea-php73/root/usr/bin/php /home/companyde/redesign.company.de/bin/magento queue:consumers:start product_action_attribute.website.update --single-thread --max-messages=10000
130.80 MB 1115 company+ /opt/cpanel/ea-php73/root/usr/bin/php /home/companyde/redesign.company.de/bin/magento queue:consumers:start exportProcessor --single-thread --max-messages=10000
130.69 MB 1134 company+ /opt/cpanel/ea-php73/root/usr/bin/php /home/companyde/redesign.company.de/bin/magento queue:consumers:start inventory.reservations.update --single-thread --max-messages=10000
130.69 MB 1131 company+ /opt/cpanel/ea-php73/root/usr/bin/php /home/companyde/redesign.company.de/bin/magento queue:consumers:start inventory.reservations.cleanup --single-thread --max-messages=10000
130.69 MB 1107 company+ /opt/cpanel/ea-php73/root/usr/bin/php /home/companyde/redesign.company.de/bin/magento queue:consumers:start codegeneratorProcessor --single-thread --max-messages=10000
130.58 MB 1120 company+ /opt/cpanel/ea-php73/root/usr/bin/php /home/companyde/redesign.company.de/bin/magento queue:consumers:start inventory.source.items.cleanup --single-thread --max-messages=10000

Script explanation

  1. Shebang (#!/bin/sh): Indicates that the script should be executed with the Bourne shell, a common command interpreter.

  2. Process Listing and Sorting (ps -eo rss,pid,user,command | sort -rn | head -$1):

    • The ps command lists processes with specific details: rss (memory usage), pid (process ID), user (the process owner), and command (the command that initiated the process).
    • The output is sorted in reverse numeric order (sort -rn) based on memory usage, ensuring the most resource-intensive processes are listed first.
    • head -$1 limits the output to the top $1 processes, where $1 is the script's input argument.
  3. AWK Script:

    • The AWK script is structured for readability and is divided into distinct parts, each performing a specific function, with comments explaining each part.
    • BEGIN Block: Sets up an associative array hr that maps memory sizes to their units (GB and MB), preparing for memory size conversion.
    • Memory Conversion Loop: Iterates through possible memory sizes (GB, MB), converting the RSS value from kilobytes (the default unit in ps) to a more readable format. It prints the converted value along with its unit.
    • Printing Process Details: Outputs the process ID and the user in a formatted manner, ensuring alignment and readability.
    • Command Line Printing: Handles commands with spaces correctly by iterating from the fourth field to the last ($4 to $NF). This ensures the entire command line that started the process is printed accurately. It concludes with printing a newline character to separate each process's details.
Felipe
  • 2,338
8

In my case the issue was that the server was a VMware virtual machine with the vmw_balloon module enabled:

$ lsmod | grep vmw_balloon
vmw_balloon            20480  0
vmw_vmci               65536  2 vmw_vsock_vmci_transport,vmw_balloon

Running:

$ vmware-toolbox-cmd stat balloon
5189 MB

So around 5 GB of memory was in fact reclaimed by the host. Despite my VM "officially" having 8 GB, in practice much less was available:

$ free
              total        used        free      shared  buff/cache   available
Mem:        8174716     5609592       53200       27480     2511924     2458432
Swap:       8386556        6740     8379816
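
If vmware-toolbox-cmd is not available, the vmw_balloon driver should also expose its statistics through debugfs on reasonably recent kernels (the exact path can vary with the kernel/driver version):

sudo cat /sys/kernel/debug/vmmemctl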
Mitar
  • 319
3

I referenced this and Total memory used by Python process? - Stack Overflow for my answer. With it I now have a tool that totals the memory used by a specific process name (python here).

# Megabyte.
$ ps aux | grep python | awk '{sum=sum+$6}; END {print sum/1024 " MB"}'
87.9492 MB

# Kilobyte.
$ ps aux | grep python | awk '{sum=sum+$6}; END {print sum " KB"}'
90064 KB

Here is my process list:

$ ps aux  | grep python
root       943  0.0  0.1  53252  9524 ?        Ss   Aug19  52:01 /usr/bin/python /usr/local/bin/beaver -c /etc/beaver/beaver.conf -l /var/log/beaver.log -P /var/run/beaver.pid
root       950  0.6  0.4 299680 34220 ?        Sl   Aug19 568:52 /usr/bin/python /usr/local/bin/beaver -c /etc/beaver/beaver.conf -l /var/log/beaver.log -P /var/run/beaver.pid
root      3803  0.2  0.4 315692 36576 ?        S    12:43   0:54 /usr/bin/python /usr/local/bin/beaver -c /etc/beaver/beaver.conf -l /var/log/beaver.log -P /var/run/beaver.pid
jonny    23325  0.0  0.1  47460  9076 pts/0    S+   17:40   0:00 python
jonny    24651  0.0  0.0  13076   924 pts/4    S+   18:06   0:00 grep python
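
Note that the grep python process itself ends up in the list (and in the sum). A variant that selects by command name, and therefore skips grep, could look like this:

ps -C python -o rss= | awk '{sum+=$1} END {print sum/1024 " MB"}'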


2

This also shows the process ID, sorts by MB used, and prints the command (that created the process):

ps aux | awk '{print $6/1024 " MB\t\t" $2 "\t" $11}' | sort -n

prosti
  • 129
2

You can also use the ps command to get more information about processes.

ps aux | less
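
If your ps supports the --sort option (procps does), it can also order the output by resident memory itself, largest first:

ps aux --sort=-rss | less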
Atul
  • 156
0

My Ubuntu server (DISTRIB_RELEASE=18.04) on Hyper-V had most of its memory used, but all processes were fine. (Admittedly I had removed the snapd and unattended-upgr packages, but 95% of the memory was still used.)

The answer is that Hyper-V uses dynamic memory, so it took memory back for the host system's use and Ubuntu flagged it as used.
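
One way to confirm that Hyper-V's balloon (dynamic memory) driver is involved is to look for the hv_balloon module and its kernel messages (module and message names here are from mainline kernels, so they may differ):

lsmod | grep hv_balloon
dmesg | grep -i balloon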

Hope it helps someone.