I have an Ubuntu 22.04 Linux machine that runs some services and makes heavy use of the ZFS file system. I see serious discrepancies in memory reporting between various system tools.

For example, the free command shows the following data:

# free -gh
      total  used   free   shared  buff/cache  available
Mem:  125G   119Gi  5.2Gi  1.0M    315Mi       4.5Gi
Swap: 11Gi   0.0Ki  11Gi

The top command shows similar data:

MiB Mem: 128492.1 total, 5506.6 free, 122668.6 used, 316.9 buff/cache

Prometheus node exporter also reports that 96% of memory is taken and shows this in red.


However, htop shows:

Mem: 125G used: 60.4G buffers: 8.27M cache: 59.8G

I know about buffers and cache, but as you can see, free doesn't report a large buffer/cache figure. My understanding is that ZFS reserves memory to cache data, but none of the system utilities except htop report it as cache.


So, my questions are:

  1. It looks like htop memory reporting is more accurate than all other utilities, why?

  2. How do I see how much memory is actually reserved by ZFS?

  3. Is it possible to adjust Prometheus node exporter to report it more accurately (as htop does)?

  4. The memory reserved by ZFS, is it available for other processes to be claimed?

  5. How do I know when my machine will require more RAM? I mean before it starts to use swap space.

2 Answers

It's probably the ZFS ARC (Adaptive Replacement Cache). It's much like the Linux page cache except it's its own thing, with its own kernel memory allocation that gets counted as "used" rather than "buffers/cache", because reasons.

It looks like htop memory reporting is more accurate than all other utilities, why?

htop doesn't shy away from adding platform-specific measurements, and in this case it has code to read ZFS stats specifically. You can even add it as a separate meter in the header.

(Though, on the other hand, procps-ng can already report systemd and even Docker metadata in ps, so it wouldn't be much of a step for them to add ZFS ARC stats to free, but so far they just haven't.)

How do I see how much memory is actually reserved by ZFS?

Read /proc/spl/kstat/zfs/arcstats. (I remember it also had a CLI tool for human use, but can't remember what the tool was; however, htop and other automated reporting tools should get the numbers directly from /proc.)
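For what it's worth, OpenZFS ships the arc_summary and arcstat commands for human-readable output. If you would rather read the file yourself, here is a minimal Python sketch. It assumes the usual kstat layout (two header lines, then "name type data" rows) and the size, c_min and c_max fields; the function names parse_arcstats and gib are made up for this example.

```python
from pathlib import Path

ARCSTATS_PATH = "/proc/spl/kstat/zfs/arcstats"  # path given in the answer


def parse_arcstats(text):
    """Parse kstat text into {name: int}, skipping the header lines."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows look like "<name> <type> <value>". The header rows
        # either have more fields or a non-numeric last field, so they
        # are skipped by this check.
        if len(fields) == 3 and fields[2].isdigit():
            stats[fields[0]] = int(fields[2])
    return stats


def gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30


if __name__ == "__main__":
    path = Path(ARCSTATS_PATH)
    if path.exists():
        stats = parse_arcstats(path.read_text())
        # size = current ARC size, c_min / c_max = its lower / upper target
        for key in ("size", "c_min", "c_max"):
            print(f"{key:6s} {gib(stats.get(key, 0)):8.2f} GiB")
    else:
        print("no arcstats file found; is the ZFS module loaded?")
```

The size value is what the ARC currently holds; the difference between size and c_min is roughly what it could give back under memory pressure.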

The memory reserved by ZFS, is it available for other processes to be claimed?

In theory it is; I've been told that ARC has automatic shrinking just like the page cache. Without having used ZFS recently at all, I'll let someone else fill in whether that feature works in practice.

grawity

If anyone is interested in how htop calculates it:

if (lhost->zfs.enabled != 0 && !Running_containerized) {
    // ZFS does not shrink below the value of zfs_arc_min.
    unsigned long long int shrinkableSize = 0;
    if (lhost->zfs.size > lhost->zfs.min)
        shrinkableSize = lhost->zfs.size - lhost->zfs.min;

    this->values[MEMORY_METER_USED] -= shrinkableSize;
    this->values[MEMORY_METER_CACHE] += shrinkableSize;
    this->values[MEMORY_METER_AVAILABLE] += shrinkableSize;
}
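To make the arithmetic concrete, here is the same adjustment as a small Python sketch. The function name htop_style_adjust is made up for illustration. The used, cache and available figures roughly match the free output in the question, while the ARC size (62 GiB) and minimum (4 GiB) are hypothetical values, not measurements from that machine.

```python
def htop_style_adjust(used, cache, available, arc_size, arc_min):
    """Move the shrinkable part of the ARC from 'used' to 'cache'.

    All arguments share one unit (MiB here). The ARC does not shrink
    below arc_min, so only the part above it counts as reclaimable.
    """
    shrinkable = max(arc_size - arc_min, 0)
    return used - shrinkable, cache + shrinkable, available + shrinkable


if __name__ == "__main__":
    # Values in MiB. used/cache/available approximate the question's
    # free output; arc_size and arc_min are assumed for the example.
    used, cache, available = htop_style_adjust(
        used=121856,      # 119 GiB
        cache=315,
        available=4608,   # 4.5 GiB
        arc_size=63488,   # 62 GiB (hypothetical)
        arc_min=4096,     # 4 GiB (hypothetical)
    )
    print(f"used={used} MiB cache={cache} MiB available={available} MiB")
```

With those assumed inputs, 58 GiB moves out of "used" and into "cache" and "available", which is the kind of gap between free and htop seen in the question.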