
I have a cgroup setup that I use to limit memory usage for rsync and firefox, primarily. It is a manual setup (no systemd involvement) that I run from a Makefile after each boot (yeah I know not great, but systemd is such a hassle to learn).

When I upgraded from Ubuntu 21.04 to 22.04 today, the law of unintended consequences ensured that my cgroup setup no longer works, probably because cgroup v2 is now the default.

What it looks like to me is that the file hierarchy in /sys/fs/cgroup has changed. I used to have, for example for my user reik:

/sys/fs/cgroup/memory/custom/reik-2048M

which I created with a Makefile snippet that generates the following script

MEM=2048M
sudo cgcreate -a reik -t reik -g memory:/custom/reik-$(MEM)
sudo cgset -r memory.limit_in_bytes=$(MEM) custom/reik-$(MEM)
sudo cgset -r memory.memsw.limit_in_bytes=$(MEM) custom/reik-$(MEM)

I could then run a program as follows

% cgexec --sticky -g memory:custom/reik-2048M someprogram

With cgroup v2 the memory.* variable names have changed, which I think I fixed as follows

sudo cgcreate -a reik -t reik -g memory:/custom/reik-2048M
sudo cgset -r memory.max=2048M custom/reik-2048M
sudo cgset -r memory.swap.max=2048M custom/reik-2048M

But now I get an error if I try to run any program, say ls, as follows

% cgexec --sticky -g memory:custom/reik-2048M ls
cgroup change of group failed

One thing that seems different is that the file hierarchy no longer has the /memory/ component in it, that is, the path that gets created is just

/sys/fs/cgroup/custom/reik-2048M

That is as far as I got. I am looking for ideas on what I am doing wrong here. I also looked for docs about how to translate cgroup v1 commands into cgroup v2, but did not find anything very concrete.

reikred

1 Answer


tl;dr:

systemd-run --user -G -P -d -p MemoryMax=2G -p MemorySwapMax=2G ls

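This will create a .service on the fly, with the specified settings and command. (For .service units the same settings would go into the [Service] section.) The --user makes it run under your own service manager rather than the system-wide one. The short options are -G (--collect, clean up the unit afterwards), -P (--pipe, connect the command to your terminal's stdin/stdout), -d (--same-dir, keep the current working directory) and -p (--property).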
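For a persistent unit instead of a transient one, the same two settings can be written into a unit file. The sketch below only prints such a file; the unit name limited-ls.service and the Description text are made up for illustration, and /bin/ls stands in for whatever you actually want to run.

```shell
#!/bin/sh
# Print a minimal user unit that applies the same limits as the
# systemd-run one-liner above. Nothing is installed by this function.
print_unit() {
    cat <<'EOF'
[Unit]
Description=Example command with a memory cap (hypothetical unit)

[Service]
ExecStart=/bin/ls
MemoryMax=2G
MemorySwapMax=2G
EOF
}

print_unit
```

Redirecting the output to ~/.config/systemd/user/limited-ls.service, then running systemctl --user daemon-reload and systemctl --user start limited-ls.service, should give roughly the same result as the transient unit.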

A predefined cgroup can be created as ~/.config/systemd/user/foo.slice and used via --slice with systemd-run (or Slice= in real .service units):

[Slice]
MemoryMax=2G
MemorySwapMax=2G

systemd-run --user --collect --slice=foo.slice --pty --shell

(Note about systemd's slice naming: Dashes are hierarchy separators, i.e. foo-bar.slice is a child of foo.slice.)
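To make the naming rule concrete, here is a small sketch that expands a slice name into the chain of nested slices it implies. slice_to_path is a helper written for this illustration, not a systemd tool, and it does not handle the root slice (-.slice).

```shell
#!/bin/sh
# Expand a slice unit name into its nested path, treating each dash
# as a hierarchy separator (foo-bar.slice lives inside foo.slice).
slice_to_path() {
    name=${1%.slice}          # strip the ".slice" suffix
    path=""
    prefix=""
    old_ifs=$IFS
    IFS=-                     # split the name on dashes
    for part in $name; do
        prefix=${prefix:+$prefix-}$part
        path=${path:+$path/}$prefix.slice
    done
    IFS=$old_ifs
    printf '%s\n' "$path"
}

slice_to_path foo-bar.slice   # prints foo.slice/foo-bar.slice
```

For a user unit, that path is relative to your own service manager's cgroup rather than to the root of /sys/fs/cgroup.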

"What it looks like to me is that the file hierarchy in /sys/fs/cgroups has changed. […] One thing that seems different is that the file hierarchy no longer has the /memory/ component in it, that is, the path that gets created is just"

Yes, that's a big part of why it's cgroups "v2". The new cgroups system no longer has individual per-controller trees; there is a single tree for everything, and each cgroup enables specific controllers for its children through its cgroup.subtree_control file, with cgroup.controllers listing which ones are available (cgcreate does this automatically).
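If you want to check which controllers are in play at a given level, the kernel exposes two files in every v2 cgroup directory: cgroup.controllers (what is available there) and cgroup.subtree_control (what is enabled for its children). A minimal sketch for reading them, where show_controllers is just a helper name chosen for this example:

```shell
#!/bin/sh
# Show the controllers available in a cgroup v2 directory and the ones
# it passes on to its children. Defaults to the root of the hierarchy.
show_controllers() {
    dir=$1
    if [ -r "$dir/cgroup.controllers" ]; then
        printf 'available: %s\n' "$(cat "$dir/cgroup.controllers")"
    else
        printf 'available: (no cgroup.controllers here, not a v2 cgroup?)\n'
    fi
    if [ -r "$dir/cgroup.subtree_control" ]; then
        printf 'enabled for children: %s\n' "$(cat "$dir/cgroup.subtree_control")"
    fi
}

show_controllers "${1:-/sys/fs/cgroup}"
```

Running it against /sys/fs/cgroup/custom should tell you whether "memory" actually made it into the subtree that holds reik-2048M.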

In my tests, however, CGROUP_LOGLEVEL=debug says that the error message appears because the kernel doesn't allow the PID to be moved into the new cgroup. I'm not 100% sure why, but most likely you're failing the "should be authorized to migrate to the common ancestor" rule in cgroup.c: the only common ancestor in this case is the root / cgroup, which you have no permissions for.

(systemd-run doesn't exec the process directly – instead it asks the systemd --user process to execute it, so both the source and destination cgroups are under the [...]/user@1000.service common ancestor which is owned by your UID and therefore passes the migration check.)
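The "common ancestor" in that rule is simply the longest shared prefix of the source and destination cgroup paths. The sketch below computes it for two paths; common_ancestor is a helper made up for this answer, and the /example/... paths are invented placeholders, not what your system will show (cat /proc/self/cgroup prints the real source path of your shell).

```shell
#!/bin/sh
# Print the deepest cgroup that is an ancestor of both given paths.
common_ancestor() {
    rest_a=${1#/}
    rest_b=${2#/}
    common=""
    while [ -n "$rest_a" ] && [ -n "$rest_b" ]; do
        head_a=${rest_a%%/*}          # first remaining component
        head_b=${rest_b%%/*}
        [ "$head_a" = "$head_b" ] || break
        common=$common/$head_a
        # drop the matched component; empty when nothing is left
        case $rest_a in */*) rest_a=${rest_a#*/} ;; *) rest_a="" ;; esac
        case $rest_b in */*) rest_b=${rest_b#*/} ;; *) rest_b="" ;; esac
    done
    printf '%s\n' "${common:-/}"
}

# A login shell moving into the cgcreate-made group: only "/" is shared.
common_ancestor /example/session.scope /custom/reik-2048M

# Two units started by the same user manager share its cgroup.
common_ancestor /example/user@1000.service/app.slice/a.service \
                /example/user@1000.service/foo.slice/b.service
```

The first call prints / and the second prints /example/user@1000.service; whoever performs the move needs write access to cgroup.procs in that directory.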


This leads to another difference worth mentioning: in the v2 model, a non-root cgroup that hands controllers down to its children can hold either processes or child cgroups, but not both at once, so you couldn't actually move any processes into such an ancestor cgroup. The previous paragraphs use "authorized to migrate" strictly in the sense of having write permission to the common ancestor's cgroup.procs.

grawity