I had an issue with Docker installed with snap, so I moved to the apt package. This was on a production server. To keep downtime low, I did the following:

  1. Removed the containers running under the snap Docker
  2. Removed /snap/bin from PATH
  3. Installed docker as recommended here
  4. Rebuilt and started the containers
  5. Disabled the snap Docker with sudo snap stop docker and sudo snap remove docker

Everything was OK. The next day, I tried to restart the containers used for monitoring, but running sudo docker ps -a raised the following error:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

The daemon is running:

root       42709  0.2  0.3 2344140 54276 ?       Ssl  Sep07   5:30 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
root       42868  0.0  0.0 1813868 5944 ?        Sl   Sep07   0:16 /usr/bin/docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 8072 -container-ip 172.19.0.3 -container-port 8072

How can I recover control of the docker daemon?

2 Answers

This assumes Docker Engine is installed in a way similar to the official doc.

The reason

In your running dockerd,

/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
                 ^^^^^^^

(Marker ^^^ is added by me to point to the position within a line, it's not a part of the shell output.)

According to the daemon socket option doc, -H fd:// means the daemon listens on a file descriptor handed to it by systemd (socket activation) instead of creating the socket itself. In this case there is no /var/run/docker.sock socket file, but the docker CLI still tries to connect to the daemon through that file by default. That is where the problem comes from.
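To see which endpoints a running daemon was asked to listen on, the -H values can be pulled out of its command line. This is only a minimal sketch: list_daemon_hosts is a hypothetical helper name, not a Docker command, and it only handles the "-H value", "--host value" and "--host=value" spellings.

```shell
# Print each -H / --host value found in a dockerd command line, one per line.
# list_daemon_hosts is a hypothetical helper, not part of Docker.
list_daemon_hosts() {
    prev=""
    # $1 is left unquoted on purpose so the command line is split into words.
    for word in $1; do
        case "$prev" in
            -H|--host) printf '%s\n' "$word" ;;
        esac
        case "$word" in
            --host=*) printf '%s\n' "${word#*=}" ;;
        esac
        prev=$word
    done
}

# Example with the command line from the question.
list_daemon_hosts "/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock"
```

On a Linux host with procps, something like list_daemon_hosts "$(ps -o args= -C dockerd)" should feed it the live command line. If the only value printed is fd://, the daemon relies on systemd for its socket.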

The solution

In my case, the Docker daemon is started by systemd as a service. You can find the path of the service file with the systemctl command, for example (the ^^^ marker is added by me to point to a position within the line; it is not part of the shell output):

ubuntu-linux-22-04-desktop:~$ sudo systemctl status docker

● docker.service - Docker Application Container Engine
     Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     Active: active (running) since Fri 2023-05-19 23:59:31 CST; 55s ago
...

Then edit the line in that file that starts dockerd:

sudo vim /lib/systemd/system/docker.service

In the opened file, find the line that starts with ExecStart=/usr/bin/dockerd

[Unit]
Description=Docker Application Container Engine
...

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock

...

Change the -H argument of the dockerd command to use a Unix socket instead of fd, so that the line becomes:

ExecStart=/usr/bin/dockerd -H unix:///var/run/docker.sock --containerd=/run/containerd/containerd.sock
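If you would rather preview the change before touching the unit file, the same substitution can be sketched as a filter. This is a minimal sketch under assumptions: rewrite_execstart is a hypothetical helper name, and it only rewrites ExecStart= lines that contain exactly "-H fd://".

```shell
# Read unit-file text on stdin and print it with "-H fd://" replaced by the
# Unix socket on ExecStart= lines only. Nothing is modified on disk.
# rewrite_execstart is a hypothetical helper, not a systemd or Docker tool.
rewrite_execstart() {
    sed -e '/^ExecStart=/ s|-H fd://|-H unix:///var/run/docker.sock|'
}

# Example with the line shown above.
printf '%s\n' 'ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock' \
    | rewrite_execstart
```

Piping the real file through it, for example rewrite_execstart < /lib/systemd/system/docker.service, lets you compare the result with the original before editing by hand.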

Save the file (in vim, in command mode, type ZZ), then reload the systemd configuration:

sudo systemctl daemon-reload

Then, restart docker daemon

sudo systemctl restart docker

After docker daemon restart finishes, you should be able to see the socket file

ll /var/run/*.sock

The docker CLI should work now. Try something like:

docker ps
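The socket check can also be scripted, which helps on systems where the ll alias is not defined. A minimal sketch; is_unix_socket is a hypothetical helper name, and the path is the default one from the error message in the question.

```shell
# Return success only if the given path exists and is a Unix socket.
# is_unix_socket is a hypothetical helper name.
is_unix_socket() {
    [ -S "$1" ]
}

sock=/var/run/docker.sock
if is_unix_socket "$sock"; then
    echo "$sock is present"
else
    echo "$sock is missing; check 'systemctl status docker' and 'journalctl -u docker'"
fi
```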

Hope this helps solve your problem.

Han Ye

For people like me who don't like editing files installed by package managers and/or want to avoid the hassle of re-editing after any update, I found this SO answer which pairs well with Han Ye's answer: https://stackoverflow.com/a/46204391/10464267

Edit: (thanks for the nice advice in comments! ❤️)
As mentioned in the above link, systemd supports "drop-in" files, which allow changing the default behaviour of any unit, including the ones installed by your package manager.

The simplest way to do so is to use the following command:

sudo systemctl edit docker.service --drop-in=fix-sockets

This will open a file with your default editor containing

### Editing /etc/systemd/system/docker.service.d/docker.conf
### Anything between here and the comment below will become the contents of the drop-in file

### Edits below this comment will be discarded

### /usr/lib/systemd/system/docker.service
# ...

where # ... is the actual content of the original unit you're overriding, commented.

You will need to edit this file to look like this:

### Editing /etc/systemd/system/docker.service.d/docker.conf
### Anything between here and the comment below will become the contents of the drop-in file

[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H unix:///var/run/docker.sock -H fd:// --containerd=/run/containerd/containerd.sock

### Edits below this comment will be discarded

Save, exit, and that's it! The drop-in is created, and the unit should be automatically reloaded with your new overrides taken into account.
If that is not the case, or if the running daemon still uses the old command line, you can force it by running these two commands in order:

sudo systemctl daemon-reload
sudo systemctl restart docker.service

Notes:

  • I chose to call the drop-in "fix-sockets", but you can name it anything you like.
  • Do not omit the first empty ExecStart=: it is necessary to erase the original service's ExecStart= directive.
  • This answer also implements Eyad Ahmed's comment, listening to both the unix:// and fd:// sockets in order to avoid breaking any existing socket.
  • You can also create drop-ins manually by creating /etc/systemd/system/<unit>.d/<drop-in>.conf, where <unit> is the name of the unit (e.g. docker.service) and <drop-in> is the name of the drop-in (e.g. fix-sockets), then running the two aforementioned commands to force a reload.
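The manual route from the last note can be sketched as a small script. This is a minimal sketch under assumptions: write_dropin is a hypothetical helper name, and the ExecStart= line is copied from this answer, so check it against your own unit file first. The example writes into a scratch directory so nothing under /etc is touched.

```shell
# write_dropin BASE_DIR: create BASE_DIR/docker.service.d/fix-sockets.conf.
# write_dropin is a hypothetical helper; the ExecStart= line is the one used
# in this answer and may differ from the one in your unit file.
write_dropin() {
    dropin_dir="$1/docker.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/fix-sockets.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd -H unix:///var/run/docker.sock -H fd:// --containerd=/run/containerd/containerd.sock
EOF
}

# Dry run into a scratch directory so the result can be inspected first.
scratch=$(mktemp -d)
write_dropin "$scratch"
cat "$scratch/docker.service.d/fix-sockets.conf"
```

To apply it for real, you would run the function as root with /etc/systemd/system as the base directory, then run the two commands above (daemon-reload and restart).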