tl;dr
See the "Easy procedure" section below.
Analysis
What you do with ssh -A works as long as the ssh connection runs and the agent runs on your local computer. Your credentials never leave the local computer; this is the whole point of forwarding an agent. Remote processes communicate with the local agent via a tunnel and relay challenges to it. The tunnel and the working agent are crucial; if either of them is terminated, remote processes will not be able to authenticate. This is why everything fails after you disconnect.
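A minimal way to see this (a sketch, assuming the same user@remote used in the commands below and a local agent that already holds your key(s)): while the forwarded connection is up, the remote side can query your local agent, but only for as long as the connection lasts:
# on local: the remote shell reaches the local agent through the forwarded socket
ssh -A user@remote 'ssh-add -l'
# as soon as this command returns (the connection ends), that access is gone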
All the servers require the same key to connect, is there a way to "cache" the credentials on my current screen session and leave screen running in the background?
It is technically possible. You can forward a remote authentication agent to the local computer and "borrow" your local key(s). The core of the method is the fact that the remote agent will remain available to your remote scripts even after you disconnect.
It will be somewhat easier if you can ssh -A back from the remote server to your local computer. This answer introduces two procedures:
- "Easy procedure". You can use it when you can
ssh -A back from the remote server to your local computer.
- "Normal procedure". It should work regardless if there is an SSH server on your local computer.
Easy procedure
This section is deliberately terse. I advise you to read the whole answer and understand what we're doing, even if the easy procedure works for you.
This procedure does not require:
- any local authentication agent,
- any additional local terminal (everything works in one local terminal).
This procedure requires:
- local SSH server,
- ability to start ssh-agent on the remote side (the procedure will start one),
- your key in a local file.
The procedure:
ssh to the remote. Don't use -A:
# on local
ssh user@remote
In this step you can create a tunnel (e.g. -R 2222:localhost:22) that will allow you to ssh back in the next step, if such a tunnel is needed.
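If you do need such a tunnel, the first command could look like this (a sketch; 2222 is just an example port):
# on local
ssh -R 2222:localhost:22 user@remote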
Start ssh-agent on the remote:
# on remote
eval "$(ssh-agent)"
ssh back to the local. Use -A:
# on remote
ssh -A localuser@back-to-local
If you go through the tunnel, the command will be like ssh -A -p 2222 localuser@localhost.
Load your local key(s) to the remote agent:
# back on local
ssh-add
Add non-default key(s) if you like.
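For example, a non-default key could be added like this (a sketch; the path is hypothetical, use your real key file):
# back on local
ssh-add ~/.ssh/my_other_key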
Exit the "back on local" shell:
# back on local
exit
Proceed in the remote shell like you would in your original method just after (local) ssh -A: run screen, the scripts etc.; disconnect.
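A minimal sketch of that step (run-my-scripts is the hypothetical script name used later in this answer):
# on remote
screen                  # opens a new shell inside screen
./run-my-scripts        # inside the screen session
# detach with Ctrl-a d, then disconnect from the remote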
When you reattach later and see everything is done and the agent is no longer needed, terminate the agent:
# on remote
ssh-agent -k
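ssh-agent -k kills the agent pointed to by $SSH_AGENT_PID and prints shell code that unsets the related variables; if you also want your current shell cleaned up, you can eval that output:
# on remote (optional: also unsets SSH_AUTH_SOCK and SSH_AGENT_PID in the current shell)
eval "$(ssh-agent -k)"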
Easy procedure vs normal procedure
In the easy procedure (above) we use an additional ssh -A localuser@back-to-local to forward a remote authentication agent to the local computer for a while.
The normal procedure (below) creates a tunnel embedded in the regular ssh user@remote connection, essentially for the same purpose. There are a few additional technical commands for "plumbing" and maintenance.
Normal procedure
This procedure does not require:
- any local authentication agent,
- any local SSH server.
This procedure requires:
- an additional local terminal (shell),
- ability to start ssh-agent on the remote side (the procedure will start one),
- your key in a local file (so if your original setup relies on a local authentication agent that gets the key from elsewhere, then this answer won't work for you).
Throughout the entire procedure I assume a constant local working directory and a constant remote working directory which I will denote /home/me/ (you need to replace this string with your real remote working directory).
On your local machine, make sure ssh-agent-socket does not exist. It's an arbitrary name and you can change it if you want, but once you choose a name, stick to it. Remove the file if needed:
# on local
rm -f ssh-agent-socket
SSH from your local to the remote. Do not use ssh -A. Use ssh -L to build a tunnel from the local socket to a remote symlink ssh-agent-symlink. It's another arbitrary name, this time for a file on the remote side. It's OK if this file does not exist yet. This is the command:
# on local
ssh -L ./ssh-agent-socket:/home/me/ssh-agent-symlink user@remote
The command should create a local socket named ssh-agent-socket. We will use it in a moment.
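As a side note, if you'd rather not remove a stale ssh-agent-socket by hand each time, ssh can overwrite it for you; a sketch, assuming your OpenSSH supports the StreamLocalBindUnlink option:
# on local
ssh -o StreamLocalBindUnlink=yes -L ./ssh-agent-socket:/home/me/ssh-agent-symlink user@remote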
Assuming the above command worked, now you're in a shell on the remote machine. Start ssh-agent there:
# on remote
eval "$(ssh-agent)"
Still on the remote machine, create a symlink to the socket the newly started agent uses. Note we need to remove any old symlink beforehand:
# on remote
cd /home/me/
rm -f ssh-agent-symlink
ln -s "$SSH_AUTH_SOCK" ssh-agent-symlink
We need this symlink only because I chose to create the tunnel (ssh -L) beforehand, when we didn't know what the location of the socket would be. It's possible to create a tunnel after running ssh-agent and to point it to the right remote socket directly. If you choose to do this then you won't need a symlink, but you will probably need to "transfer" (manually type or copy-paste) the path from remote echo "$SSH_AUTH_SOCK" when creating the tunnel. Thanks to the symlink the code in this answer is more static, so let's stick to it.
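For completeness, that alternative without a symlink could look like this (a sketch only; the socket path is hypothetical, you would copy the real value printed on the remote):
# on remote: note the agent's socket path (copy it by hand)
echo "$SSH_AUTH_SOCK"
# on local, in another terminal: point the tunnel at that exact path
ssh -L ./ssh-agent-socket:/tmp/ssh-XXXXXXXXXX/agent.12345 user@remote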
Open an additional local terminal (shell) and …
Note that at this point you cannot, for example, start screen on the remote, disconnect and continue in the same local terminal. We still need the tunnel to work; ssh must not be disrupted. The easiest method is to really use a separate local terminal in this step.
… and navigate (cd) to the directory where ssh-agent-socket is. Thanks to the tunnel and to the symlink on the remote side, the socket now leads to the remote authentication agent. Any local tool that connects to the socket will ultimately talk to the remote agent. Set the right variable to the right value, so ssh-add (we will use it in a moment) can find the socket:
# on local
export SSH_AUTH_SOCK="$PWD/ssh-agent-socket"
Add your local key(s) to the remote agent. The command below adds keys from default locations; modify the command if needed. It's no different from adding key(s) to a local agent, because in either case the command connects to some socket and it doesn't care where it leads, as long as there is some authentication agent on the other end.
# on local
ssh-add
Verify that the agent holds the right key(s). This command should give you the same output on both machines, because it consults the same agent:
# on local or on remote
ssh-add -l
The key(s) you want should be listed. Note that the command, when run on the remote, will still show you the local path(s) to your key(s). That's OK; the agent stores paths as additional information and it doesn't really know which computer they came from.
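For illustration only, the output might look something like this (bit size, fingerprint, path and key type will differ):
# example output; your values will differ
256 SHA256:AbCdEf0123456789... /home/localuser/.ssh/id_ed25519 (ED25519)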
Cleaning (optional). Remove the remote symlink and the local socket:
# on remote
rm ssh-agent-symlink
# on local
rm ssh-agent-socket
The additional local shell is no longer needed. You may close it now.
You can also reconfigure the still-running ssh and close the tunnel, if you know how, but it doesn't really matter and I won't explain the method here; the tunnel is no longer needed but it may stay.
Proceed in the remote shell like you would in your original method just after (local) ssh -A. The setup is almost identical: you're on the remote server, there is an authentication agent you can reach thanks to the $SSH_AUTH_SOCK environment variable, and the agent holds the right key(s). Everything should work like it did, except…
The difference is that the agent runs on the remote system and does not rely on any SSH tunnel. So if you disconnect after setting up screen and the rest, your processes will still be able to talk to the agent.
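If you want to convince yourself after reconnecting, a quick check inside the screen session (or any shell with the right $SSH_AUTH_SOCK) is:
# on remote, e.g. inside the reattached screen session
ssh-add -l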
When you reattach later and see everything is done and the agent is no longer needed, terminate the agent:
# on remote
ssh-agent -k
Note this will only work in a shell where the right $SSH_AGENT_PID is in the environment, so basically in the screen session you created, not in any new shell in general.
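If you ever lose the shell that knows $SSH_AGENT_PID, a blunter fallback (a sketch; note it kills every ssh-agent you own on that machine, so use it only if this is the only one) is:
# on remote
pkill -u "$USER" ssh-agent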
Leaving the agent running is wrong. Not only is it a no-longer-needed process; it still holds your local key(s), so it's unwise to keep it running longer than necessary.
Consider automating this: if you normally run your remote scripts with ./run-my-scripts, run ./run-my-scripts; ssh-agent -k instead; this will take care of the agent as soon as the scripts finish.
Security considerations
Forwarding a local agent with ssh -A or "borrowing" local keys to a remote agent (like in our case) is more secure than copying local key(s) to the remote system as files. See what man 1 ssh says:
Agent forwarding should be enabled with caution. Users with the ability to bypass file permissions on the remote host (for the agent's UNIX-domain socket) can access the local agent through the forwarded connection. An attacker cannot obtain key material from the agent, however they can perform operations on the keys that enable them to authenticate using the identities loaded into the agent. […]
It's not explicitly stated, but the concern is valid for any authentication agent, not only in the case of ssh -A. It's considered in the context of ssh -A because before ssh -A you usually deal with an agent that loads keys from regular files present in the same system; so if an attacker (e.g. a dishonest admin with root access) can bypass the file permissions of the socket, then they can probably bypass the file permissions of the keys and get the keys in the first place. Access to the keys is more valuable because one can copy a key file and use it later. Access to an agent is less valuable because one can use the agent to authenticate only as long as the agent runs.
In our case a remote agent holds your local keys that don't exist as files in the remote OS, so the situation is similar to regular ssh -A even in the "normal procedure" where we don't use ssh -A: there are no precious files to steal on the remote side; an attacker there may try to get the less valuable access to the agent.
However, our situation is less secure than a case of regular ssh -A from local to remote, because a remote attacker may try to peek at the memory of the running agent; the keys are there. This angle of attack is not available to them in the case of a regular ssh -A, where the agent runs on your local computer. In other words: our procedure ("easy" or "normal") does copy your key(s) to the remote system, but at least only to the memory of the remote ssh-agent process, not to a remote filesystem.
Locally, anyone with access to ssh-agent-socket is able to use the agent, if the tunnel still works, if the remote agent still works, and if they initiate their connection before you remove the socket or the remote symlink. If you are the only user then it's safe. If there are other users then you'd better work (i.e. create ssh-agent-socket) in a directory nobody else has access to. It's true that the ssh-agent-socket created by our ssh -L reports as rw------- (600), but I'm not sure it was created with that mode; I cannot tell whether there was a time window just after its creation when the mode was less restrictive and was only tightened "right away". Such a window could have been an opportunity for an attacker.
Your question is tagged linux. On Linux, permissions on sockets matter; in general they may be irrelevant. See man 7 unix:
On Linux, connecting to a stream socket object requires write permission on that socket; sending a datagram to a datagram socket likewise requires write permission on that socket. POSIX does not make any statement about the effect of the permissions on a socket file, and on some systems (e.g., older BSDs), the socket permissions are ignored. Portable programs should not rely on this feature for security.
Anyway, when there are other local users, a good practice for us is to create a private directory with mktemp -d and work in it. mktemp -d creates a directory inaccessible to others from its very birth. In the "easy procedure" a local socket should be created in a private directory automatically (sshd by itself does the job of mktemp -d), so you don't have to worry.
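In the "normal procedure" that could look like this (a sketch; remember to create ssh-agent-socket inside that directory):
# on local
dir="$(mktemp -d)"     # private directory, inaccessible to others from the start
cd "$dir"
# then proceed with the procedure, creating ssh-agent-socket in here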
All this is irrelevant in case of a local attacker who can bypass file permissions, e.g. a dishonest admin with root access, but it matters for securing against rogue regular users.
On the remote side you're safe from rogue regular users, at least on Linux. Even if you put ssh-agent-symlink in a directory accessible to others, what will matter is the mode of the target file, in our case of the socket created by ssh-agent. This socket should already be secured (on my Debian it gets created in a private temporary directory; ssh-agent by itself does the job of mktemp -d). It seems there are systems where the permissions of the symlink matter. When in doubt, modify the procedure so the remote symlink is created in a private directory; or create the tunnel after ssh-agent and point to the remote socket directly, in which case you won't need a symlink at all.