There are a few timeouts that may terminate your shell session or your SSH connection after it becomes completely idle. Usually you get notified. The fact that your SSH client disconnects only after you try to do something means it doesn't know the connection is broken; it tries to use it and only then considers it terminated.
A probable cause is that some stateful network node between the client and the server "forgot" the connection state because no packets had traveled for a long time. The device considered the connection terminated and freed its resources. E.g. it might be your home router that implements NAT.
Normally a TCP connection is terminated by exchanging and acknowledging FIN packets. This way either end knows the connection is terminated. Also a device in between (like your home router with NAT) that monitors the connection knows it can now forget about it.
But sometimes devices (end devices or intermediate ones) are allowed to treat a connection as terminated, without FIN packets. This is in case one or both ends are physically disconnected, forcefully killed, buggy etc. Data just stops flowing and you don't want to handle the connection forever in an eternal hope that it continues some day. Such "immortal" connections would accumulate and exhaust the resources of the device. Forgetting about them after some timeout is a good thing.
But if your particular connection is completely idle, it may exceed the timeout. Only when you try to send more data later do you discover the connection is broken. Note that if an intermediate device is the culprit then the other end (the server in your case) may still "think" the connection is established.
Even if you could reconfigure the intermediate device and increase its timeout, this is not a real solution, because some timeout is always needed. Raising it may be a part of a solution only if the timeout is insanely low (and I don't suspect it is).
The real solution is to exchange some packets from time to time, so the connection is not completely idle. If you send a packet before the relevant timeout expires, the timeout should be reset.
There are a few ways to make the connection seem busy despite your shell session being idle:
1. TCP keepalive. Please see this answer of mine, the first part of the Server-side story section. Extra notes to address your case better:
- TCPKeepAlive belongs to ssh/sshd configuration on client and server side. This means you can have TCPKeepAlive yes in your ssh_config on the client side or/and in the sshd_config on the server side.
- If your connections already use TCPKeepAlive yes and my hypothesis about an intermediate device is right, then the tcp_keepalive_time is probably too high to prevent the device from timeout. You may consider lowering the parameter.
- Note TCPKeepAlive in ssh/sshd configuration enables the feature for SSH connections but other settings (like tcp_keepalive_time) are system-wide.
The main purpose of this mechanism is to allow the OS to tell whether a connection that seems idle is really idle or not. Renewing timeout(s) of intermediate device(s) is a side effect. I think an intermediate device (like a router implementing NAT) may generate TCP keepalive messages (impersonating real participants of the connection) to check if it can "forget" the connection without consequences. In your case, if such a device is the culprit, it obviously doesn't do it.
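To make the tcp_keepalive_time remark more concrete, here is a minimal sketch for Linux. The helper names (current_keepalive_time, keepalive_fits, ssh_tcp_keepalive) are made up for illustration, and the device timeout you compare against is only a guess, since you rarely know the real value.

```shell
#!/bin/sh
# Sketch for Linux only; other systems expose TCP keepalive timers differently.

# Print the idle time (in seconds) after which the kernel sends the first
# keepalive probe, or "unknown" if the Linux procfs entry is not readable.
current_keepalive_time() {
    f=/proc/sys/net/ipv4/tcp_keepalive_time
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo unknown
    fi
}

# Succeed if a keepalive time ($1, seconds) is shorter than the idle timeout
# of the intermediate device ($2, seconds; an assumption, not a measured value).
keepalive_fits() {
    [ "$1" -lt "$2" ]
}

# Hypothetical wrapper: enable TCP keepalive for a single ssh invocation.
ssh_tcp_keepalive() {
    ssh -o TCPKeepAlive=yes "$@"
}

# Lowering the system-wide value until the next reboot needs root, e.g.:
#   sysctl -w net.ipv4.tcp_keepalive_time=300
```

For example, `keepalive_fits 7200 600` fails: the usual Linux default of 7200 seconds would not help against a device that forgets connections after 10 minutes, while a value of 300 would.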
2. SSH-specific ClientAliveInterval and ServerAliveInterval. The former belongs to sshd_config (on the server) and the latter belongs to ssh_config (on the client). See man 5 sshd_config and man 5 ssh_config for details. Note you can also specify options (that belong to ssh_config) by passing them to ssh in the command line. E.g. this command:
ssh -o ServerAliveInterval=300 user@server
will make ssh request a response from the server after 5 minutes of inactivity.
The main purpose of this mechanism is to allow sshd/ssh to tell whether a connection that seems idle is really idle or not (investigate ClientAliveCountMax and ServerAliveCountMax). Again, renewing timeout(s) of intermediate device(s) is a side effect.
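If you'd rather keep the setting in a file than type -o every time, something like the following sketch may help. The function names and the host pattern are hypothetical; the values 300 and 3 are just the interval from the command above and the documented default of ServerAliveCountMax.

```shell
#!/bin/sh
# Approximate time (seconds) after which ssh gives up on an unresponsive
# server: one interval per unanswered probe.
# $1 = ServerAliveInterval, $2 = ServerAliveCountMax
unresponsive_cutoff() {
    echo $(( $1 * $2 ))
}

# Print an ssh_config stanza for one host.
# $1 = host pattern, $2 = ServerAliveInterval, $3 = ServerAliveCountMax
keepalive_stanza() {
    printf 'Host %s\n' "$1"
    printf '    ServerAliveInterval %s\n' "$2"
    printf '    ServerAliveCountMax %s\n' "$3"
}

# Example: review the output, then add it to ~/.ssh/config by hand.
#   keepalive_stanza server 300 3
# The server-side counterparts (ClientAliveInterval, ClientAliveCountMax)
# go into sshd_config instead.
```

With these numbers `unresponsive_cutoff 300 3` prints 900, i.e. ssh would give up roughly 15 minutes after the server stops answering, while an intermediate device sees a packet every 5 minutes as long as the connection works.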
3. Make sure there's some visible activity in your console. A background script that prints something every few minutes is cumbersome and inelegant, but it will work. You should definitely prefer ServerAliveInterval. I'm mentioning this because
- technically this is a solution;
- if you choose to use tmux on the server then it will update your console every minute (because of the clock in its default status line) and this will be enough to keep the connection established.
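For completeness, the "background script" variant could look roughly like this. It is only a sketch; the function name heartbeat and the message text are made up, and ServerAliveInterval remains the better choice.

```shell
#!/bin/sh
# Print a line every $1 seconds. An optional $2 limits the number of lines;
# 0 (the default) means "keep going until killed".
heartbeat() {
    interval=$1
    count=${2:-0}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        printf 'still here: %s\n' "$(date)"
        i=$((i + 1))
        sleep "$interval"
    done
}

# In the remote shell one might start it in the background:
#   heartbeat 240 &
```

The output will land in the middle of whatever you are typing, which is exactly why I call this approach cumbersome.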
Final notes:
- Mind the main purpose of (1) and (2). If you use either of them on one end only and the connection breaks anyway for some reason, the other end may not notice. Still, a fix at either end should be enough to renew the timeout on an intermediate device while the connection is not broken yet.
- In general any connection may break and it's good to be ready for this; therefore consider tmux on the server anyway.
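One possible way to get into tmux right away is sketched below. The names tmux_attach_cmd and ssh_tmux and the session name main are arbitrary; the sketch assumes tmux is installed on the server and that the session name contains no spaces.

```shell
#!/bin/sh
# Build the remote command: attach to session $1 if it exists, else create it.
tmux_attach_cmd() {
    printf 'tmux new-session -A -s %s\n' "$1"
}

# Hypothetical wrapper: $1 = destination (e.g. user@server),
# $2 = session name (defaults to "main").
# -t forces a terminal allocation, which tmux needs.
ssh_tmux() {
    ssh -t "$1" "$(tmux_attach_cmd "${2:-main}")"
}
```

After a broken connection, running `ssh_tmux user@server` again should bring you back to the same session with your programs still running.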