There are a number of posts across the Stack Exchange sites relating to the issue in the title, but none seems to offer a particularly good solution. This post is, I guess, a shameless attempt to bump the issue (some of the others are a few years old now) and see if anyone has a good solution to this yet. I'm thinking maybe a proven combination of options supplied to autofs, as opposed to some workaround/monitoring script.
I just raised this as an issue against SSHFS, but there are apparently no current maintainers ...
To echo the content of the bug report, as I know people prefer not to rely on possibly ephemeral external links:
--------- BEGIN PASTE ---------
I suspect this issue is known/understood, but I can't find an open issue for it.
This has been a problem for me for some time now, and chasing threads on Stack Overflow and the like (search sshfs "Transport endpoint is not connected") suggests to me this is a years-old issue that has never been addressed. Whether the fix would be in sshfs or elsewhere I don't know, but I think it's fairly reasonable to expect a reconnection attempt to be made if the reconnect option has been passed and the mount point is no longer available.
The problem appears to be when the mount is still listed under mount, but there is no process for it under sshfs, e.g.:
Note that while I've changed some users/paths for anonymity, the issue is otherwise exactly as described, i.e. both mount points, the one currently working and the one currently not, are adjacent on both server and client (so you might expect them to either work or fail together, but they don't).
Autofs config:
cat /etc/auto.sshfs
data1 -fstype=fuse,port=22,rw,nodev,nonempty,noatime,allow_other,max_read=65536,reconnect,workaround=all,ServerAliveInterval=15,ServerAliveCountMax=3 :sshfs\#user@server1\:/storage/data1
data2 -fstype=fuse,port=22,rw,nodev,nonempty,noatime,allow_other,max_read=65536,reconnect,workaround=all,ServerAliveInterval=15,ServerAliveCountMax=3 :sshfs\#user@server1\:/storage/data2
Output of mount command:
mount | grep server1
user@server1:/storage/data1 on /mnt/sshfs/data1 type fuse.sshfs (rw,nodev,noatime,user_id=0,group_id=0,allow_other)
user@server1:/storage/data2 on /mnt/sshfs/data2 type fuse.sshfs (rw,nodev,noatime,user_id=0,group_id=0,allow_other)
Autofs status (note absence of data2 mount point specified above):
user@serverx:/home/user# systemctl status autofs
● autofs.service - Automounts filesystems on demand
Loaded: loaded (/lib/systemd/system/autofs.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2019-10-23 14:46:52 UTC; 2 months 13 days ago
Main PID: 8524 (automount)
Tasks: 17 (limit: 4915)
Memory: 8.8M
CPU: 18h 19min 6.124s
CGroup: /system.slice/autofs.service
├─8524 /usr/sbin/automount --pid-file /var/run/autofs.pid
├─8718 ssh -X -a -oClearAllForwardings=yes -oport=22 -oServerAliveInterval=15 -oServerAliveCountMax=3 -2 user@server1 -s sftp
└─8755 sshfs user@server1:/storage/data1 /mnt/sshfs/data1 -o rw,nodev,noatime,uid=0,gid=0,port=22,nonempty,allow_other,max_read=65536,reconnect,workaround=[truncated by terminal]
Now, if I umount /mnt/sshfs/data2, then attempt an ls of the same directory, the volume is automatically remounted. But without the umount, ls (here of the parent directory) yields:
user@serverx:/home/user# ls -l /mnt/sshfs/
ls: cannot access '/mnt/sshfs/data2': Transport endpoint is not connected
total 4
drwxr-xr-x 1 root root 4096 Jul 22 11:46 data1
d????????? ? ? ? ? ? data2
So, why is the volume not remounted automatically, when attempts to access it are returning "not connected" errors?
--------- END PASTE ---------
EDIT:
It occurs to me I'm needlessly creating two connections here, when I could in fact just mount the parent directory /storage/. But as far as I can tell it's not doing any harm (I in fact just mounted a third directory and they're all working fine). Either way, would there perhaps be any advantage to having all traffic go through one parent mount point? Is there any likelihood of conflict between the two (or more) ssh processes (they're command-line identical: ssh -X -a -oClearAllForwardings=yes -oport=22 -oServerAliveInterval=15 -oServerAliveCountMax=3 -2 user@server1 -s sftp)?
EDIT:
I've just done a bit of experimenting, killing various elements on client/server, and the only way I can reproduce the problem I'm occasionally seeing is by killing sshfs on the client:
Kill ssh session on server: mount recovers
Kill sftp-server on server: mount recovers
Kill ssh session on client: mount recovers
Kill sshfs on client: have to manually umount, after which it again mounts on demand (as problem originally described)
So, without concrete evidence I'll suggest sshfs itself is occasionally exiting. Should autofs handle this?
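This isn't the autofs-level fix I'm after, but for reference the failure state above (mount still listed, access failing with "Transport endpoint is not connected") can at least be detected from the shell. A minimal sketch, with some assumptions: the function names are made up for illustration, any stat failure on a listed mount point is treated as "stale", and the mount table is in /proc/mounts format (where spaces in paths are escaped, which this sketch does not handle). The optional lazy unmount needs root and just automates the manual umount described above.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of sshfs or autofs.

# Print the mount point of every fuse.sshfs entry in a mount table.
# $1: mount table file (defaults to /proc/mounts).
list_sshfs_mounts() {
    # /proc/mounts fields: device mountpoint fstype options dump pass
    awk '$3 == "fuse.sshfs" { print $2 }' "${1:-/proc/mounts}"
}

# Succeed if stat fails on the given path. Assumption: for a path that is
# listed as mounted, a stat failure means the sshfs process behind it is gone.
is_stale() {
    ! stat -- "$1" >/dev/null 2>&1
}

# Report stale sshfs mounts. With --fix as $1, also lazily unmount them so
# that autofs can mount them again on the next access.
# $2: mount table file (defaults to /proc/mounts).
check_sshfs_mounts() {
    list_sshfs_mounts "${2:-/proc/mounts}" | while read -r mp; do
        if is_stale "$mp"; then
            echo "stale: $mp"
            if [ "$1" = "--fix" ]; then
                umount -l "$mp"
            fi
        fi
    done
}
```

Running check_sshfs_mounts with no arguments only reports; check_sshfs_mounts --fix also unmounts. I'd still much rather autofs or sshfs handled this themselves.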