We want to mount external storage that has been provided to us. We use something like the following in our /etc/fstab file on Ubuntu 18.04.

//external/storage /mounting/point cifs noperm,cred=/home/user/.smbcredentials,domain=WORK,iocharset=utf8,vers=3.0,sec=ntlmv2i,uid=user,gid=WORKGROUP,dir_mode=0770,file_mode=0770 0 0

And .smbcredentials

user=user
password=pass

Unfortunately, we are running into mounting issues where the shares sometimes become inaccessible. Strangely enough, we can still access them with smbclient and even mount them on other operating systems (macOS, Windows). After we asked the storage's sysadmin for feedback, we were told that this is caused by how the datacenter is set up: the IPs can change dynamically. We were also told that this is not a problem on Windows or macOS, but that on Linux it causes far-reaching problems because the IP of the remote storage is cached. Thus, if the IP of its host changes, the client can no longer find the share because the cached IP is incorrect.

My question then is: how do we deal with our setup? We were advised to access the shares only when we need them, using smbclient, and never actually mount them. I definitely do not want to go in this direction because we use that remote storage as the data storage for running program tasks, so it should be available at all times. Ideally, I am looking for a way to disable the IP caching altogether, but other suggestions are welcome, too. At the moment my eye is on autofs, though I have no experience with it and I am not sure whether it also caches the IP of the shares it should connect to.

PS: It also seems odd to me that, if the sysadmin's analysis is correct, a share becomes unavailable after the IP change (Host is down) and yet unmounting and remounting does not fix it: sudo mount -av just hangs.

(Originally asked over at Ask Ubuntu, but it seems a better fit here.)

  • The IPs of a server can change dynamically?! That is absurd, and it is the problem that first needs to be fixed. It's probably causing all sorts of other issues that people have had to attempt to work around or just live with. – Michael Hampton Jul 30 '20 at 15:14
  • @MichaelHampton Maybe I worded this incorrectly, probably because I am not very knowledgeable about the server stuff. The reason they gave is (I quote/translate): "the backend of the shared storage is dynamic, meaning that the active server of a given storage may change. Also: GNS is ufiler where we work with two sets of IPs which can change freely between frontend servers. And finally: Ctdb also may change IPs but this should only happen every three months or so due to high load (e.g. stuck puppet run)." – Bram Vanroy Jul 30 '20 at 15:25
  • OK, that makes a bit more sense. It still smells a little funny to me, but I'll have to let someone more experienced with such storage comment on it. – Michael Hampton Jul 30 '20 at 15:40
  • @MichaelHampton Thanks. For now I have mounted using autofs with a timeout of 15 mins. The future will show whether that is helpful. I am still open to other or better approaches. – Bram Vanroy Jul 30 '20 at 21:09

1 Answer

If I understand the setup correctly, you refer to your storage host by name and the IP for that hostname changes frequently.

The name-to-IP mapping is a function of DNS, so DNS is where you need to address the caching. I would suggest flushing your cache frequently to resolve this problem, though this can affect other services on the host and does seem a little crazy.
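Before changing anything, it may help to confirm that a stale address is really involved. The sketch below is only an illustration: it assumes the share is mounted with the kernel CIFS client, which lists the server address as an addr= option in /proc/mounts. The function name mounted_addr is hypothetical, and /mounting/point and external are the placeholder names from the fstab line in the question.

```shell
#!/bin/sh
# mounted_addr (hypothetical helper): print the addr= value recorded for a
# CIFS mount. Reads mount-table lines in /proc/mounts format on stdin.
mounted_addr() {
    # $1 = mount point, e.g. /mounting/point (placeholder from the question)
    awk -v mp="$1" '$2 == mp && $3 == "cifs" {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^addr=/) { print substr(opts[i], 6); exit }
    }'
}

# Example use (left commented out so the block only defines the function):
#   mounted_addr /mounting/point < /proc/mounts   # address the mount uses
#   getent hosts external                         # address the name resolves to now
```

If the two addresses differ, that would be consistent with the sysadmin's explanation; if they match, the cause is probably elsewhere.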

First, check the statistics for your cache:

sudo systemd-resolve --statistics

You can flush your caches with this command:

sudo systemd-resolve --flush-caches

Then check the statistics again to be sure the cache is flushed.

If that works, I would add the flush command to the script that mounts the drive. This way the script will first flush the cache, then query DNS for the current IP when it attempts to mount the drive.
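A minimal sketch of such a script might look like the following. It is not a tested fix for this particular storage setup. The names remount_share, run, DRY_RUN and MOUNT_POINT are hypothetical, the mount point is the placeholder from the question, and the lazy unmount (umount -l) is an added assumption intended to avoid blocking on a server that no longer answers.

```shell
#!/bin/sh
# Sketch only: flush the resolver cache, then remount the share.
# MOUNT_POINT defaults to the placeholder path from the question's fstab line.
MOUNT_POINT="${MOUNT_POINT:-/mounting/point}"

# run: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

remount_share() {
    # Drop cached DNS answers so the next lookup goes back to the DNS server.
    run systemd-resolve --flush-caches || return 1
    # Assumption: lazy unmount, since the old server may be unreachable.
    # A failure here (for example, nothing mounted) is ignored.
    run umount -l "$MOUNT_POINT" || true
    # Mount again using the options from the matching /etc/fstab entry.
    run mount "$MOUNT_POINT"
}
```

Calling remount_share as root performs the three steps. With DRY_RUN=1 set, it only prints the commands, which is a safer way to try it first.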

Dre
  • Thanks for your reply. What are the possible downsides of this? – Bram Vanroy Aug 04 '20 at 18:47
  • The main concern with this is that when flushing out the DNS cache it is all or nothing, and other services on the host may be expecting those other entries. Now, it should really only take the system a few milliseconds to query and repopulate the cache with any needed entries, so it is likely you would not see a problem with this at all. – Dre Aug 04 '20 at 20:17