
On my network I have two servers.

Server1 is running TrueNAS (BSD) with multiple applications running in iocage jails. It's connected to the network with a 3-NIC LAGG.

Server2 is an OpenMediaVault (Debian) installation with multiple applications running in Docker containers. It's connected to the network with a single physical NIC.

Both servers are connected to the same TP-Link managed switch.

I am trying to get a jail on Server1, which runs borgbackup, to make HTTP connections to the HealthChecks container on Server2. From the borgbackup jail's console I am unable to ping the IP address of Server2 or make curl requests to the Docker container, even though the container is up and reachable from everywhere else on the subnet, including from Server1 outside of the jail. If I then go to the console of Server2 and ping the IP of the jail, the first 8-10 pings time out, after which all communication works temporarily. That leads me to believe this is an ARP issue, but I'm unsure how to solve it.

Other possibly unrelated weirdness:

  • The network is a /21, and the two machines sit in different /24 ranges within it (all nodes are configured as /21, so this shouldn't actually matter).
  • Jails have never been able to get IP addresses from the DHCP server (I usually use DHCP reservations for all IP assignment on the network), so I've had to manually assign IPs to all of my jails on Server1.
  • When the connection is not working, pings from Server2 to the jail IP time out. Pings from the jail to Server2 return ping: sendto: Host is down.
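
To back up the claim in the first bullet, here is a minimal sketch using Python's standard `ipaddress` module. The addresses are made up for illustration (the real ones aren't given in this post), and `same_subnet` is just a hypothetical helper name. It shows that two hosts in different /24 ranges still compute the same network when both use a /21 mask, and that a single host with a /24 mask would not.

```python
import ipaddress

# Hypothetical addresses: the real ones are not given in the post.
NETWORK = ipaddress.ip_network("192.168.0.0/21")
jail = ipaddress.ip_interface("192.168.1.50/21")
server2 = ipaddress.ip_interface("192.168.4.20/21")


def same_subnet(a, b):
    """Return True when both interfaces derive the same network
    from their own address and mask."""
    return a.network == b.network


# Dotted-quad form of a /21 mask.
print(NETWORK.netmask)

# Different /24 ranges, same /21 network: both sides treat the
# other as on-link and will ARP for it directly.
print(same_subnet(jail, server2))

# The same jail address with a /24 mask no longer matches, so the
# jail would try to reach Server2 through a gateway instead.
misconfigured = ipaddress.ip_interface("192.168.1.50/24")
print(same_subnet(misconfigured, server2))
```

If every node really is on a /21, this part of the setup is fine and the different /24 ranges are a red herring.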

For the community bot: The main question is why this communication channel fails after what I assume is the ARP cache expiration, and how do I fix it?

EDIT: Disabling VNET on the jails seems to resolve the issue, but I was hoping to be able to keep VNET enabled.

Jason
  • The ping failure message should give you a clue if it is an ARP problem. Does it time out, or a host not found, or some other message? – Ron Maupin Dec 16 '21 at 16:00
  • They are timeouts from Server2 to the jail. Pinging from the jail to Server2 while it's not working gives ping: sendto: Host is down – Jason Dec 16 '21 at 16:09
  • Timeout means you have an ARP entry. A host not found error means you do not have an ARP entry. – Ron Maupin Dec 16 '21 at 16:12
  • If the jails cannot get DHCP or ARP information, then broadcast is broken for the jails. It sounds like they are not on the same network as the DHCP server or the other server, or at least they do not believe they are (different mask). Your mention of `/24` networks actually means nothing except that you may have something with the wrong mask. All masks on the servers and jails need to be the same (`255.255.248.0` for `/21`). – Ron Maupin Dec 16 '21 at 17:03
  • The netmask was the first thing I checked; everything is in the same /21 (255.255.248.0): the DHCP server, the jail, Server1's LAGG, and Server2. Mostly I see this stuff on the LAGG adapter, but not on the server's other adapter (which is on a different VLAN). – Jason Dec 16 '21 at 17:07
  • You have something breaking broadcast from the jails. For example, broadcasts cannot be routed, so they will not cross a layer-3 boundary. – Ron Maupin Dec 16 '21 at 17:09
  • Turning off VNET in the jail configuration and attaching directly to the lagg adapter seems to allow the communication to work. – Jason Dec 16 '21 at 18:20
