
I'm creating and destroying virtual machines all the time, in order to test various services or applications, and so I'd like to use avahi to connect to them by their names so I don't have to use valuable space in my head for dynamic IP addresses that will likely be gone tomorrow anyway. This doesn't always seem to work.

I currently have two CentOS 6.3 virtual machines, both running avahi-daemon, but one of them can't be reached by its name.

The problem machine:

error@underground ~ $ ssh nagios.local
ssh: Could not resolve hostname nagios.local: Name or service not known

The working machine:

error@underground ~ $ ssh puppet.local
error@puppet.local's password: 

Yet I can see both machines on the network (underground is the host machine from which I'm working):

error@underground ~ $ avahi-browse -at
+    br0 IPv4 puppet                                        SSH Remote Terminal  local
+    br0 IPv4 nagios                                        SSH Remote Terminal  local
+    br0 IPv4 puppet [52:54:00:d0:31:c7]                    Workstation          local
+    br0 IPv4 nagios [52:54:00:93:ec:af]                    Workstation          local
+    br0 IPv4 underground [6c:62:6d:d1:df:ad]               Workstation          local
+ virbr0 IPv4 underground [52:54:00:8e:60:30]               Workstation          local

Based on feedback, the output from getent hosts:

error@underground ~ $ getent hosts nagios.local
error@underground ~ $ getent hosts puppet.local
192.168.12.146  puppet.local
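
For anyone who wants to repeat this check from a script, here is a minimal sketch. It assumes a glibc-based Linux host, where Python's `socket.getaddrinfo` goes through the same NSS configuration that `getent hosts` consults. The `resolve` helper is a hypothetical name for illustration, not part of Avahi or nss-mdns.

```python
import socket
import sys


def resolve(name):
    """Return the sorted, de-duplicated addresses the system resolver gives for name.

    An empty list means the lookup failed, which corresponds to
    `getent hosts` printing nothing.
    """
    try:
        # Restrict to TCP so each address is reported once per family.
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Names to check are taken from the command line,
    # e.g. nagios.local puppet.local
    for name in sys.argv[1:]:
        addresses = resolve(name)
        print(name, "->", ", ".join(addresses) if addresses else "not resolved")
```

An empty result for a name that `avahi-browse` can see suggests the problem lies in the resolver configuration on the machine doing the lookup rather than in the daemon on the target.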

On nagios.local, the unreachable virtual machine, avahi-daemon is (obviously) installed and running, and I have the proper hole punched in the firewall:

 pkts bytes target     prot opt in     out     source               destination
   74 15950 ACCEPT     udp  --  *      *       0.0.0.0/0            224.0.0.251         state NEW udp dpt:5353 

Syslog on nagios.local gives me absolutely no clue what might be happening:

Jul 18 04:24:18 nagios avahi-daemon[1384]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.12.132.
Jul 18 04:24:18 nagios avahi-daemon[1476]: Found user 'avahi' (UID 70) and group 'avahi' (GID 70).
Jul 18 04:24:18 nagios avahi-daemon[1476]: Successfully dropped root privileges.
Jul 18 04:24:18 nagios avahi-daemon[1476]: avahi-daemon 0.6.25 starting up.
Jul 18 04:24:18 nagios avahi-daemon[1476]: WARNING: No NSS support for mDNS detected, consider installing nss-mdns!
Jul 18 04:24:18 nagios avahi-daemon[1476]: Successfully called chroot().
Jul 18 04:24:18 nagios avahi-daemon[1476]: Successfully dropped remaining capabilities.
Jul 18 04:24:18 nagios avahi-daemon[1476]: Loading service file /services/ssh.service.
Jul 18 04:24:18 nagios avahi-daemon[1476]: Joining mDNS multicast group on interface eth0.IPv4 with address 192.168.12.132.
Jul 18 04:24:18 nagios avahi-daemon[1476]: New relevant interface eth0.IPv4 for mDNS.
Jul 18 04:24:18 nagios avahi-daemon[1476]: Network interface enumeration completed.
Jul 18 04:24:18 nagios avahi-daemon[1476]: Registering new address record for 2001:db8:1600:80bf:5054:ff:fe93:ecaf on eth0.*.
Jul 18 04:24:18 nagios avahi-daemon[1476]: Registering new address record for 192.168.12.132 on eth0.IPv4.
Jul 18 04:24:18 nagios avahi-daemon[1476]: Registering HINFO record with values 'X86_64'/'LINUX'.
Jul 18 04:24:19 nagios avahi-daemon[1476]: Server startup complete. Host name is nagios.local. Local service cookie is 3129794608.
Jul 18 04:24:19 nagios avahi-daemon[1476]: Service "nagios" (/services/ssh.service) successfully established.

The primary difference between these two installations is that puppet.local was installed as a "Desktop" installation, while nagios.local was installed as a "Minimal" installation and had the various avahi-related packages installed later.

I'm at a loss to figure out why I can't resolve this machine's name. What completely obvious thing did I miss?

Update: Based on mgorven's recommendation, I checked the host again and found that it didn't have nss-mdns installed. So I installed it, and now the problem is exactly reversed! As seen from the host:

error@underground ~ $ getent hosts puppet.local
error@underground ~ $ getent hosts nagios.local
192.168.12.132  nagios.local
Michael Hampton
  • What does a packet capture tell you? – womble Jul 18 '12 at 04:45
  • Is the problem a specific name which can't be resolved from anywhere, or a specific machine which can't resolve any names? Check whether machines can resolve their own name. (You can test with `getent hosts puppet.local` instead BTW.) – mgorven Jul 18 '12 at 04:45
  • For those of you playing at home, I ultimately gave up on Avahi as unworkable and went to FreeIPA. More work to set up, but everything Just Works. – Michael Hampton Aug 27 '13 at 05:19

1 Answer


My guess is that the NSS library is not configured to consult mDNS when looking up hostnames, so lookups made by programs fail even though Avahi itself is picking up the name. Check that the nss-mdns package is installed (it seems to only be available in EPEL, not CentOS itself), and that the hosts line in /etc/nsswitch.conf contains the mdns4 (or mdns4_minimal) database. It should look something like this:

hosts:      files mdns4_minimal [NOTFOUND=return] dns

You can test hostname lookups with getent hosts <hostname>.
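
If you want to check the nsswitch.conf part from a script, the following is a rough sketch rather than a definitive tool. The helper names `hosts_sources` and `has_mdns` are made up for illustration, and the set of module names is an assumption based on the variants nss-mdns is commonly configured with.

```python
import re
from pathlib import Path

# Assumption: these are the source names nss-mdns is usually referenced by.
MDNS_MODULES = {
    "mdns", "mdns4", "mdns6",
    "mdns_minimal", "mdns4_minimal", "mdns6_minimal",
}


def hosts_sources(nsswitch_text):
    """Return the source names on the 'hosts:' line, in lookup order."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if line.startswith("hosts:"):
            rest = line[len("hosts:"):]
            # Remove action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


def has_mdns(nsswitch_text):
    """True if any mDNS source appears on the hosts line."""
    return any(source in MDNS_MODULES for source in hosts_sources(nsswitch_text))


if __name__ == "__main__":
    path = Path("/etc/nsswitch.conf")
    try:
        text = path.read_text()
    except OSError as exc:
        print(f"could not read {path}: {exc}")
    else:
        print("hosts sources:", hosts_sources(text))
        print("mDNS configured:", has_mdns(text))
```

This only checks the configuration file. It does not confirm that the nss-mdns library is actually installed, so `getent hosts` remains the real test.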

mgorven
  • Neither VM has the `nss-mdns` package installed. My host underground does, though, and that line already exists in the host's `/etc/nsswitch.conf`. I also added output from `getent hosts` to the question. – Michael Hampton Jul 18 '12 at 07:52
  • Installing `nss-mdns` on the host machine resolved the issue, but only after it was restarted (for a kernel update, not for this issue). – Michael Hampton Jul 23 '12 at 17:45