I have a working NFS4 setup. The server is called bluebox.lan and it exports:

/mnt bluescreen.lan(rw,no_root_squash,crossmnt)

The client is called bluescreen.lan and it is able to mount bluebox's nfs using:

mount -t nfs4 -o nfsvers=4.2 bluebox.lan:/mnt /bluebox

Now I have added Kerberos to the mix. bluebox is the KDC for the realm "LAN".

I created these principals:

host/bluebox.lan@LAN
mathijs@LAN
mathijs/admin@LAN
nfs/bluebox.lan@LAN
nfs/bluescreen.lan@LAN

Then I added nfs/bluebox.lan to bluebox's keytab and nfs/bluescreen.lan to bluescreen's keytab.

I verified that Kerberos itself works:

$ kinit mathijs@LAN
Password for mathijs@LAN:

$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: mathijs@LAN

Valid starting     Expires            Service principal
29-10-19 10:20:07  30-10-19 10:20:07  krbtgt/LAN@LAN

Then I changed /etc/exports on bluebox (only added ,sec=krb5):

/mnt bluescreen.lan(rw,no_root_squash,crossmnt,sec=krb5)

I restarted the services involved: nscd, rpcbind, rpc-statd and rpc-gssd on both machines, plus nfs-server on bluebox.

Now I try mounting on bluescreen:

mount -t nfs4 -o nfsvers=4.2,sec=krb5 bluebox.lan:/mnt /bluebox -vvvvvv
mount.nfs4: timeout set for Tue Oct 29 10:24:54 2019
mount.nfs4: trying text-based options 'nfsvers=4.2,sec=krb5,addr=192.168.22.2,clientaddr=192.168.22.5'
mount.nfs4: mount(2): Permission denied
mount.nfs4: access denied by server while mounting bluebox.lan:/mnt

I get no messages in the journals of either machine.

However, I do notice that Kerberos worked in the background to arrange tickets. On bluescreen (the client), /tmp/krb5ccmachine_LAN appeared, containing:

# klist -e /tmp/krb5ccmachine_LAN
Ticket cache: FILE:/tmp/krb5ccmachine_LAN
Default principal: nfs/bluescreen.lan@LAN

Valid starting     Expires            Service principal
29-10-19 10:16:36  30-10-19 10:16:36  krbtgt/LAN@LAN
    Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96 
29-10-19 10:16:36  30-10-19 10:16:36  nfs/bluebox.lan@LAN
    Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96

This seems sane: nfs/bluescreen.lan holds a ticket for nfs/bluebox.lan. So my guess is that the NFS server is somehow not accepting this ticket, or is unable to validate it.
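The same check can be scripted instead of read by eye. Below is a minimal sketch, assuming klist prints tickets in the five-column layout shown above (two date/time pairs followed by the service principal). The function name has_service_ticket is made up for this example and is not part of any Kerberos tool.

```shell
# has_service_ticket PRINCIPAL
# Hypothetical helper: reads `klist` output on stdin and succeeds only if a
# ticket line for PRINCIPAL is present. Assumes ticket lines start with a
# digit (the date) and carry the service principal in the fifth field.
has_service_ticket() {
    awk -v want="$1" '
        $1 ~ /^[0-9]/ && NF >= 5 && $5 == want { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example use on the client (commented out so the block only defines the helper):
# klist /tmp/krb5ccmachine_LAN | has_service_ticket nfs/bluebox.lan@LAN && echo "ticket present"
```

A match only shows that the KDC issued the ticket. It says nothing about whether the server can decrypt it with the key in its own keytab.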

Furthermore:

  • hostname -f shows both machines know their full names (including .lan).
  • getent hosts <name-of-other-machine> resolves to the correct IP address.
  • getent hosts <ip-address-of-other-machine> resolves to the correct hostname.
  • I'm running systemd-timesyncd on both machines, so their clocks are in sync.
  • I'm running NixOS 19.09
  • kernel 5.3.7 on both hosts
  • nfs-utils 2.4.1
  • mit kerberos 1.17
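
The forward and reverse lookups in the list above can be combined into one round-trip check, which matters here because the service principal names are built from the hostnames. This is a rough sketch, assuming `getent hosts` prints the address in the first column and the canonical name in the second. The function name check_fwd_rev and the LOOKUP override are inventions for this example.

```shell
# check_fwd_rev NAME
# Hypothetical helper: resolve NAME to an address, resolve that address back,
# and succeed only if the reverse lookup returns exactly NAME again.
# LOOKUP is an illustrative hook so the lookup command can be swapped out;
# it defaults to `getent hosts`.
check_fwd_rev() {
    name=$1
    lookup=${LOOKUP:-getent hosts}
    # First column of the first result line is the address.
    addr=$($lookup "$name" | awk 'NR == 1 { print $1 }')
    [ -n "$addr" ] || { echo "no address for $name" >&2; return 1; }
    # Second column of the reverse lookup is the canonical hostname.
    back=$($lookup "$addr" | awk 'NR == 1 { print $2 }')
    [ "$back" = "$name" ] || { echo "$name -> $addr -> ${back:-nothing}" >&2; return 1; }
    echo "$name <-> $addr"
}

# Example use (commented out so the block only defines the helper):
# check_fwd_rev bluebox.lan      # run on bluescreen
# check_fwd_rev bluescreen.lan   # run on bluebox
```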

Now my question: how can I debug this further? I experimented with the rpcdebug command to get some logging into the journal on both ends (rpc, nfs and nfsd components), but it is a lot to go through and no clear errors stand out.
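
For reference, the rpcdebug toggling described above can be wrapped in a small helper so that logging is only enabled around a single mount attempt, which keeps the journal output manageable. This is a sketch, not the exact commands I ran: the function name nfs_debug is made up, and it assumes rpcdebug is run as root with its standard -m (module), -s (set) and -c (clear) options.

```shell
# nfs_debug on|off
# Hypothetical helper: set or clear all debug flags for the rpc, nfs and nfsd
# kernel modules via rpcdebug. Must be run as root.
nfs_debug() {
    case $1 in
        on)  flag=-s ;;
        off) flag=-c ;;
        *)   echo "usage: nfs_debug on|off" >&2; return 2 ;;
    esac
    for module in rpc nfs nfsd; do
        rpcdebug -m "$module" "$flag" all || return 1
    done
}

# Example use (commented out so the block only defines the helper):
# nfs_debug on
# mount -t nfs4 -o nfsvers=4.2,sec=krb5 bluebox.lan:/mnt /bluebox
# nfs_debug off
# journalctl -k --since "5 minutes ago"
```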

Edit 2019-11-03: I found out that the NixOS NFS server is to blame. A NixOS client with a Debian 10 server works fine. A Debian client with the NixOS server has the same issue as the NixOS client. No solution yet, though. I am tagging this with NixOS, as it is clearly related to my issue.

Mathijs Kwik