I have a working NFSv4 setup. The server is called bluebox.lan and it exports:
/mnt bluescreen.lan(rw,no_root_squash,crossmnt)
The client is called bluescreen.lan and it is able to mount bluebox's NFS export using:
mount -t nfs4 -o nfsvers=4.2 bluebox.lan:/mnt /bluebox
Now I added Kerberos to the mix. bluebox is the KDC for realm "LAN". I created these principals:
host/bluebox.lan@LAN
mathijs@LAN
mathijs/admin@LAN
nfs/bluebox.lan@LAN
nfs/bluescreen.lan@LAN
Then I added nfs/bluebox.lan to bluebox's keytab and nfs/bluescreen.lan to bluescreen's keytab.
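One way to double-check the keytab step is sketched below. It assumes the keytab lives at the MIT default path /etc/krb5.keytab (not stated above) and that `klist -k` prints the KVNO in the first column and the principal in the second. The function name `keytab_has_principal` is made up for illustration.

```shell
# Succeeds if the `klist -k` listing on stdin contains principal $1.
# Assumes MIT klist layout: KVNO in column 1, principal in column 2.
keytab_has_principal() {
    awk -v want="$1" '$2 == want { found = 1 } END { exit !found }'
}

# Usage (as root, on the machine that owns the keytab; path is an assumption):
#   klist -k /etc/krb5.keytab | keytab_has_principal nfs/bluebox.lan@LAN \
#       && echo "present" || echo "missing"
```

Running the same check with nfs/bluescreen.lan@LAN on the client covers the other keytab.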
I validated kerberos itself works:
$ kinit mathijs@LAN
Password for mathijs@LAN:
$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: mathijs@LAN
Valid starting Expires Service principal
29-10-19 10:20:07 30-10-19 10:20:07 krbtgt/LAN@LAN
Then I changed /etc/exports on bluebox (only added ,sec=krb5):
/mnt bluescreen.lan(rw,no_root_squash,crossmnt,sec=krb5)
I restarted the involved services (nscd, rpcbind, rpc-statd and rpc-gssd on both machines, plus nfs-server on bluebox).
Now I try mounting on bluescreen:
mount -t nfs4 -o nfsvers=4.2,sec=krb5 bluebox.lan:/mnt /bluebox -vvvvvv
mount.nfs4: timeout set for Tue Oct 29 10:24:54 2019
mount.nfs4: trying text-based options 'nfsvers=4.2,sec=krb5,addr=192.168.22.2,clientaddr=192.168.22.5'
mount.nfs4: mount(2): Permission denied
mount.nfs4: access denied by server while mounting bluebox.lan:/mnt
I get no messages in the journals of either machine.
However, I do notice that Kerberos worked in the background to arrange tickets. On bluescreen (the client) /tmp/krb5ccmachine_LAN appeared, containing:
# klist -e /tmp/krb5ccmachine_LAN
Ticket cache: FILE:/tmp/krb5ccmachine_LAN
Default principal: nfs/bluescreen.lan@LAN
Valid starting Expires Service principal
29-10-19 10:16:36 30-10-19 10:16:36 krbtgt/LAN@LAN
Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
29-10-19 10:16:36 30-10-19 10:16:36 nfs/bluebox.lan@LAN
Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
which seems sane: a ticket for bluescreen.lan to access bluebox.lan. So my guess is that the NFS server is somehow not accepting this ticket, or is not able to validate it.
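One thing the `klist -e` listing above does not show is the key version number (KVNO) of the service ticket. A mismatch between that number and the key stored in the server's keytab is a common reason for a server to reject an otherwise valid ticket, so it may be worth ruling out. The sketch below is only a possible check, not a diagnosis. It assumes MIT `kvno` prints a line of the form `<principal>: kvno = N`, that the server keytab is /etc/krb5.keytab, and the helper names are invented here.

```shell
# Print the number from `kvno <principal>` output on stdin,
# assumed to look like "nfs/bluebox.lan@LAN: kvno = 2".
ticket_kvno() {
    sed -n 's/.*kvno = \([0-9][0-9]*\).*/\1/p'
}

# Print the distinct KVNOs stored for principal $1 in a `klist -k`
# listing on stdin (KVNO assumed in column 1, principal in column 2).
keytab_kvnos() {
    awk -v want="$1" '$2 == want { print $1 }' | sort -un
}

# Succeeds if ticket KVNO $1 appears in the newline-separated list $2.
kvno_in_keytab() {
    printf '%s\n' "$2" | grep -qx "$1"
}

# Usage (keytab path is an assumption):
#   on bluescreen:  kvno nfs/bluebox.lan@LAN | ticket_kvno
#   on bluebox:     klist -k /etc/krb5.keytab | keytab_kvnos nfs/bluebox.lan@LAN
#   then compare:   kvno_in_keytab <ticket-kvno> "<keytab-kvnos>"
```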
Furthermore:
- hostname -f shows both machines know their full names (including .lan).
- getent <name-of-other-machine> resolves to the correct IP address.
- getent <ip-address-of-other-machine> resolves to the correct hostname.
- I'm running systemd-timesyncd on both machines, so their clocks are in sync.
- I'm running NixOS 19.09
- kernel 5.3.7 on both hosts
- nfs-utils 2.4.1
- mit kerberos 1.17
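The name-resolution checks in the list above can be scripted so they are easy to repeat on both hosts. This matters because MIT Kerberos may canonicalize host names through forward and reverse lookups when building service principal names. The sketch below assumes `getent hosts` is the lookup in use. The function names `lookup` and `fwd_rev_consistent` are hypothetical.

```shell
# Resolve a name or address; prints "ADDRESS NAME..." like `getent hosts`.
# Kept as a separate function so another resolver can be substituted.
lookup() {
    getent hosts "$1"
}

# Succeeds if NAME resolves to an address whose reverse lookup
# returns exactly NAME as the first (canonical) host name.
fwd_rev_consistent() {
    name=$1
    addr=$(lookup "$name" | awk 'NR == 1 { print $1 }')
    [ -n "$addr" ] || return 1
    back=$(lookup "$addr" | awk 'NR == 1 { print $2 }')
    [ "$back" = "$name" ]
}

# Usage, on each machine for the other one's name:
#   fwd_rev_consistent bluebox.lan && echo consistent || echo MISMATCH
```

The comparison is deliberately strict: a reverse lookup that returns the short name first (for example bluebox instead of bluebox.lan) counts as a mismatch.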
Now my question:
How should I debug this any further?
I played a bit with the rpcdebug command to get some logging flowing into the journal on both ends (rpc, nfs, nfsd components), but it's a lot to go through, with no clear errors popping out.
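To cut down the rpcdebug noise, one option is to enable only the `auth` flag instead of everything, and then filter the kernel log for authentication-related lines. The sketch below is a rough starting point under those assumptions. The keyword list in the filter is a guess rather than a known log format, and `filter_auth_lines` is a name invented here.

```shell
# Keep only log lines on stdin that mention GSS, Kerberos or auth handling.
# The keyword list is a guess; widen or narrow it as needed.
# `|| true` keeps an empty result from counting as a failure.
filter_auth_lines() {
    grep -Ei 'gss|krb5|kerberos|auth' || true
}

# Usage (as root; rpcdebug comes with nfs-utils):
#   rpcdebug -m rpc  -s auth      # sunrpc auth debugging on, both hosts
#   rpcdebug -m nfsd -s auth      # NFS server auth debugging on, bluebox only
#   ... retry the mount on bluescreen ...
#   journalctl -k --since "5min ago" | filter_auth_lines
#   rpcdebug -m rpc  -c auth      # switch the flags off again
#   rpcdebug -m nfsd -c auth
```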
Edit 2019-11-03: Found out that the NixOS NFS server is to blame. A NixOS client with a Debian 10 server works fine. A Debian client with the NixOS server has the same issue as the NixOS client. No solution yet, though. Tagging with NixOS now, as it clearly has something to do with my issue.