I want to mount an NFS4 share, but with Kerberos security enabled. This is my setup:

  • Debian Server (dns fqdn: nfsv4test.subnet.example.org)

  • Debian Client (dns fqdn: nfsv4client.subnet.example.org)

  • Windows ADC, acts also as KDC

  • My realm is REALM.EXAMPLE.ORG

  • The subnet in which both Debian machines are located is called subnet.example.org

  • There is no NAT going on.

  • Both machines are up-to-date.

As I'm still struggling with Kerberos, this is how I tried to achieve my goal:

Chapter I: Setup

1- Put both machines in the same Realm/Domain (This has already been set up by others and works)

2- Created two users (users, not computers!) per machine: nfs-nfsv4client, host-nfsv4client, nfs-nfsv4test and host-nfsv4test. After creating them, I enabled AES 256-bit encryption for all of the accounts.

3- Set a service principal for the users:

setspn -S nfs/nfsv4test.realm.example.org@REALM.EXAMPLE.ORG nfs-nfsv4test

I did this for all 4 users/principals.

3- Created the keytabs on the Windows KDC:

ktpass -princ host/nfsv4test.realm.example.org@REALM.EXAMPLE.ORG +rndPass -mapuser host-nfsv4test@REALM.EXAMPLE.ORG -pType KRB5_NT_PRINCIPAL -out c:\temp\host-nfsv4test.keytab -crypto AES256-SHA1

So after that I had 4 keytabs.

4- Merged the keytabs on the server (and client):

ktutil  
read_kt host-nfsv4test.keytab   
read_kt nfs-nfsv4test.keytab    
write_kt /etc/krb5.keytab

The file has 640 permissions.

5- Exported the directories on the server; this already worked without Kerberos. With Kerberos enabled, the exports file looks like this:

/srv/kerbnfs4 gss/krb5(rw,sync,fsid=0,crossmnt,no_subtree_check,insecure)
/srv/kerbnfs4/homes gss/krb5(rw,sync,no_subtree_check,insecure)

Running exportfs -rav works:

root@nfsv4test:~# exportfs -rav
exporting gss/krb5:/srv/kerbnfs4/homes
exporting gss/krb5:/srv/kerbnfs4

...and on the client I can view the exports of the server:

root@nfsv4client:~# showmount -e nfsv4test.subnet.example.org
Export list for nfsv4test.subnet.example.org:
/srv/kerbnfs4/homes gss/krb5
/srv/kerbnfs4       gss/krb5

6a- The krb5.conf has the default config for the environment it was set up for, and I haven't changed anything:

[libdefaults]
    ticket_lifetime = 24000
    default_realm = REALM.EXAMPLE.ORG
    default_tgs_entypes = rc4-hmac des-cbc-md5
    default_tkt__enctypes = rc4-hmac des-cbc-md5
    permitted_enctypes = rc4-hmac des-cbc-md5
    dns_lookup_realm = true
    dns_lookup_kdc = true
    dns_fallback = yes

# The following krb5.conf variables are only for MIT Kerberos.
    kdc_timesync = 1
    ccache_type = 4
    forwardable = true
    proxiable = true

# The following libdefaults parameters are only for Heimdal Kerberos.
    fcc-mit-ticketflags = true

[realms]
    REALM.EXAMPLE.ORG = {
        kdc = kdc.realm.example.org
        default_domain = kds.realm.example.org
    }

[domain_realm]
    .realm.example.org = KDC.REALM.EXAMPLE.ORG
    realm.example.org = KDC.REALM.EXAMPLE.ORG

[appdefaults]
pam = {
   debug = false
   ticket_lifetime = 36000
   renew_lifetime = 36000
   forwardable = true
   krb4_convert = false
}

6b- Then I set up my sssd.conf like this, but I haven't really understood what's going on here:

[sssd]
domains = realm.example.org
services = nss, pam
config_file_version = 2

[nss]
filter_groups = root
filter_users = root
default_shell = /bin/bash

[pam]
reconnection_retries = 3

[domain/realm.example.org]
krb5_validate = True
krb5_realm = REALM.EXAMPLE.ORG
subdomain_homedir = %o
default_shell = /bin/bash
cache_credentials = True
id_provider = ad
access_provider = ad
chpass_provider = ad
auth_provide = ad
ldap_schema = ad
ad_server = kdc.realm.example.org
ad_hostname = nfsv4test.realm.example.org
ad_domain = realm.example.org
ad_gpo_access_control = permissive
use_fully_qualified_names = False
ad_enable_gc = False

7- idmap.conf on both machines:

[General]

Verbosity = 0
Pipefs-Directory = /run/rpc_pipefs

Domain = realm.example.org

[Mapping]

Nobody-User = nobody
Nobody-Group = nogroup

8- And /etc/default/nfs-common on both machines:

NEED_STATD=yes
NEED_IDMAPD=yes
NEED_GSSD=yes

9- Last but not least, nfs-kernel-server on the server:

RPCNFSDCOUNT=8
RPCNFSDPRIORITY=0
RPCMOUNTDOPTS="--manage-gids --no-nfs-version 3"
NEED_SVCGSSD="yes"
RPCSVCGSSDOPTS=""

10- Then, after rebooting both server and client, I tried to mount the share (as root user):

mount -t nfs4 -o sec=krb5 nfsv4test.subnet.example.org:/srv/kerbnfs4/homes /media/kerbhomes -vvvv 

But sadly, the mount doesn't work; I don't get access. On the first try it takes quite a long time, and this is the output:

root@nfsv4client:~# mount -t nfs4 -o sec=krb5 nfsv4test.subnet.example.org:/srv/kerbnfs4/homes /media/kerbhomes
mount.nfs4: timeout set for Wed Dec 15 15:38:09 2021
mount.nfs4: trying text-based options 'sec=krb5,vers=4.2,addr=********,clientaddr=*******'
mount.nfs4: mount(2): Permission denied
mount.nfs4: access denied by server while mounting nfsv4test.subnet.example.org:/srv/kerbnfs4/homes

Chapter II: Debugging

For a more detailed log, I ran

rpcdebug -m nfsd -s lockd
rpcdebug -m rpc -s call

on the server, but I don't really get many logs.

However, when trying to mount, syslog shows this:

Dec  6 11:20:02 testserver kernel: [ 2088.771800] svc: server 00000000c1c7fb25, pool 0, transport 00000000c5641df0, inuse=2
Dec  6 11:20:02 testserver kernel: [ 2088.771808] svc: svc_authenticate (0)
Dec  6 11:20:02 testserver kernel: [ 2088.771811] svc: calling dispatcher
Dec  6 11:20:02 testserver kernel: [ 2088.771840] svc: server 00000000c1c7fb25, pool 0, transport 00000000c5641df0, inuse=2
Dec  6 11:20:02 testserver kernel: [ 2088.773222] svc: server 00000000c1c7fb25, pool 0, transport 00000000fc9bd395, inuse=2
Dec  6 11:20:02 testserver kernel: [ 2088.774697] svc: server 00000000c1c7fb25, pool 0, transport 00000000fc9bd395, inuse=2
Dec  6 11:20:02 testserver kernel: [ 2088.774705] svc: svc_authenticate (6)
Dec  6 11:20:02 testserver kernel: [ 2088.774711] RPC:       Want update, refage=120, age=0
Dec  6 11:20:02 testserver kernel: [ 2088.774712] svc: svc_process close
[... 7x same message ]
Dec  6 11:20:02 testserver kernel: [ 2088.791514] svc: server 00000000c1c7fb25, pool 0, transport 00000000c5641df0, inuse=2
Dec  6 11:20:02 testserver kernel: [ 2088.791519] svc: svc_authenticate (1)
Dec  6 11:20:02 testserver kernel: [ 2088.791521] svc: authentication failed (1)
Dec  6 11:20:02 testserver kernel: [ 2088.791538] svc: server 00000000c1c7fb25, pool 0, transport 00000000c5641df0, inuse=2
Dec  6 11:20:02 testserver kernel: [ 2088.791913] svc: server 00000000c1c7fb25, pool 0, transport 00000000c5641df0, inuse=2
Dec  6 11:20:02 testserver kernel: [ 2088.791918] svc: svc_authenticate (1)
Dec  6 11:20:02 testserver kernel: [ 2088.791920] svc: authentication failed (1)
Dec  6 11:20:02 testserver kernel: [ 2088.791940] svc: server 00000000c1c7fb25, pool 0, transport 00000000c5641df0, inuse=2
Dec  6 11:20:02 testserver kernel: [ 2088.792292] svc: server 00000000c1c7fb25, pool 0, transport 00000000c5641df0, inuse=2
Dec  6 11:20:02 testserver kernel: [ 2088.792296] svc: svc_authenticate (1)
Dec  6 11:20:02 testserver kernel: [ 2088.792298] svc: authentication failed (1)
Dec  6 11:20:02 testserver kernel: [ 2088.792316] svc: server 00000000c1c7fb25, pool 0, transport 00000000c5641df0, inuse=2

As this didn't really help me at all, I recorded the traffic with tcpdump, which gives me this:

11:12:02.856200 IP ip-client.740 > ip-server.nfs: Flags [S], seq 763536441, win 65160, options [mss 1460,sackOK,TS val 2364952579 ecr 2826266858,nop,wscale 7], length 0
11:12:02.856295 IP ip-server.nfs > ip-client.740: Flags [S.], seq 2444950221, ack 763536442, win 65160, options [mss 1460,sackOK,TS val 2826266858 ecr 2364952579,nop,wscale 7], length 0
11:12:02.856304 IP ip-client.740 > ip-server.nfs: Flags [.], ack 1, win 510, options [nop,nop,TS val 2364952579 ecr 2826266858], length 0
11:12:02.856324 IP ip-client.740 > ip-server.nfs: Flags [P.], seq 1:245, ack 1, win 510, options [nop,nop,TS val 2364952579 ecr 2826266858], length 244: NFS request xid 4035461122 240 getattr fh 0,2/42
11:12:02.856408 IP ip-server.nfs > ip-client.740: Flags [.], ack 245, win 508, options [nop,nop,TS val 2826266858 ecr 2364952579], length 0
11:12:02.856421 IP ip-server.nfs > ip-client.740: Flags [P.], seq 1:25, ack 245, win 508, options [nop,nop,TS val 2826266858 ecr 2364952579], length 24: NFS reply xid 4035461122 reply ERR 20: Auth Bogus Credentials (seal broken)
11:12:02.856425 IP ip-client.740 > ip-server.nfs: Flags [.], ack 25, win 510, options [nop,nop,TS val 2364952579 ecr 2826266858], length 0
11:12:02.867582 IP ip-client.740 > ip-server.nfs: Flags [F.], seq 245, ack 25, win 510, options [nop,nop,TS val 2364952590 ecr 2826266858], length 0
11:12:02.867751 IP ip-server.nfs > ip-client.740: Flags [F.], seq 25, ack 246, win 508, options [nop,nop,TS val 2826266869 ecr 2364952590], length 0
11:12:02.867759 IP ip-client.740 > ip-server.nfs: Flags [.], ack 26, win 510, options [nop,nop,TS val 2364952590 ecr 2826266869], length 0

(I redacted the real ip addresses)

So the interesting part here is the "Auth Bogus Credentials (seal broken)"? Is the seal really broken, or is this just a generic error that appears whenever something is wrong? I couldn't find anything helpful about this error on the web.

So to come back to Kerberos itself, the keytab seems to be ok:

root@nfsv4client:~# klist -k -e
Keytab name: FILE:/etc/krb5.keytab
KVNO Principal
---- --------------------------------------------------------------------------
   7 host/nfsv4client.realm.example.org@REALM.EXAMPLE.ORG
   6 nfs/nfsv4client.realm.example.org@REALM.EXAMPLE.ORG

When trying to test the keytab file, it seems to work:

root@nfsv4client:~# kinit -k nfs/nfsv4client.realm.example.org
root@nfsv4client:~#

But on this page it's stated that the keytab should be tested with

kinit -k `hostname -s`$

which resolves to

kinit -k nfsv4client

which doesn't work as no key was found for nfsv4client@REALM.EXAMPLE.ORG. So is the keytab wrong or the test method?

Another log I found on the mounting client machine (in messages):

 nfsv4client kernel: [ 4355.170940] svc: initialising pool 0 for NFSv4 callback
 nfsv4client kernel: [ 4355.170940] nfs_callback_create_svc: service created
 nfsv4client kernel: [ 4355.170941] NFS: create per-net callback data; net=f0000098
 nfsv4client kernel: [ 4355.170942] svc: creating transport tcp-bc[0]
 nfsv4client kernel: [ 4355.171032] nfs_callback_up: service started
 nfsv4client kernel: [ 4355.171033] svc: svc_destroy(NFSv4 callback, 2)
 nfsv4client kernel: [ 4355.171034] NFS: nfs4_discover_server_trunking: testing 'nfsv4test.subnet.example.org'
 nfsv4client kernel: [ 4355.171040] RPC:       new task initialized, procpid 9204
 nfsv4client kernel: [ 4355.171041] RPC:       allocated task 000000006bdb9e01
 nfsv4client kernel: [ 4355.171042] RPC:   110 __rpc_execute flags=0x5280
 nfsv4client kernel: [ 4355.171044] RPC:   110 call_start nfs4 proc EXCHANGE_ID (sync)
 nfsv4client kernel: [ 4355.171045] RPC:   110 call_reserve (status 0)
 nfsv4client kernel: [ 4355.171046] RPC:       wake_up_first(000000005af696f3 "xprt_sending")
 nfsv4client kernel: [ 4355.171047] RPC:   110 reserved req 00000000d1a7d1a4 xid 04f914c3
 nfsv4client kernel: [ 4355.171047] RPC:   110 call_reserveresult (status 0)
 nfsv4client kernel: [ 4355.171048] RPC:   110 call_refresh (status 0)
 nfsv4client kernel: [ 4355.171049] RPC:       gss_create_cred for uid 0, flavor 390004
 nfsv4client kernel: [ 4355.171050] RPC:       gss_create_upcall for uid 0
 nfsv4client kernel: [ 4355.171052] RPC:       __gss_find_upcall found nothing
 nfsv4client kernel: [ 4355.201976] RPC:       __gss_find_upcall found msg 000000000e5abcbc
 nfsv4client kernel: [ 4355.201978] RPC:       gss_fill_context returns error 13
 nfsv4client kernel: [ 4355.201982] RPC:       gss_pipe_downcall returning 16
 nfsv4client kernel: [ 4355.201986] RPC:       gss_create_upcall for uid 0 result -13
 nfsv4client kernel: [ 4355.201987] RPC:   110 call_refreshresult (status -13)
 nfsv4client kernel: [ 4355.201988] RPC:   110 call_refreshresult: refresh creds failed with error -13
 nfsv4client kernel: [ 4355.201989] RPC:   110 return 0, status -13
 nfsv4client kernel: [ 4355.201990] RPC:   110 release task

It's a lot of stuff, but I can't find the meaning of error -13, except that it's Permission Denied.

Chapter III: The question

The principals are there in the keytab. So when the client asks the server about the NFS share and tries to access it, both should have the keys to interact with each other. But for some reason it doesn't work. Could it be because of the assignment of the principals to the user accounts?

How can I get this to work? How do I get better information when debugging? Sorry for the wall of text.

PS. I mainly followed this tutorial. It seemed like a perfect match for my environment.

Standard
  • Your machines and users are in REALM.EXAMPLE.ORG presumably, and yet you're trying to use SPNs with SUBNET.EXAMPLE.ORG. Why? Also, what is a "subnet name"? I'm not sure, but it has no bearing on the SPN – Semicolon Dec 15 '21 at 19:14

1 Answer

Turning my comment into an answer...

SUBNET.EXAMPLE.ORG most likely does not exist as a Kerberos realm. Your realm/domain/forest is REALM.EXAMPLE.ORG, so every object in that domain belongs to that realm. subnet.example.org appears to be just a DNS name chosen for naming convenience.

If you wanted to use SUBNET.EXAMPLE.ORG, you would need appropriate SRV records for the realm subnet.example.org pointing to the AD domain controllers, and AD would have to be configured to use it as an alias realm (I'm not sure that's strictly possible with Microsoft's implementation). In addition, the connecting client and the domain controllers should resolve the FQDN to the IP and the IP back to the FQDN.
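
A quick way to check the forward and reverse resolution mentioned above is a small shell helper. This is only a sketch: `normalize_host` and `check_fwd_rev` are names made up for this example, and it assumes `getent`, `awk`, `tr` and `sed` are installed, which should be the case on a stock Debian.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any package.

# normalize_host NAME: lower-case NAME and drop a single trailing dot,
# so that "Host.Example.ORG." and "host.example.org" compare equal.
normalize_host() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# check_fwd_rev NAME: resolve NAME to an address through the system
# resolver, resolve that address back to a name, and compare the two.
check_fwd_rev() {
    name=$(normalize_host "$1")
    addr=$(getent hosts "$name" | awk 'NR == 1 { print $1 }')
    if [ -z "$addr" ]; then
        echo "no address found for $name"
        return 1
    fi
    back=$(getent hosts "$addr" | awk 'NR == 1 { print $2 }')
    back=$(normalize_host "$back")
    if [ "$back" = "$name" ]; then
        echo "ok: $name <-> $addr"
    else
        echo "mismatch: $name -> $addr -> ${back:-nothing}"
        return 1
    fi
}
```

Run something like `check_fwd_rev nfsv4test.realm.example.org` on both the client and the server. A mismatch suggests that the name a Kerberos client ends up with after reverse lookup may differ from the name in your SPN.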

I would also remove all of the "short names" from your SPNs. Stick with <service>/<FQDN>, host/<FQDN> or the UserPrincipalName.

This line in your sssd.conf is invalid: ad_hostname = nfsv4test.subnet.example.org

All computers in the domain realm.example.org have FQDNs of <computername>.realm.example.org. End of story. You can use DNS to resolve the machines under other names, but in AD/LDAP the computer account will only ever be <computername>@realm.example.org.

In short, to get this working quickly, replace subnet.example.org with realm.example.org in everything you've attempted; it should then most likely work.
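
To double-check that the keytab really contains the principal you expect after that replacement, something along these lines may help. It is a sketch under assumptions: `expected_spn` and `keytab_has_spn` are hypothetical helper names, and the parsing assumes the two-column `klist -k` layout shown in the question (KVNO first, principal second).

```shell
#!/bin/sh
# Hypothetical helpers, not part of any Kerberos package.

# expected_spn SERVICE SHORTNAME DNSDOMAIN REALM
# prints SERVICE/SHORTNAME.DNSDOMAIN@REALM
expected_spn() {
    printf '%s/%s.%s@%s\n' "$1" "$2" "$3" "$4"
}

# keytab_has_spn SPN
# reads `klist -k` output on stdin and succeeds only if the second
# column of some line is exactly SPN.
keytab_has_spn() {
    awk -v spn="$1" '$2 == spn { found = 1 } END { exit !found }'
}

# Example (on the server, needs read access to the keytab):
#   spn=$(expected_spn nfs nfsv4test realm.example.org REALM.EXAMPLE.ORG)
#   klist -k /etc/krb5.keytab | keytab_has_spn "$spn" && echo "keytab has $spn"
```

As far as I know, the NFS client requests a ticket for nfs/<server FQDN>, so the name you put into `expected_spn` should be the one the client actually resolves for the server, not just the one you typed in the mount command.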

Semicolon
  • `SUBNET.EXAMPLE.ORG` does exist, as the FQDN of the machine is actually nfsv4test.subnet.example.org. At least that's what I get when I run `hostname --fqdn`. And as I wrote above, the KDC can be reached, because when I log in to nfsv4client with an AD user I actually get a default principal (`user@REALM.EXAMPLE.ORG`) and a service principal (`krbtgt/REALM.EXAMPLE.ORG@REALM.EXAMPLE.ORG`). And since all the tutorials/guides I've read said the principals need to contain the FQDN, I used `nfsv4test.subnet.example.org`. – Standard Dec 16 '21 at 08:59
  • You can log in to the box because you've correctly specified the realm in your sssd.conf file (krb5_realm = REALM.EXAMPLE.ORG). In this scenario, I don't think it matters what your machine thinks its FQDN is. If your machine NFSV4TEST is in domain REALM.EXAMPLE.ORG, then its FQDN (at least for Kerberos) is de facto nfsv4test.realm.example.org. This is not a guess or a theory; it is a fact. – Semicolon Dec 16 '21 at 13:36
  • So I finally got around to testing this with subnet replaced by realm. I deleted and recreated the SPNs, generated new keytabs, imported them (they are kind of working at least), updated the sssd.conf; but the error remains the same (Auth Bogus Credentials (seal broken) in tcpdump, Access denied for the mount command). How can I verify that when mounting the correct key is requested from the keytab? – Standard Jan 20 '22 at 09:29