I'm setting up a test environment for a customer who is about to deploy Samba 4 to 1400 remote sites, and I'm running into a problem. It's my job, after all, to run into problems and then solve them.
Active Directory
- forest root & single domain: main.adlab.netdirect.ca
- created on Windows 2008 R2
- 2008 FFL
- 2008 DFL
Main office
- AD1: Windows 2008 R2 DC
- AD2: Windows 2008 R2 DC
- Windows 7 Professional clients
Branch office
- SLES11SP2 (fully updated!) with Samba 4 (4.1.1-7.suse111 packages from sernet)
- Samba 4 configured as RODC
I've configured a password replication policy that allows certain accounts to be cached on the RODC, and then preloaded those accounts onto the RODC:
sles-shire:~ # samba-tool rodc preload 'win7-shire$' --server main.adlab.netdirect.ca
Replicating DN CN=WIN7-SHIRE,CN=Computers,DC=main,DC=adlab,DC=netdirect,DC=ca
Exop on[CN=WIN7-SHIRE,CN=Computers,DC=main,DC=adlab,DC=netdirect,DC=ca] objects[1] linked_values[2]
sles-shire:~ # samba-tool rodc preload 'win7-shire-2$' --server main.adlab.netdirect.ca
Replicating DN CN=WIN7-SHIRE-2,CN=Computers,DC=main,DC=adlab,DC=netdirect,DC=ca
Exop on[CN=WIN7-SHIRE-2,CN=Computers,DC=main,DC=adlab,DC=netdirect,DC=ca] objects[1] linked_values[1]
sles-shire:~ # samba-tool rodc preload 'bilbo' --server main.adlab.netdirect.ca
Replicating DN CN=Bilbo Baggins,OU=Shire,OU=Offices,DC=main,DC=adlab,DC=netdirect,DC=ca
Exop on[CN=Bilbo Baggins,OU=Shire,OU=Offices,DC=main,DC=adlab,DC=netdirect,DC=ca] objects[1] linked_values[2]
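With 1400 sites in the pipeline, the preload step above is an obvious candidate for scripting. A minimal Python sketch that only builds the command lines, assuming the same `samba-tool rodc preload ACCOUNT --server SERVER` form used in the transcript; the function name `preload_commands` is made up for illustration and nothing is executed:

```python
import shlex


def preload_commands(accounts, server):
    """Build one `samba-tool rodc preload` argument list per account."""
    return [
        ["samba-tool", "rodc", "preload", account, "--server", server]
        for account in accounts
    ]


# Accounts and server are the ones from the transcript above.
ACCOUNTS = ["win7-shire$", "win7-shire-2$", "bilbo"]
SERVER = "main.adlab.netdirect.ca"

if __name__ == "__main__":
    for argv in preload_commands(ACCOUNTS, SERVER):
        # shlex.join quotes the trailing '$' of machine accounts so the
        # printed line can be pasted into a shell unchanged.
        print(shlex.join(argv))
```

Each argument list could be handed to `subprocess.run` on the RODC itself; printing them first makes it easy to review what would be run per site.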
I know those credentials are being cached on the RODC because, with the site link down, I can log in as a cached user (bilbo) but not as an uncached one (michael):
michael@sles-shire:~> smbclient //sles-shire.main.adlab.netdirect.ca/sysvol -U michael
Enter michael's password:
session setup failed: NT_STATUS_IO_TIMEOUT
michael@sles-shire:~> smbclient //sles-shire.main.adlab.netdirect.ca/sysvol -U bilbo
Enter bilbo's password:
Domain=[MAIN] OS=[Unix] Server=[Samba 4.1.1-SerNet-SuSE-7.suse111]
smb: \> ls
. D 0 Mon Nov 18 16:09:44 2013
.. D 0 Mon Nov 18 16:11:15 2013
main.adlab.netdirect.ca D 0 Wed Nov 20 17:54:13 2013
So authentication is working fine! But when I try to log in to the Windows 7 PC (WIN7-SHIRE) I get the error:
An internal error has occurred.
Gee. Thanks. If I use an incorrect password I get:
The user name or password is incorrect.
So the authentication is happening, but Windows 7 doesn't like something. I see these errors in the event logs and I think they're relevant to this problem:
The Security System detected an authentication error for the server ldap/sles-shire.main.adlab.netdirect.ca. The failure code from authentication protocol Kerberos was "An internal error occurred. (0xc00000e5)".
The Security System detected an authentication error for the server DNS/sles-shire.main.adlab.netdirect.ca. The failure code from authentication protocol Kerberos was "An internal error occurred. (0xc00000e5)".
If I'm already logged on and try to use network services I get:
The Security System detected an authentication error for the server cifs/sles-shire.main.adlab.netdirect.ca. The failure code from authentication protocol Kerberos was "An internal error occurred. (0xc00000e5)".
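To see at a glance which service principals are failing, the event text can be reduced to (service, host, code) tuples. A rough Python sketch, assuming the events are exported as plain text with exactly the wording quoted above; the regular expression and the name `failing_spns` are illustrative, not part of any Windows or Samba tool:

```python
import re

# Matches the wording of the events quoted above; other event formats
# would need a different pattern.
EVENT_RE = re.compile(
    r'for the server (?P<service>[^/\s]+)/(?P<host>\S+)\. '
    r'The failure code from authentication protocol (?P<protocol>\w+) '
    r'was "(?P<message>.*?) \((?P<code>0x[0-9a-fA-F]+)\)"'
)

EVENTS = '''\
The Security System detected an authentication error for the server ldap/sles-shire.main.adlab.netdirect.ca. The failure code from authentication protocol Kerberos was "An internal error occurred. (0xc00000e5)".
The Security System detected an authentication error for the server DNS/sles-shire.main.adlab.netdirect.ca. The failure code from authentication protocol Kerberos was "An internal error occurred. (0xc00000e5)".
The Security System detected an authentication error for the server cifs/sles-shire.main.adlab.netdirect.ca. The failure code from authentication protocol Kerberos was "An internal error occurred. (0xc00000e5)".
'''


def failing_spns(log_text):
    """Return (service, host, code) for every matching event in the text."""
    return [
        (m["service"], m["host"], m["code"])
        for m in EVENT_RE.finditer(log_text)
    ]


if __name__ == "__main__":
    for service, host, code in failing_spns(EVENTS):
        print(f"{service}/{host} failed with {code}")
```

All three events name the Samba RODC as the target host and carry the same failure code, which is what makes them look like one underlying problem rather than three.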
My krb5.conf on the server:
[libdefaults]
default_realm = MAIN.ADLAB.NETDIRECT.CA
dns_lookup_realm = true
dns_lookup_kdc = true
[realms]
[logging]
kdc = FILE:/var/log/krb5/krb5kdc.log
admin_server = FILE:/var/log/krb5/kadmind.log
default = SYSLOG:NOTICE:DAEMON
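Since this krb5.conf leaves `[realms]` empty and relies on `dns_lookup_kdc`, KDC discovery depends entirely on DNS. A small Python sketch that reads the file and lists the SRV names I would expect a Kerberos library to query, assuming the usual `_kerberos._udp` and `_kerberos._tcp` naming. It uses `configparser`, which copes with this flat file but would not handle the brace-delimited realm blocks of a fuller krb5.conf; `kdc_srv_names` is a made-up helper name:

```python
import configparser

# The krb5.conf shown above, verbatim.
KRB5_CONF = """\
[libdefaults]
default_realm = MAIN.ADLAB.NETDIRECT.CA
dns_lookup_realm = true
dns_lookup_kdc = true
[realms]
[logging]
kdc = FILE:/var/log/krb5/krb5kdc.log
admin_server = FILE:/var/log/krb5/kadmind.log
default = SYSLOG:NOTICE:DAEMON
"""


def kdc_srv_names(conf_text):
    """Return the SRV names to check when KDC lookup is delegated to DNS."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    libdefaults = parser["libdefaults"]
    if not libdefaults.getboolean("dns_lookup_kdc", fallback=False):
        return []  # KDCs would have to be listed under [realms] instead
    # DNS names are case-insensitive; lower-case the realm for readability.
    realm = libdefaults["default_realm"].lower()
    return [f"_kerberos._udp.{realm}", f"_kerberos._tcp.{realm}"]


if __name__ == "__main__":
    for name in kdc_srv_names(KRB5_CONF):
        print(name)
```

The printed names can then be checked by hand with a DNS lookup tool from the branch office.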
Here's the real kicker:
The behaviour still occurs when the site link is up. I can log in to the domain PC with accounts that are not cached on the RODC, but accounts that are cached on the RODC get the same error.
I've verified that all the appropriate SRV records are in place in AD DNS. I did this by promoting a Windows 2008 R2 DC in the branch office to an RODC and confirming that the same DNS records are present for both the Windows and the Samba RODC.
Some had to be added by hand, as Samba doesn't add them yet:
SRV _ldap._tcp.${SITE}._sites.DomainDnsZones.${DNSDOMAIN} ${HOSTNAME} 389
SRV _ldap._tcp.${SITE}._sites.ForestDnsZones.${DNSFOREST} ${HOSTNAME} 389
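For 1400 sites those hand-added records would need to be generated rather than typed. A minimal Python sketch that expands the two templates above with `string.Template`, which uses the same `${NAME}` placeholder syntax. The site name `Shire` is an assumption (the actual AD site name isn't shown here), and `srv_records` is a hypothetical helper; domain and forest are the same because this is a single-domain forest:

```python
from string import Template

# The two record names Samba did not register, as written above.
SRV_TEMPLATES = [
    "_ldap._tcp.${SITE}._sites.DomainDnsZones.${DNSDOMAIN}",
    "_ldap._tcp.${SITE}._sites.ForestDnsZones.${DNSFOREST}",
]


def srv_records(site, dnsdomain, dnsforest, hostname, port=389):
    """Return (record_name, target_host, port) for each template."""
    values = {"SITE": site, "DNSDOMAIN": dnsdomain, "DNSFOREST": dnsforest}
    return [
        (Template(template).substitute(values), hostname, port)
        for template in SRV_TEMPLATES
    ]


if __name__ == "__main__":
    records = srv_records(
        site="Shire",  # assumed site name, for illustration only
        dnsdomain="main.adlab.netdirect.ca",
        dnsforest="main.adlab.netdirect.ca",
        hostname="sles-shire.main.adlab.netdirect.ca",
    )
    for name, target, port in records:
        print(f"SRV {name} {target} {port}")
```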
So… what's broken and how do I fix it?
SPN info
> dsquery * "CN=SLES-SHIRE,OU=Domain Controllers,DC=main,DC=adlab,DC=netdirect,DC=ca" -attr servicePrincipalName
servicePrincipalName
ldap/SLES-SHIRE;
ldap/4116d553-d66b-4c8b-9a60-90380ac69c04._msdcs.main.adlab.netdirect.ca;
ldap/SLES-SHIRE.main.adlab.netdirect.ca/main.adlab.netdirect.ca;
HOST/SLES-SHIRE.main.adlab.netdirect.ca/main.adlab.netdirect.ca;
ldap/SLES-SHIRE.main.adlab.netdirect.ca;
ldap/SLES-SHIRE.main.adlab.netdirect.ca/MAIN;
HOST/SLES-SHIRE.main.adlab.netdirect.ca/MAIN;
RestrictedKrbHost/SLES-SHIRE.main.adlab.netdirect.ca;
RestrictedKrbHost/SLES-SHIRE;
GC/SLES-SHIRE.main.adlab.netdirect.ca/main.adlab.netdirect.ca;
HOST/SLES-SHIRE.main.adlab.netdirect.ca;
HOST/SLES-SHIRE;
> dsquery * "CN=WIN7-SHIRE,CN=Computers,DC=main,DC=adlab,DC=netdirect,DC=ca" -attr servicePrincipalName
servicePrincipalName
TERMSRV/WIN7-SHIRE.main.adlab.netdirect.ca;
TERMSRV/WIN7-SHIRE;
RestrictedKrbHost/WIN7-SHIRE;
HOST/WIN7-SHIRE;
RestrictedKrbHost/WIN7-SHIRE.main.adlab.netdirect.ca;
HOST/WIN7-SHIRE.main.adlab.netdirect.ca;
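One thing the SPN dump makes possible is checking whether the service classes named in the Kerberos errors (ldap, DNS, cifs) are registered on the RODC's account at all. A rough Python sketch, assuming that `HOST/` SPNs also answer for `cifs` and `dns` as they do under the default sPNMappings in Active Directory; that alias set is hard-coded here as an assumption and is not read from the directory:

```python
# Assumption: a subset of the service classes that HOST/ aliases by default.
HOST_ALIASED = {"cifs", "dns"}

# servicePrincipalName values of SLES-SHIRE, copied from the dsquery output above.
SLES_SHIRE_SPNS = """\
ldap/SLES-SHIRE;
ldap/4116d553-d66b-4c8b-9a60-90380ac69c04._msdcs.main.adlab.netdirect.ca;
ldap/SLES-SHIRE.main.adlab.netdirect.ca/main.adlab.netdirect.ca;
HOST/SLES-SHIRE.main.adlab.netdirect.ca/main.adlab.netdirect.ca;
ldap/SLES-SHIRE.main.adlab.netdirect.ca;
ldap/SLES-SHIRE.main.adlab.netdirect.ca/MAIN;
HOST/SLES-SHIRE.main.adlab.netdirect.ca/MAIN;
RestrictedKrbHost/SLES-SHIRE.main.adlab.netdirect.ca;
RestrictedKrbHost/SLES-SHIRE;
GC/SLES-SHIRE.main.adlab.netdirect.ca/main.adlab.netdirect.ca;
HOST/SLES-SHIRE.main.adlab.netdirect.ca;HOST/SLES-SHIRE;
"""


def parse_spns(dsquery_output):
    """Split ';'-separated dsquery output into a set of lower-cased SPNs."""
    spns = set()
    for line in dsquery_output.splitlines():
        for entry in line.split(";"):
            entry = entry.strip()
            if "/" in entry:  # skips the header line and empty fragments
                spns.add(entry.lower())
    return spns


def is_covered(service, host, spns):
    """True if service/host is registered directly or via the HOST alias."""
    service, host = service.lower(), host.lower()
    if f"{service}/{host}" in spns:
        return True
    return service in HOST_ALIASED and f"host/{host}" in spns


if __name__ == "__main__":
    spns = parse_spns(SLES_SHIRE_SPNS)
    host = "sles-shire.main.adlab.netdirect.ca"
    for service in ("ldap", "DNS", "cifs"):
        status = "covered" if is_covered(service, host, spns) else "MISSING"
        print(f"{service}/{host}: {status}")
```

If the alias assumption holds, all three service classes are covered by the SPNs listed above, which would point away from a simple missing-SPN problem.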