I have a Solaris 11.1 zone named "AZone". I have joined the zone to an Active Directory domain (Windows Server 2008) so that users can log in to the Solaris zone with their AD accounts. After a lot of headaches and monkeying around with Kerberos, LDAP, and PAM, I actually got it working! However, logging in as an AD user will sometimes fail...

I have a dummy user set up on AD named "jpublic". I know that jpublic can successfully log in to AZone most of the time (I'm using SSH). However, after a period of no AD users logging in to AZone, a login attempt by jpublic will fail the first time. Then, if I immediately try to log in again as jpublic, it works.

I have PAM logging turned on, which is giving me my only lead on the issue so far. In the authorization log, on the failed login, I get these three messages, followed by the normal PAM loading messages:

Nov 11 10:29:02 azone sshd[2923]: [ID 800047 auth.info] Illegal user jpublic from
Nov 11 10:29:02 azone sshd[2923]: [ID 800047 auth.info] input_userauth_request: illegal user jpublic
Nov 11 10:29:02 azone sshd[2923]: [ID 800047 auth.info] Failed none for <invalid username> from port 57148 ssh2
Nov 11 10:29:02 azone sshd[2923]: [ID 604530 auth.debug] PAM[2923]: pam_start(sshd-kbdint,jpublic,7178c:604e08) - debug = 1
Nov 11 10:29:02 azone sshd[2923]: [ID 713382 auth.debug] PAM[2923]: pam_set_item(604e08:service)

On a successful login, I don't see the auth.info messages; I see only the pam_start call, and eventually the login succeeds.

Any ideas on what is causing the login to fail the first time around? It looks like the first attempt misses a cached value, but the attempt itself causes the cache to be refreshed, so the necessary data is available on the second attempt. Could it be something with the Kerberos tickets being refreshed? Something with the ldap_cachemgr daemon? I'll admit I only have a loose understanding of how this all works, so any hints on how to troubleshoot this would be helpful. Also, I can post PAM config files or command output if that would be useful.

DBA Josh

1 Answer


This really isn't a Kerberos problem at all. It's a cache problem with whatever you are using for the user database on the system.

If you were really old school and just using entries in /etc/passwd, then the illegal user message would mean that that user did not exist in the password file.

I don't have access to Solaris 11 and I know it's quite different from Solaris 10. On a Linux machine or Solaris 10 I would tell you to look in /etc/nsswitch.conf and see what is backing the passwd database. I've no idea what that is on Solaris 11.

  • I'm guessing that you are correct, that it's not a Kerberos problem specifically. Solaris 11 uses the same system that was in /etc/nsswitch.conf, it's just that the configuration is loaded into a service now (svc:/system/name-service/switch) instead of being read from the file on disk. In the configuration, the passwd entry is "files ldap". So I don't know whether LDAP doesn't respond the first time around, so the lookup fails, but is "up" the second time around, so it does respond and validate the name. What I really need is to understand how I can troubleshoot the username lookup... – DBA Josh Nov 15 '13 at 22:37
  • Any program that looks up a user with getpwnam should work as a test. `($name,$passwd,$uid,$gid,$quota,$comment,$gcos,$dir,$shell,$expire) = getpwnam($troublename);` Run that in Perl and use truss on it. I'd also look into what you can get out of dtrace. – Fred the Magic Wonder Dog Nov 16 '13 at 01:10
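
The lookup test suggested in the last comment can also be sketched in Python, whose standard `pwd.getpwnam` goes through the same name-service switch as the C library call. This is only a rough sketch, not something from the thread: the helper names `timed_lookup` and `probe`, the retry count, and the pause are all made up for illustration, and it assumes a Python interpreter is installed in the zone. The idea is to repeat the lookup a few times after an idle period and see whether the first attempt misses while later ones succeed.

```python
import pwd
import sys
import time


def timed_lookup(name, lookup=pwd.getpwnam):
    """Do one passwd lookup; return (found, elapsed_seconds)."""
    start = time.time()
    try:
        lookup(name)
        found = True
    except KeyError:
        # pwd.getpwnam raises KeyError when no configured backend
        # (files, ldap, ...) returns an entry for the name.
        found = False
    return found, time.time() - start


def probe(name, attempts=3, pause=1.0, lookup=pwd.getpwnam):
    """Repeat the lookup `attempts` times, sleeping `pause` seconds between tries."""
    results = []
    for i in range(attempts):
        results.append(timed_lookup(name, lookup))
        if i < attempts - 1:
            time.sleep(pause)
    return results


if __name__ == "__main__":
    # "jpublic" is the test account from the question; pass another name as argv[1].
    user = sys.argv[1] if len(sys.argv) > 1 else "jpublic"
    for n, (found, secs) in enumerate(probe(user), start=1):
        print("attempt %d: found=%s elapsed=%.3fs" % (n, found, secs))
```

If the first attempt reports `found=False` and the later ones report `found=True`, that would point at the passwd lookup rather than at PAM or Kerberos. Saved under a hypothetical name such as `lookup_probe.py`, the script can be run under truss (for example `truss -f -o /tmp/lookup.out python lookup_probe.py jpublic`) to see which calls the failing lookup makes.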