Intermittent Login Failure to RHEL 6.8 Host in an Active Directory Domain Environment


I am working in an Active Directory domain of Windows 10 machines. We also have three Linux hosts. Two are running RHEL 6.9, and one is running RHEL 6.8.

We do not use local accounts. All accounts for human users are Active Directory domain accounts.

I have administrative control over the three Linux hosts. I do not have administrative control over any of the Windows hosts or over the Domain Controllers.

Our SSH client on Linux is OpenSSH 5.3p1. Our SSH client on Windows is PuTTY 0.70.

Our SSH server on Linux is OpenSSH 5.3p1. We do not run an SSH server on the Windows hosts.

Our supported authentication mechanisms are:

publickey,gssapi-keyex,gssapi-with-mic,password,keyboard-interactive

We are running Kerberos, so we typically don't expect to have to enter a password when SSHing to one of the Linux hosts. In other words, we generally expect gssapi-with-mic to be the authentication method that succeeds (as long as the client holds a valid, unexpired Kerberos ticket).
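To make the "valid, unexpired Kerberos ticket" precondition checkable from a script on the Linux side, here is a minimal sketch. It assumes MIT Kerberos `klist` is on the PATH; its `-s` flag runs silently and exits non-zero when the credentials cache cannot be read or has expired. The function name `has_valid_ticket` and its `command` parameter are illustrative names, not part of any existing tool, and the code sticks to standard-library calls that exist in both Python 2.6 and Python 3.

```python
import os
import subprocess
import sys


def has_valid_ticket(command=("klist", "-s")):
    """Return True if the given command exits with status 0.

    With the default command, `klist -s` exits 0 only when the
    credentials cache is readable and holds unexpired credentials.
    """
    try:
        with open(os.devnull, "w") as devnull:
            status = subprocess.call(
                list(command), stdout=devnull, stderr=devnull
            )
    except OSError:
        # The command is not installed or not on the PATH.
        return False
    return status == 0


if __name__ == "__main__":
    if has_valid_ticket():
        sys.stdout.write("valid ticket\n")
    else:
        sys.stdout.write("no valid ticket\n")
```

Running this just before an SSH attempt would at least rule out an expired ticket as the reason gssapi-with-mic is skipped.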

For months, we have experienced problems logging into the Linux host running RHEL 6.8. This problem is intermittent, and it seems that once things start working, they work for a while or, perhaps, for as long as people are actively using the system.

This happens on the RHEL 6.9 hosts also, but much less frequently.

KEY POINT: The most reliable way to (temporarily) work around this problem is to disconnect the machine's Ethernet cable for a few seconds and then reconnect it.

The following observations are my impressions. I've not been able to characterize the problem precisely.

When the problem occurs, it seems that:

  • When trying to log in directly at the console (i.e., with username and password), the login attempt typically times out or, perhaps, succeeds after a lengthy delay.

  • When trying to SSH to the host from either another Linux host or a Windows host (where gssapi-with-mic is the mechanism expected to succeed, so no password should be needed), the login attempt typically times out, succeeds after a lengthy delay, or falls back to password-based authentication. (I've seen all three.)

  • After falling back to password-based authentication, the login attempt typically times out or, perhaps, succeeds after a lengthy delay (as described above for console logins). I don't believe I've seen the fallback to password-based login on the RHEL 6.9 machines. I believe it occurs only on the RHEL 6.8 machine, though I am not certain.

In capturing PuTTY client-side logs, I've observed the following message after gssapi-with-mic fails:

Event Log: Internal SSPI error

As I write this, the problem is not reproducing, so I do not yet have ssh -vvv debug output to share. I will likely be able to provide it tomorrow morning if nobody has identified the problem before then. (The problem is most often seen in the early morning, presumably because nobody has been using the system overnight.)
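Since the failures cluster in the early morning, one way to catch them unattended would be a small probe run periodically (for example from cron) that records how long a non-interactive login takes and which method succeeded. The following is a rough sketch under stated assumptions, not a tested tool: the names `probe` and `succeeded_method` are hypothetical, the host name is supplied on the command line, and the regular expression assumes the client's debug output contains a line of the form `Authentication succeeded (method)`, which should be checked against what OpenSSH 5.3p1 actually prints.

```python
import re
import subprocess
import sys
import time

# Assumed format of the client's success line in debug output, e.g.
# "debug1: Authentication succeeded (gssapi-with-mic)."
AUTH_OK = re.compile(r"Authentication succeeded \(([^)]+)\)")


def succeeded_method(debug_output):
    """Return the auth method named in ssh debug output, or None."""
    match = AUTH_OK.search(debug_output)
    return match.group(1) if match else None


def probe(host, ssh_command=("ssh",)):
    """Run one non-interactive SSH login and summarize the result."""
    argv = list(ssh_command) + [
        "-vvv",
        "-o", "BatchMode=yes",      # never prompt for a password
        "-o", "ConnectTimeout=30",  # bounds the connection attempt only
        host, "true",               # remote command: exit immediately
    ]
    started = time.time()
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    # A stalled authentication is bounded only by whatever limit the
    # server enforces; this sketch adds no timeout of its own.
    _, err = proc.communicate()
    elapsed = time.time() - started
    debug_output = err.decode("utf-8", "replace")
    return {
        "elapsed_s": round(elapsed, 1),
        "exit_status": proc.returncode,
        "method": succeeded_method(debug_output),
        "debug_output": debug_output,
    }


if __name__ == "__main__":
    result = probe(sys.argv[1])
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S")
    sys.stdout.write("%s elapsed=%ss exit=%s method=%s\n" % (
        stamp, result["elapsed_s"], result["exit_status"], result["method"]))
    # Keep the full debug trace only when gssapi-with-mic did not succeed.
    if result["method"] != "gssapi-with-mic":
        sys.stderr.write(result["debug_output"])
```

With `BatchMode=yes`, a fallback to password authentication shows up as a failed login rather than a prompt, so the one-line summaries would show when the delays begin and the saved traces would show where each failing attempt stalled.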

I've also observed the following, so I will share it, but I have a feeling it has no bearing on the problem. I think (but could be wrong) that these symptoms were due to network or domain controller problems that were occurring just at the moment I saw them.

Just a very few times (and not in a while), I've seen kinit fail with these errors:

  • Cannot reach the KDC
  • Preauthentication failed while getting initial credentials
  • Clients credentials have been revoked while getting initial credentials

To me, this feels like a client-side issue, meaning a problem on the Linux hosts rather than on the Domain Controllers. Does the community agree?

If so, how do you recommend I proceed in isolating and fixing the problem?

If I've left out key information that can be obtained from the client side, please advise and I will provide it.

If needed, I may be able to get a very limited amount of server side support from IT in terms of finding out configuration details, checking of debug logs, etc. However, our IT staff is really in a crunch right now, so I need to use whatever cycles I may be able to get out of them very judiciously.

Your advice and recommendations are appreciated.

Dave

Posted 2019-09-03T18:33:38.903
