I have a curious issue with SSH on CentOS 6 and haven't found a solution yet.
We have all of our CentOS 6 servers joined to an Active Directory 2012 R2 domain using Winbind. Winbind is not used for file sharing, only for single sign-on and group-based access. Most of the time an initial SSH connection works, whether from Ansible or PuTTY, and whether it uses keys or passwords. However, sometimes the login never completes and the connection times out. This does not seem to target any one server; it happens randomly throughout the environment. All servers are patched to the same updates and rebooted each month, so they run the same versions everywhere.
I've tried the obvious suggestions I could find for this problem:
- Set "UseDNS no", "AddressFamily inet", "GSSAPIAuthentication no" in /etc/ssh/sshd_config. No change in behavior.
- Set "LogLevel DEBUG2" in /etc/ssh/sshd_config. I don't see any warnings or errors in the output.
- Set "options single-request-reopen" in /etc/resolv.conf. No change in behavior.
- For Winbind, "wbinfo -u", "wbinfo -g", "getent passwd", etc. all work just fine.
- Tried increasing the verbosity of Samba output, but haven't found anything in the logs that would point me in the right direction.
- Watched the Windows event logs on the domain controllers, and don't see anything showing up there either.
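The Winbind checks in the list above show that lookups succeed, but not how long they take. Here is a minimal sketch for timing the NSS lookups directly, assuming Python 3 is available on the server. The helper names `time_call` and `probe_nss` are made up for illustration, and the lookups are only similar to what sshd does when it evaluates AllowGroups, not an exact reproduction:

```python
import grp
import pwd
import time


def time_call(func, *args):
    """Run func(*args) and return (seconds, result_or_exception)."""
    start = time.monotonic()
    try:
        result = func(*args)
    except KeyError as exc:
        # pwd and grp raise KeyError when the name is unknown.
        result = exc
    return time.monotonic() - start, result


def probe_nss(user, groups):
    """Time one passwd lookup and one group lookup per group name.

    Returns a dict mapping a label to the elapsed seconds.
    """
    timings = {"passwd:" + user: time_call(pwd.getpwnam, user)[0]}
    for group in groups:
        timings["group:" + group] = time_call(grp.getgrnam, group)[0]
    return timings
```

Calling `probe_nss("someuser", ["linuxadmins", "otheradmin"])` in a loop (with a real AD account in place of the placeholder `someuser`) and logging any lookup that takes more than a second or two might show whether the stalls line up with slow Winbind responses.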
Before the move to Windows Server 2012 R2, the domain ran on Windows Server 2008 and we had the same problem. Users in AD have all the proper UNIX attributes set.
With PuTTY we see the same problem as with Ansible. The Ansible server is on the same LAN as all the servers. Reverse DNS works just fine, and all servers are pingable. Other services on the servers always respond. It's just that the first SSH connection sometimes doesn't work: no prompt is shown, almost as if SSH is "sleeping". It has become more of a nuisance now that we are trying to automate various things with Ansible.
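One way to narrow down where the stall happens is to time the TCP connect and the arrival of the SSH identification string separately, since sshd normally sends that string before any authentication takes place. This is a rough sketch, assuming Python 3 on the Ansible host. The function name `time_ssh_banner` is made up for illustration, and the 10-second timeout is an arbitrary choice:

```python
import socket
import time


def time_ssh_banner(host, port=22, timeout=10.0):
    """Open one TCP connection and wait for the server's first bytes.

    Returns (connect_seconds, banner_seconds, banner). All three are
    None if the TCP connect itself fails; banner is None if nothing
    arrives before the timeout.
    """
    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return None, None, None
    connected = time.monotonic()
    try:
        sock.settimeout(timeout)
        data = sock.recv(256)
    except OSError:
        # Covers socket.timeout as well as connection resets.
        data = b""
    finally:
        sock.close()
    done = time.monotonic()
    banner = data.decode("ascii", "replace").strip() or None
    return connected - start, done - connected, banner
```

If a failing attempt still returns an "SSH-2.0-..." string quickly, the delay is probably later in the login (PAM, Winbind, group lookups). If the string never arrives, the problem is more likely in the network path or in sshd accepting the connection.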
I'm at a loss here as to how to troubleshoot this further.
Does anyone have a suggestion that might be helpful? Posting my configs in case they prove useful.
/etc/samba/smb.conf
[global]
workgroup = COMPANY
netbios name = SERVER01
password server = dc01.company.local dc02.company.local
realm = COMPANY.LOCAL
security = ads
smb encrypt = yes
template shell = /bin/bash
template homedir = /home/%U
winbind nss info = rfc2307
winbind use default domain = true
winbind offline logon = false
winbind enum users = yes
winbind enum groups = yes
idmap config *:backend = tdb
idmap config *:range = 1000000-1999999
idmap config COMPANY:backend = ad
idmap config COMPANY:default = yes
idmap config COMPANY:range = 2048-999999
idmap config COMPANY:schema_mode = rfc2307
server string = Samba Server
log file = /var/log/samba/log.%m
max log size = 5000
log level = 4
passdb backend = tdbsam
load printers = no
printcap name = /dev/null
disable spoolss = yes
[homes]
comment = Home Directories
browseable = no
writable = no
/etc/ssh/sshd_config
AddressFamily inet
Protocol 2
SyslogFacility LOCAL6
LogLevel DEBUG3
LoginGraceTime 60
PermitRootLogin no
PermitEmptyPasswords no
PasswordAuthentication yes
ChallengeResponseAuthentication no
GSSAPIAuthentication no
UsePAM yes
AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
AcceptEnv XMODIFIERS
X11Forwarding no
UseDNS no
Banner /etc/ssh/sshd-banner
Subsystem sftp /usr/libexec/openssh/sftp-server
AllowGroups linuxadmins otheradmin
MACS hmac-sha2-256,hmac-sha1,hmac-sha2-512
Thanks!
Update
All the CentOS servers and Domain Controllers are pointed at NTP servers, so time is synchronized across everything.