80

Connection to one of my servers using ssh takes more than 20 seconds to initiate.

This is not related to LAN or WAN conditions, since connecting to the server from itself (ssh localhost) takes just as long. Once the connection is finally established, interacting with the server is very fast.

Using -vvv shows that the connection gets stuck after printing "pledge: network". At this point, authentication (here using a key) is already done, as shown here:

...
debug1: Authentication succeeded (publickey).
Authenticated to myserver.mydomain.com ([xx.xx.xx.xx]:22).
debug1: channel 0: new [client-session]
debug2: channel 0: send open
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: pledge: network

(...stuck here for 15 to 25 seconds...)

debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply 0
debug2: callback start
debug2: fd 3 setting TCP_NODELAY
debug2: client_session2_setup: id 0
...

The server runs Ubuntu 16.04. This already happened to me in the past with another server (Ubuntu 12.04); I never found the solution, and the problem disappeared after a while...

sshd_config is the default one provided by Ubuntu.

So far I have tried:

  • using -o GSSAPIAuthentication=no in the ssh command
  • using password instead of a key
  • using UsePrivilegeSeparation no instead of yes, in sshd_config
M-Jack
  • 1
    Usually for me slow SSH connections are DNS problems, might that be the case here? For example, the server may be stuck trying to do a reverse DNS for the client's IP and waiting for that to time out – Eric Renouf Jul 28 '16 at 14:09
  • 1
    Actually no: by default UseDNS is not defined in sshd_config, and the man page says that this option defaults to "no". – M-Jack Jul 28 '16 at 15:17
  • 4
    Some Googling suggests this can be caused by updating systemd without rebooting. And there was a [systemd update for xenial on July 12](https://launchpad.net/ubuntu/xenial/+source/systemd/+changelog). `systemctl restart systemd-logind` fixes the problem only for a short period of time for me. – Ivan Kozik Aug 15 '16 at 18:19
  • Or if you're seeing `pam_systemd(sshd:session): Failed to create session: Connection timed out` as mentioned in an answer, this might be https://github.com/systemd/systemd/issues/2925 – Ivan Kozik Aug 15 '16 at 19:17
  • I came here having had this problem after an update, and @IvanKozik's suggestion fixed the problem - i.e systemctl restart systemd-logind - so thanks for that. – Paul M Nov 23 '16 at 23:45
  • @EricRenouf If the problem was DNS lookups it would have happened much earlier in the connection. – kasperd Oct 30 '18 at 08:55

14 Answers

55

This is probably an issue with D-Bus and systemd. If the dbus service is restarted for some reason, you will also need to restart systemd-logind.

You can check whether this is the issue by opening the ssh daemon log (on Ubuntu it should be /var/log/auth.log) and looking for a line like this:

sshd[2721]: pam_systemd(sshd:session): Failed to create session: Connection timed out

If yes, just restart systemd-logind service:

systemctl restart systemd-logind

I had this same issue on CentOS 7, because messagebus (which is what the D-Bus service is called on CentOS) had been restarted.
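
The check described above can be scripted. This is only a minimal sketch: the function name has_logind_timeout is made up for illustration, and the log path is passed as an argument because it differs between distributions.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not part of any tool):
# succeeds if the given sshd log contains the pam_systemd session timeout
# message quoted in this answer.
has_logind_timeout() {
    log_file=$1
    # -F treats the pattern as a fixed string, -q suppresses output.
    grep -qF 'pam_systemd(sshd:session): Failed to create session: Connection timed out' "$log_file"
}

# Example use, as root on the affected server:
#   if has_logind_timeout /var/log/auth.log; then
#       systemctl restart systemd-logind
#   fi
```

The function only reads the log; the restart is left in the commented example so nothing is restarted unless you run it deliberately.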

  • I tried to restart systemd-logind but after a while it says PolicyKit daemon disconnected from the bus. We are no longer a registered authentication agent. Job for systemd-logind.service failed because a timeout was exceeded. See "systemctl status systemd-logind.service" and "journalctl -xe" for details. – Kun Ren May 12 '17 at 03:49
  • @KunRen you probably need to restart the `polkit` service using `systemctl restart polkit`. – Strahinja Kustudic Jun 27 '17 at 18:57
  • It was that, thanks! – Avio Feb 10 '21 at 18:02
34

Found the answer:

I changed UsePAM from yes to no in the sshd_config file.

After restarting the ssh service, the connection is now immediate to the server. On this server, PAM is linked to ldap, so that is probably the reason, even if here I am connecting with a user declared on the server itself, not an LDAP one.

Well, this is more a way to bypass the issue, not really a solution... I have other servers set up the same way that are not having this issue.

Hope this may help someone...

M-Jack
  • 1
    changing UsePAM to no has other effects. See [this discussion](http://serverfault.com/questions/381967/ssh-login-using-public-key-failed) So I had to set a password to the user, because I got errors like User nagios not allowed because account is locked – M-Jack Jul 28 '16 at 15:22
  • 6
    This is really not a good idea. – Jakuje Jul 28 '16 at 21:21
  • 2
    why ?? any alternative ? – M-Jack Jul 29 '16 at 08:14
  • 10
    PAM is used for other aspects of account management in modern systems. Rather than turning it off, you would be better off investigating what is going on in the PAM stack and why it takes so long. – Jakuje Jul 29 '16 at 08:16
  • 2
    Leaving a very commonly unused PAM module *enabled* for SSH access is a security hole. Limiting access to critical services such as SSH is always a good idea from a security standpoint, as it is for any other service. When do you want PAM to cooperate with SSH? For example: when you need to integrate it with Active Directory via winbind, when you need two-factor auth with Google tokens, etc. In other cases (when using passwd and shadow), shutting it off is perfectly safe. Every user of PAM should see this: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=pam – Michal Sokolowski Jun 15 '18 at 07:05
  • 1
    I did set this to `no` and it fixed my issues, but I re-set it to `yes` because I'm not sure about the side-effects. Which side-effects can there be on a headless server? – Daniel F Nov 20 '18 at 18:39
20

This happened on two of my Fedora 25 servers, and was due to lots of failed SSH login attempts.

(The common suggestions of using GSSAPIAuthentication=no and UseDNS=no, or restarting systemd-logind, made no difference.)

On these servers, /etc/pam.d/postlogin contains:

session     optional      pam_lastlog.so silent noupdate showfailed

The man page for pam_lastlog explains that the showfailed option will:

Display number of failed login attempts and the date of the last failed attempt from btmp.

On these servers, the /var/log/btmp files were enormous due to many failed login attempts. The btmp log files weren't being rotated either.

I installed the logrotate package to ensure the log files will be rotated in future. (On Fedora, the configuration that ships with logrotate handles the rotation of /var/log/btmp.)

I also deleted the enormous btmp log files; as soon as I did this, connecting to the servers was instantaneous again.
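
To check whether btmp has grown large enough to matter, something like the following could be used. It is a rough sketch: the helper name file_larger_than and the 100 MiB threshold in the example are arbitrary assumptions, not values taken from pam_lastlog or logrotate.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a regular file is larger than the
# given number of bytes, returns 2 if the file does not exist.
file_larger_than() {
    file=$1
    limit_bytes=$2
    [ -f "$file" ] || return 2
    # Some wc implementations pad the count with spaces, so strip them.
    size=$(wc -c < "$file" | tr -d ' ')
    [ "$size" -gt "$limit_bytes" ]
}

# Example use, as root (threshold of 100 MiB is an arbitrary choice):
#   if file_larger_than /var/log/btmp 104857600; then
#       : > /var/log/btmp    # empty the file in place
#   fi
```

Emptying the file in place keeps its owner and permissions, which deleting and recreating it would not necessarily do.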

Richard Fearn
  • 2
    This solved my problem! Thank you. Nice catch. SSH was taking 5-10 seconds, and now it's less than a blink of an eye. This is on a VM that I've had connected to the public Internet for years. Its firewall rules could probably be tuned slightly better, now that I think of it. To others, this is all I did: `sudo truncate -s 0 /var/log/btmp` - Mine was 2.7G in size. – Carl Bennett Feb 02 '18 at 00:50
  • 1
    This was the issue for me, thanks @CarlBennett – diegoperini Apr 21 '20 at 12:52
  • 1
    This fixed it for me after a long frustrating search. Thank you! – Theron S Jul 07 '20 at 19:14
  • This answer solved my problem. It appears this is a known PAM bug: https://github.com/linux-pam/linux-pam/issues/270. The "showfailed" option, enabled by default by many Linux distributions, prints "NNNN failed logins since the last successful login" on every login, but to calculate this NNNN it needs to go over the entire /var/log/btmp file, which just grows and grows and after a few years can become enormous and take over a minute to process during each login! – Nadav Har'El Aug 22 '21 at 14:57
16

On Ubuntu 16+, every time I have seen ssh -v XXX@YYY stalling at pledge: network, it could be fixed by following the instructions I found in "A comprehensive guide to fixing slow SSH logins". Specifically, an optional PAM module that does not appear to be needed is causing the delay.

In /etc/pam.d/common-session on the machine with the slow logins (i.e. the server), comment out the line session optional pam_systemd.so. That should immediately fix the problem.

This avoids having to completely shut down PAM which cripples login with passwords.
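
As a sketch of that edit, the function below prints the file with any active pam_systemd.so session line commented out. The function name is made up for illustration, and it deliberately writes to stdout rather than changing the file, so you can review the result first.

```shell
#!/bin/sh
# Hypothetical helper: print a PAM config file with active
# "session ... pam_systemd.so" lines commented out. Lines that are
# already commented do not start with "session" and are left alone.
comment_out_pam_systemd() {
    sed 's/^\([[:space:]]*session[[:space:]].*pam_systemd\.so.*\)$/# \1/' "$1"
}

# Example use, as root, keeping a backup of the original:
#   cp /etc/pam.d/common-session /etc/pam.d/common-session.bak
#   comment_out_pam_systemd /etc/pam.d/common-session.bak > /etc/pam.d/common-session
```

Keep an existing root session open while testing a PAM change, so a mistake does not lock you out.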

Jonathan Gutow
7

The problem for me (Ubuntu 19.10) was in my /etc/pam.d/sshd, which contained:

# Print the message of the day upon successful login.
# This includes a dynamically generated part from /run/motd.dynamic
# and a static (admin-editable) part from /etc/motd.
session    optional     pam_motd.so  motd=/run/motd.dynamic
session    optional     pam_motd.so noupdate

Commenting out the motd settings got me right in.

Walter
5

For me this issue was caused by a large (hundreds of MBs) btmp file. This file logs failed login attempts. When people try to brute-force your password, this file can grow large and cause delays in the "pledge: network" phase.

Try clearing the log file

echo "" > /var/log/btmp

and see if it helps.

Tamas Foldi
Marek Nagy
  • 4
    This needs a lot more explanation. For starters, why do you think this is helpful? – Sven Jun 15 '17 at 11:05
  • tip: Just typing `:> /var/log/btmp` does the same btw. – Marius Mar 07 '19 at 18:37
  • 1
    Sven, a few years after Marek's answer it is now listed as a PAM bug: github.com/linux-pam/linux-pam/issues/270. The "showfailed" option, enabled by default by many Linux distributions, prints "NNNN failed logins since the last successful login" on every login. But to calculate this NNNN it needs to go over the entire /var/log/btmp file, which just grows and grows and after a few years can become enormous and take over a minute to process during each login! Sad, but true. I had the same problem and removing /var/log/btmp fixed it. – Nadav Har'El Aug 22 '21 at 14:59
3

For me the first clue was provided by adding

UseDNS no

to /etc/ssh/sshd_config, followed of course by service ssh restart (on our Debian/Jessie server).

before:

ssh git@git.*****.de true  0.03s user 0.01s system 0% cpu 13.440 total
ssh git@git.*****.de true  0.03s user 0.01s system 0% cpu 20.990 total
ssh git@git.*****.de true  0.03s user 0.02s system 0% cpu 31.114 total
ssh git@git.*****.de true  0.03s user 0.01s system 0% cpu 25.898 total

after:

ssh git@git.*****.de true  0.03s user 0.02s system 5% cpu 0.832 total
ssh git@git.*****.de true  0.03s user 0.01s system 7% cpu 0.523 total
ssh git@git.*****.de true  0.03s user 0.01s system 7% cpu 0.574 total

This revealed that my DNS configuration was wrong (I had a typo in the DNS address). After fixing the IP and restoring the setting UseDNS yes everything worked fine.

tamasgal
  • 1
    No, adding `UseDNS no` is a solution for a completely different problem. – kasperd Oct 30 '18 at 08:57
  • 1
    @kasperd It doesn't matter. In my case I had the very same symptoms (briefly: stuck after saying "pledge: network") and this is what finally helped, so this is a solution to at least a very similar problem and I am sure it will help one or the other at some point. – tamasgal Oct 30 '18 at 12:12
  • Same here, two hangs during connection, one after `sign_and_send_pubkey`, a longer one after `pledge: network`. Adding only `UseDNS no` with subsequent `service ssh restart` did resolve the problem on an old Ubuntu 14.04.5 LTS installation here. – Hound Mar 26 '19 at 10:01
  • I forgot to follow up: it was eventually a typo in the DNS address in my configuration. Correcting that and setting `UseDNS yes` again fixed it. So @kasperd was of course right! The root of the problem was deeper and the fix was not targeting the main issue. – tamasgal Jul 12 '20 at 09:43
2

In my case the reason was a crashed rsyslogd. I found this out because there were no more log-messages in e.g. /var/log/syslog or /var/log/mail.log

So service rsyslog restart resolved the problem for us.

randomcontrol
  • Same cause on a server of ours running CentOS 6.10. Restart of rsyslog took care of it. The thing is, it wasn't dead. It was running, but apparently doing nothing useful. – UtahJarhead Oct 24 '18 at 19:02
1

In my case it was a firewall problem after upgrading to Debian 11.

Solved by adding:

iptables -t filter -A INPUT -i lo -j ACCEPT

at the beginning of the firewall script.

0

In my case, it was because there were too many logs. You can test whether this applies to you by running this command:

sudo journalctl --list-boots

If it takes a while to return and prints many lines, then you are affected.

To truncate the logs, do this:

sudo journalctl --vacuum-time 2d

It will delete logs which are older than two days.

0

I got the same issue and we had configured the SSSD service for AD login.

Stopping the SSSD service fixed the issue for me.

Reni
0

If you are using NIS/YP, make sure nscd is running. In my case, I was getting the following when logging in over ssh.

systemd-logind[30061]: do_ypcall: clnt_call: RPC: Unable to send; errno = Operation not permitted

According to https://github.com/systemd/systemd/issues/7074, nscd should be running to ensure smooth operation.

After I started nscd (which was not running since I upgraded Ubuntu) incoming ssh sessions are much faster.

stex
0

I noticed the following line in my debug feedback:

Control socket connect(/var/lib/jenkins/.ssh/USER@HOST:22): Permission denied

This file was owned by root:root, while I was connecting as the jenkins user. Removing the file resolved my issue.

Ambidex
-1

In my case the reason was a crashed rsyslogd. I found this out because there were no more log entries in /var/log/secure

So restarting it with service rsyslog restart resolved the problem for us.

  • 1
    This does not provide an answer to the question. Once you have sufficient [reputation](https://serverfault.com/help/whats-reputation) you will be able to [comment on any post](https://serverfault.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). - [From Review](/review/late-answers/501692) – djdomi Nov 04 '21 at 17:59