12

Some time ago I had an issue on a server where Apache and Snort were consuming 100% of the CPU, which made sshd unresponsive to remote connections. I had to go to the server physically, log in on a local TTY, and stop Apache and Snort.

Is there a way to guarantee SSH connectivity when the CPU or memory is 100% loaded? Would setting a "nice" priority be enough?

Thank you!

Renato Todorov

3 Answers

11

Other than using an out-of-band method, there's no way to guarantee that SSH will be available on a fully loaded server. If your service is so loaded that it can't even serve a basic SSH terminal to you, then you have other problems.

Yes, using renice to give sshd a lower nice value will improve its responsiveness under heavy load, but using something like pam_security (example shown here) instead will prevent Apache or whatever else from becoming unmanageable to begin with.
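A minimal Python sketch of the second idea, capping the resource hog instead of boosting sshd. It assumes a Linux host. `run_limited` and `clamp_to_hard` are hypothetical names, and the limits are applied with `setrlimit` directly rather than through PAM, so treat it as an illustration of per-process limits and not as the configuration the answer links to.

```python
import os
import resource
import subprocess
import sys


def clamp_to_hard(value, hard):
    """Never ask for more than the hard limit we already run under."""
    if hard == resource.RLIM_INFINITY:
        return value
    return min(value, hard)


def run_limited(cmd, cpu_seconds, address_space_bytes, niceness=10):
    """Run cmd with a CPU-time cap, an address-space cap, and a higher nice value."""
    def apply_limits():
        # Runs in the child between fork and exec, so only the child is affected.
        os.nice(niceness)  # raising the nice value needs no privileges
        for which, value in (
            (resource.RLIMIT_CPU, cpu_seconds),
            (resource.RLIMIT_AS, address_space_bytes),
        ):
            _, hard = resource.getrlimit(which)
            capped = clamp_to_hard(value, hard)
            resource.setrlimit(which, (capped, capped))

    return subprocess.run(cmd, preexec_fn=apply_limits,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Hypothetical demo: the child prints the CPU limit it inherited.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_CPU))"
    print(run_limited([sys.executable, "-c", probe], 5, 4 * 1024**3).stdout)
```

A process that exceeds its CPU-time limit is signalled by the kernel instead of being left to starve everything else, which is the point of limiting the daemon rather than prioritising sshd.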

Nathan C
  • Right. He's seeking to treat the symptom, not the real problem. – ewwhite Sep 03 '13 at 18:04
  • @ewwhite Exactly. And treating the symptom will just result in chasing your tail trying to figure out why *other* things break as a result. :) – Nathan C Sep 03 '13 at 18:05
  • I'm looking for a way to extinguish the fire, but of course I'll set limits for the other daemons. This is for an emergency situation: I need the peace of mind of knowing that sshd will always be responsive for remote access. – Renato Todorov Sep 03 '13 at 18:09
  • @RenatoTodorov in this case you can't treat the symptom. If your system has a runaway process that consumes all {CPU, RAM, Sockets, PIDs} you can't guarantee that even a `nice`'d culprit will be booted off the CPU fast enough to ensure you have SSH access (or for that matter that any console access you have would be usable). The *underlying issue* (resource-hog) needs to be addressed. Fire-Fighting is poor system management. – voretaq7 Sep 03 '13 at 18:10
  • Do you think rtprio could be useful in this situation? – Renato Todorov Sep 03 '13 at 18:11
  • @voretaq7 I understand that, you're right. I was thinking of a way to address the issue while it's happening, only in the worst case. With the pam_security settings this kind of problem should not happen again but... you never know... – Renato Todorov Sep 03 '13 at 18:14
  • @RenatoTodorov That's why OOB management is absolutely essential. Most hosts offer some kind of console, even. If you don't have it, buy an IP-KVM. They're expensive(ish), but how much does downtime cost you/your business/whatever you're hosting? – Nathan C Sep 03 '13 at 18:18
  • 1
    Well, you guys convinced me. I'll use iDRAC 7 Express, since I already have it. Thank you all! – Renato Todorov Sep 03 '13 at 18:22
  • Maybe he could try opening a reverse SSH tunnel and keeping it open with some kind of keepalive script. There are plenty of tutorials for doing just that on the net. Of course, it is a painkiller when clearly an amputation is in order. – dlyk1988 Sep 04 '13 at 23:23
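
The keepalive idea from the last comment could look roughly like the Python sketch below. The relay host `user@relay.example.org`, the forwarded port 2222, and the function names are all hypothetical; the ssh options used (`-N`, `-R`, `ServerAliveInterval`, `ServerAliveCountMax`, `ExitOnForwardFailure`) are standard OpenSSH client options. Note that a tunnel still depends on sshd getting CPU time, so this keeps a path open but does not by itself protect against starvation.

```python
import subprocess
import time


def reverse_tunnel_command(relay, remote_port, local_port=22):
    """Build an ssh command that exposes local sshd on the relay host."""
    return [
        "ssh", "-N",                          # no remote command, forwarding only
        "-o", "ServerAliveInterval=30",       # probe the relay every 30 seconds
        "-o", "ServerAliveCountMax=3",        # give up after 3 missed probes
        "-o", "ExitOnForwardFailure=yes",     # exit if the port cannot be bound
        "-R", f"{remote_port}:localhost:{local_port}",
        relay,
    ]


def keep_tunnel_open(relay, remote_port, retry_delay=10, max_restarts=None):
    """Restart the tunnel whenever ssh exits; loops forever by default."""
    restarts = 0
    while max_restarts is None or restarts < max_restarts:
        subprocess.run(reverse_tunnel_command(relay, remote_port))
        restarts += 1
        time.sleep(retry_delay)


if __name__ == "__main__":
    # Only print the command here; a real deployment would call keep_tunnel_open().
    print(" ".join(reverse_tunnel_command("user@relay.example.org", 2222)))
```

From the relay, `ssh -p 2222 localhost` would then reach the server's sshd, assuming key-based authentication is already set up in both directions.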
7

Your general-purpose solution for this is an out-of-band management tool, like Dell iDRAC, IBM Remote Supervisor, or HP iLO. It can always present a console (whether or not the OS can respond to it depends on your specific situation), and apply desired power states as needed.

mfinni
  • OK, iDRAC is a good option since I'm using Dell servers, but I was thinking of a simpler solution, maybe something like reserving CPU for sshd (including its spawned children), some kind of "QoS" for local services. – Renato Todorov Sep 03 '13 at 17:43
  • Some companies are broke or greedy: in such a case, sysrqd can be a cheap alternative to iDRAC, iLO, KVM ... – bgtvfr Oct 03 '17 at 08:16
0

I've had some success giving realtime priority to sshd; however, that comes at the cost of having to reboot the machine if one of the realtime processes runs away.

So if you want to go down this route, start a second ssh daemon that is only for emergencies. :)
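
As a rough illustration of what "realtime" means here, the Python sketch below moves an already running process into the Linux `SCHED_RR` class using the standard `os.sched_setscheduler` call. It assumes Linux and root privileges; the helper names and the idea of passing the PID of a second, emergency-only sshd are illustrative assumptions, not the answer's exact setup.

```python
import os


def realtime_priority(requested, policy=os.SCHED_RR):
    """Clamp a requested priority into the range the kernel accepts for policy."""
    lowest = os.sched_get_priority_min(policy)
    highest = os.sched_get_priority_max(policy)
    return max(lowest, min(requested, highest))


def make_realtime(pid, requested=10, policy=os.SCHED_RR):
    """Switch pid to a realtime policy; raises PermissionError without root."""
    param = os.sched_param(realtime_priority(requested, policy))
    os.sched_setscheduler(pid, policy, param)


if __name__ == "__main__":
    # Hypothetical usage: pass the PID of the emergency sshd as root, e.g.
    #   make_realtime(emergency_sshd_pid)
    print("SCHED_RR priority that would be used:", realtime_priority(10))
```

Processes forked afterwards normally inherit the scheduling policy, so every session spawned by that daemon, and anything run from it, also runs as realtime. That is the runaway risk the answer warns about.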

Simon Richter
  • 1
    `realtime`ing sshd seems dangerous to me, particularly on port 22 (an SSH scan could become a DoS attack -- running on an alternate port can mitigate that, but I'd still be scared...) – voretaq7 Sep 03 '13 at 20:15
  • Not a problem if I block access from the internet; actually, my only route to this server is through a VPN. Thank you for the suggestion! – Renato Todorov Sep 04 '13 at 00:50