186

I suspect that one or more of my servers is compromised by a hacker, virus, or other mechanism:

  • What are my first steps? When I arrive on site should I disconnect the server, preserve "evidence", are there other initial considerations?
  • How do I go about getting services back online?
  • How do I prevent the same thing from happening immediately again?
  • Are there best practices or methodologies for learning from this incident?
  • If I wanted to put an Incident Response Plan together, where would I start? Should this be part of my Disaster Recovery or Business Continuity Planning?

This is meant to be a canonical post for this topic. Originally from serverfault.

Lucas Kauffman

6 Answers

171

Originally from serverfault. Thanks to Robert Moir (RobM)

It's hard to give specific advice from what you've posted here but I do have some generic advice based on a post I wrote ages ago back when I could still be bothered to blog.

Don't Panic

First things first, there are no "quick fixes" other than restoring your system from a backup taken prior to the intrusion, and this has at least two problems.

  1. It's difficult to pinpoint when the intrusion happened.
  2. It doesn't help you close the "hole" that allowed them to break in last time, nor deal with the consequences of any "data theft" that may also have taken place.

This question keeps being asked repeatedly by the victims of hackers breaking into their web server. The answers very rarely change, but people keep asking the question. I'm not sure why. Perhaps people just don't like the answers they've seen when searching for help, or they can't find someone they trust to give them advice. Or perhaps people read an answer to this question and focus too much on the 5% of why their case is special and different from the answers they can find online and miss the 95% of the question and answer where their case is near enough the same as the one they read online.

That brings me to the first important nugget of information. I really do appreciate that you are a special unique snowflake. I appreciate that your website is too, as it's a reflection of you and your business or at the very least, your hard work on behalf of an employer. But to someone on the outside looking in, whether a computer security person looking at the problem to try and help you or even the attacker himself, it is very likely that your problem will be at least 95% identical to every other case they've ever looked at.

Don't take the attack personally, and don't take personally the recommendations that follow here or that you get from other people. If you are reading this after just becoming the victim of a website hack then I really am sorry, and I really hope you can find something helpful here, but this is not the time to let your ego get in the way of what you need to do.

You have just found out that your server(s) got hacked. Now what?

Do not panic. Absolutely do not act in haste, and absolutely do not try to pretend nothing happened and fail to act at all.

First: understand that the disaster has already happened. This is not the time for denial; it is the time to accept what has happened, to be realistic about it, and to take steps to manage the consequences of the impact.

Some of these steps are going to hurt, and (unless your website holds a copy of my details) I really don't care if you ignore all or some of these steps, but following them will make things better in the end. The medicine might taste awful but sometimes you have to overlook that if you really want the cure to work.

Stop the problem from becoming worse than it already is:

  1. The first thing you should do is disconnect the affected systems from the Internet. Whatever other problems you have, leaving the system connected to the web will only allow the attack to continue. I mean this quite literally; get someone to physically visit the server and unplug network cables if that is what it takes, but disconnect the victim from its muggers before you try to do anything else.
  2. Change all your passwords for all accounts on all computers that are on the same network as the compromised systems. No really. All accounts. All computers. Yes, you're right, this might be overkill; on the other hand, it might not. You don't know either way, do you?
  3. Check your other systems. Pay special attention to other Internet facing services, and to those that hold financial or other commercially sensitive data.
  4. If the system holds anyone's personal data, immediately inform the person responsible for data protection (if that's not you) and URGE a full disclosure. I know this one is tough. I know this one is going to hurt. I know that many businesses want to sweep this kind of problem under the carpet but the business is going to have to deal with it - and needs to do so with an eye on any and all relevant privacy laws.

However annoyed your customers might be to have you tell them about a problem, they'll be far more annoyed if you don't tell them, and they only find out for themselves after someone charges $8,000 worth of goods using the credit card details they stole from your site.

Remember what I said previously? The bad thing has already happened. The only question now is how well you deal with it.

Understand the problem fully:

  1. Do NOT put the affected systems back online until this stage is fully complete, unless you want to be the person whose post was the tipping point for me actually deciding to write this article. I'm not going to link to that post so that people can get a cheap laugh, but the real tragedy is when people fail to learn from their mistakes.
  2. Examine the 'attacked' systems to understand how the attacks succeeded in compromising your security. Make every effort to find out where the attacks "came from", so that you understand what problems you have and need to address to make your system safe in the future.
  3. Examine the 'attacked' systems again, this time to understand where the attacks went, so that you understand what systems were compromised in the attack. Ensure you follow up any pointers that suggest compromised systems could become a springboard to attack your systems further.
  4. Ensure the "gateways" used in any and all attacks are fully understood, so that you may begin to close them properly. (e.g. if your systems were compromised by a SQL injection attack, then not only do you need to close the particular flawed line of code that they broke in by, you would want to audit all of your code to see if the same type of mistake was made elsewhere).
  5. Understand that attacks might succeed because of more than one flaw. Often, attacks succeed not through finding one major bug in a system but by stringing together several issues (sometimes minor and trivial by themselves) to compromise a system. For example, using SQL injection attacks to send commands to a database server, discovering the website/application you're attacking is running in the context of an administrative user and using the rights of that account as a stepping-stone to compromise other parts of a system. Or as hackers like to call it: "another day in the office taking advantage of common mistakes people make".
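To make the SQL injection example in items 4 and 5 concrete, here is a minimal sketch in Python using the standard-library `sqlite3` module. The table, the column names and the `find_user_unsafe` / `find_user_safe` helpers are hypothetical, invented purely for illustration; the only point is the difference between pasting user input into the SQL text and passing it as a bound parameter. When auditing your own code, the first pattern is the one to hunt for.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", 0), ("root", 1)],
    )
    return conn


def find_user_unsafe(conn, name):
    # VULNERABLE: user input is pasted straight into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def find_user_safe(conn, name):
    # Safer: the value is bound as a parameter and never parsed as SQL.
    query = "SELECT name FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


if __name__ == "__main__":
    db = make_demo_db()
    payload = "x' OR '1'='1"
    print("unsafe:", find_user_unsafe(db, payload))  # the injected OR clause matches every row
    print("safe:  ", find_user_safe(db, payload))    # treated as a literal name, matches nothing
```

Fixing the one query the attacker used is not enough; every query built by concatenation is a candidate for the same mistake.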

Why not just "repair" the exploit or rootkit you've detected and put the system back online?

In situations like this the problem is that you don't have control of that system any more. It's not your computer any more.

The only way to be certain that you've got control of the system is to rebuild the system. While there's a lot of value in finding and fixing the exploit used to break into the system, you can't be sure about what else has been done to the system once the intruders gained control (indeed, it's not unheard of for hackers that recruit systems into a botnet to patch the exploits they used themselves, to safeguard "their" new computer from other hackers, as well as installing their rootkit).

Make a plan for recovery and to bring your website back online and stick to it:

Nobody wants to be offline for longer than they have to be. That's a given. If this website is a revenue generating mechanism then the pressure to bring it back online quickly will be intense. Even if the only thing at stake is your / your company's reputation, this is still going to generate a lot of pressure to put things back up quickly.

However, don't give in to the temptation to go back online too quickly. Instead move as fast as possible to understand what caused the problem and to solve it before you go back online or else you will almost certainly fall victim to an intrusion once again, and remember, "to get hacked once can be classed as misfortune; to get hacked again straight afterward looks like carelessness" (with apologies to Oscar Wilde).

  1. I'm assuming you've understood all the issues that led to the successful intrusion in the first place before you even start this section. I don't want to overstate the case but if you haven't done that first then you really do need to. Sorry.
  2. Never pay blackmail / protection money. This is the sign of an easy mark and you don't want that phrase ever used to describe you.
  3. Don't be tempted to put the same server(s) back online without a full rebuild. It should be far quicker to build a new box or "nuke the server from orbit and do a clean install" on the old hardware than it would be to audit every single corner of the old system to make sure it is clean before putting it back online again. If you disagree with that then you probably don't know what it really means to ensure a system is fully cleaned, or your website deployment procedures are an unholy mess. You presumably have backups and test deployments of your site that you can just use to build the live site, and if you don't then being hacked is not your biggest problem.
  4. Be very careful about re-using data that was "live" on the system at the time of the hack. I won't say "never ever do it" because you'll just ignore me, but frankly I think you do need to consider the consequences of keeping data around when you know you cannot guarantee its integrity. Ideally, you should restore this from a backup made prior to the intrusion. If you cannot or will not do that, you should be very careful with that data because it's tainted. You should especially be aware of the consequences to others if this data belongs to customers or site visitors rather than directly to you.
  5. Monitor the system(s) carefully. You should resolve to do this as an ongoing process in the future (more below) but take extra pains to be vigilant during the period immediately following your site coming back online. The intruders will almost certainly be back, and if you can spot them trying to break in again you will certainly be able to see quickly if you really have closed all the holes they used before plus any they made for themselves, and you might gather useful information you can pass on to your local law enforcement.
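As one small illustration of the extra vigilance in step 5, the sketch below counts failed SSH password attempts per source address. It assumes the log lines follow OpenSSH's usual "Failed password for ... from ADDRESS port N" wording; the sample lines, the host name and the threshold of 3 are made up for the example, and a real deployment would more likely rely on an existing log monitoring tool than on a script like this.

```python
import re
from collections import Counter

# Assumed message shape: "Failed password for [invalid user] NAME from ADDRESS port N"
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?\S+ from (\S+) port \d+"
)


def failed_logins_by_source(lines):
    """Count failed SSH password attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def noisy_sources(lines, threshold=3):
    """Return sources with at least `threshold` failures (threshold is arbitrary)."""
    counts = failed_logins_by_source(lines)
    return sorted(source for source, n in counts.items() if n >= threshold)


if __name__ == "__main__":
    # Hypothetical sample lines using documentation-only addresses.
    sample = [
        "Jul 19 18:24:01 web1 sshd[911]: Failed password for invalid user admin from 203.0.113.7 port 50522 ssh2",
        "Jul 19 18:24:03 web1 sshd[911]: Failed password for root from 203.0.113.7 port 50524 ssh2",
        "Jul 19 18:24:05 web1 sshd[911]: Failed password for root from 203.0.113.7 port 50526 ssh2",
        "Jul 19 18:30:10 web1 sshd[915]: Failed password for alice from 198.51.100.9 port 40100 ssh2",
        "Jul 19 18:31:00 web1 sshd[917]: Accepted password for alice from 198.51.100.9 port 40102 ssh2",
    ]
    print(noisy_sources(sample))
```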

Reducing the risk in the future.

The first thing you need to understand is that security is a process that you have to apply throughout the entire life-cycle of designing, deploying and maintaining an Internet-facing system, not something you can slap a few layers over your code afterwards like cheap paint. To be properly secure, a service and an application need to be designed from the start with this in mind as one of the major goals of the project. I realise that's boring and you've heard it all before and that I "just don't realise the pressure man" of getting your beta web2.0 (beta) service into beta status on the web, but the fact is that this keeps getting repeated because it was true the first time it was said and it hasn't yet become a lie.

You can't eliminate risk. You shouldn't even try to do that. What you should do however is to understand which security risks are important to you, and understand how to manage and reduce both the impact of the risk and the probability that the risk will occur.

What steps can you take to reduce the probability of an attack being successful?

For example:

  1. Was the flaw that allowed people to break into your site a known bug in vendor code, for which a patch was available? If so, do you need to re-think your approach to how you patch applications on your Internet-facing servers?
  2. Was the flaw that allowed people to break into your site an unknown bug in vendor code, for which a patch was not available? I most certainly do not advocate changing suppliers whenever something like this bites you because they all have their problems and you'll run out of platforms in a year at the most if you take this approach. However, if a system constantly lets you down then you should either migrate to something more robust or at the very least, re-architect your system so that vulnerable components stay wrapped up in cotton wool and as far away as possible from hostile eyes.
  3. Was the flaw a bug in code developed by you (or someone working for you)? If so, do you need to re-think your approach to how you approve code for deployment to your live site? Could the bug have been caught with an improved test system, or with changes to your coding "standard" (for example, while technology is not a panacea, you can reduce the probability of a successful SQL injection attack by using well-documented coding techniques).
  4. Was the flaw due to a problem with how the server or application software was deployed? If so, are you using automated procedures to build and deploy servers where possible? These are a great help in maintaining a consistent "baseline" state on all your servers, minimising the amount of custom work that has to be done on each one and hence hopefully minimising the opportunity for a mistake to be made. Same goes with code deployment - if you require something "special" to be done to deploy the latest version of your web app then try hard to automate it and ensure it always is done in a consistent manner.
  5. Could the intrusion have been caught earlier with better monitoring of your systems? Of course, 24-hour monitoring or an "on call" system for your staff might not be cost effective, but there are companies out there who can monitor your web facing services for you and alert you in the event of a problem. You might decide you can't afford this or don't need it and that's just fine... just take it into consideration.
  6. Use tools such as tripwire and nessus where appropriate - but don't just use them blindly because I said so. Take the time to learn how to use a few good security tools that are appropriate to your environment, keep these tools updated and use them on a regular basis.
  7. Consider hiring security experts to 'audit' your website security on a regular basis. Again, you might decide you can't afford this or don't need it and that's just fine... just take it into consideration.
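File integrity tools such as tripwire (item 6) rest on a simple idea: record a known-good state, then report anything that deviates from it. The sketch below illustrates that idea only; it is not how tripwire itself is implemented, and the `snapshot` and `diff_snapshots` names are hypothetical.

```python
import hashlib
import os


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each regular file under root (by relative path) to its SHA-256."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                result[os.path.relpath(full, root)] = sha256_of(full)
    return result


def diff_snapshots(baseline, current):
    """Report files added, removed or changed since the baseline."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(
            path
            for path in baseline.keys() & current.keys()
            if baseline[path] != current[path]
        ),
    }
```

A baseline is only useful if the intruder cannot rewrite it, so it should be stored somewhere other than the machine being checked.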

What steps can you take to reduce the consequences of a successful attack?

If you decide that the "risk" of the lower floor of your home flooding is high, but not high enough to warrant moving, you should at least move the irreplaceable family heirlooms upstairs. Right?

  1. Can you reduce the number of services directly exposed to the Internet? Can you maintain some kind of gap between your internal services and your Internet-facing services? This ensures that even if your external systems are compromised the chances of using this as a springboard to attack your internal systems are limited.
  2. Are you storing information you don't need to store? Are you storing such information "online" when it could be archived somewhere else? There are two points to this part; the obvious one is that people cannot steal information from you that you don't have, and the second point is that the less you store, the less you need to maintain and code for, and so there are fewer chances for bugs to slip into your code or systems design.
  3. Are you using "least access" principles for your web app? If users only need to read from a database, then make sure the account the web app uses to service this only has read access; don't allow it write access and certainly not system-level access.
  4. If you're not very experienced at something and it is not central to your business, consider outsourcing it. In other words, if you run a small website talking about writing desktop application code and decide to start selling small desktop applications from the site then consider "outsourcing" your credit card order system to someone like Paypal.
  5. If at all possible, make practicing recovery from compromised systems part of your Disaster Recovery plan. This is arguably just another "disaster scenario" that you could encounter, simply one with its own set of problems and issues that are distinct from the usual 'server room caught fire'/'was invaded by giant server eating furbies' kind of thing.
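The "least access" point in item 3 can be shown in miniature with SQLite, which supports opening a database read-only through a `mode=ro` URI. This is only a sketch: the `products` table and both helper functions are hypothetical, and on a server database the equivalent would be a database account that has been granted read access and nothing else.

```python
import os
import sqlite3
import tempfile


def create_catalogue(path):
    """Set up a small demo database (hypothetical schema) with full access."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE products (name TEXT)")
    conn.execute("INSERT INTO products VALUES ('widget')")
    conn.commit()
    conn.close()


def open_read_only(path):
    """Open the database the way a read-only web front end might."""
    # mode=ro asks SQLite for a read-only connection.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "catalogue.db")
        create_catalogue(db_path)
        conn = open_read_only(db_path)
        print(conn.execute("SELECT name FROM products").fetchall())
        try:
            conn.execute("INSERT INTO products VALUES ('backdoor')")
        except sqlite3.OperationalError as exc:
            print("write refused:", exc)
        finally:
            conn.close()
```

If the web app is later compromised through this connection, the attacker can read what the app could read but cannot use it to alter the data.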

... And finally

I've probably left out no end of stuff that others consider important, but the steps above should at least help you start sorting things out if you are unlucky enough to fall victim to hackers.

Above all: Don't panic. Think before you act. Act firmly once you've made a decision, and leave a comment below if you have something to add to my list of steps.

Lucas Kauffman
  • 12
    I hope this is not understood the wrong way, but Rob has an account here, it would've made much more sense to ask _him_ to post his answer here. You know, being a canonical answer for all of our future "My server has been haxored" questions. – Adi Jul 19 '13 at 18:24
  • If RobM wants he can copy paste it and I'll remove mine :) – Lucas Kauffman Jul 19 '13 at 18:34
13

Making a file "undeletable" in Linux is done with attributes, specifically the "immutable" attribute. See lsattr to see attributes, chattr to change them.
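As a rough illustration, the sketch below picks out files carrying the immutable flag from text produced by lsattr. It assumes each entry has the form "attribute flags, whitespace, path" with the letter `i` marking immutable; the sample paths are invented and `immutable_paths` is a hypothetical helper. Once such a file is identified, `chattr -i FILE` (typically run as root) clears the flag so the file can be deleted.

```python
import string

# Attribute fields are made of letters and dashes, e.g. "----i---------e-------".
FLAG_CHARS = set(string.ascii_letters + "-")


def immutable_paths(lsattr_output):
    """Return paths whose attribute field contains 'i' (immutable)."""
    found = []
    for line in lsattr_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank lines and bare directory headers from lsattr -R
        flags, path = parts
        if not set(flags) <= FLAG_CHARS:
            continue  # not an attribute field (e.g. a directory header with spaces)
        if "i" in flags:
            found.append(path)
    return found


if __name__ == "__main__":
    # Hypothetical sample output; real paths and flag widths will differ.
    sample = (
        "----i---------e------- /tmp/.x/payload\n"
        "--------------e------- /tmp/notes.txt\n"
    )
    print(immutable_paths(sample))
```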

However, this addresses only the proximal cause. The important thing is that your machine was put under hostile control, and the hijacker installed things for his own devious goals. In particular, he most probably installed a rootkit in order to keep the entry open despite cleansing attempts like the one you are making. Rootkits may have altered the kernel and/or the system binaries in ways that will not be visible from the machine itself, and which will prevent their own removal. The bottom line is that your machine cannot be saved; there is no way you can reliably make your machine clean again, save by reformatting the disk and reinstalling from scratch.

Save yourself from future worries and headaches; nuke your system from orbit.

Tom Leek
  • I had rkhunter installed on one of the machines, but it did not raise any alarms either before or after my delete operations. Maybe it's just a useless tool. Also, I don't see anything suspicious running when I reboot the machines and look at the output of top and ps. I will look into the attribute thing now. – xkcd Apr 07 '14 at 17:58
  • Thanks Tom, the *chattr* command worked and I was able to make the files deletable, and then deleted them :). Will follow the nuke advice shortly anyway. Thanks. – xkcd Apr 07 '14 at 20:24
8

As I said in my reply to the cross-post from ServerFault, it is a good explanation. It also certainly depends on the type of attack. Hopefully (or unfortunately) this attack is noisy enough that you recognized it as an attack in progress. If you can be reasonably certain that you are at the early stages of an attack, I would say this order of operations is a good blueprint to follow.

But the possibility exists that the indicators of compromise you are aware of do not paint the entire picture of the infection, and disconnecting that PC may not be in your best interests until you understand the extent. I would argue that it could be better to find the points of entry, and which systems you may no longer control, before you start removing any affected systems/devices from the network.

The truth of the matter may be that those actors have been in your systems for longer than you thought and showing your hand too early (i.e. I have finally noticed you sitting in my systems) could make it much more difficult to eradicate.

There is no real easy answer, but the one provided by RobM is a more than adequate starting point. With all the pressures, there are multiple right answers that could just as well be wrong answers. It is almost as if the uncertainty principle applies: you don't know exactly whether an answer will be right until you try it.

The (PDF) NIST Computer Security Incident Handling Guide should also be reviewed.

M15K
  • 1
    *disconnecting that PC may not be in your best interests until you understand the extent,* I'd agree with this actually, though my reply was written for people who, if they have to ask for the sort of help my answer offered, probably aren't equipped to make this decision themselves, because as you know, the choice you're suggesting has to balance what the victim knows about the current situation vs. an unknown future. – Rob Moir Sep 06 '13 at 07:29
  • Yes, if they are making a decision based on certain risk factors or criteria, I am all for that. I would just hope to emphasize that panicking and hastily unplugging may do more harm than good. – M15K Sep 06 '13 at 14:14
  • 1
    that url points to nothing, can we get that updated please! – l4t3nc1 Jul 17 '17 at 14:37
  • Just edited the URL. Sorry didn't see this request before. – M15K Aug 04 '17 at 15:04
2

Back it all up - so you can conduct thorough forensics in a sandbox-type environment.

Then - start from 0 - yes from NOTHING.

New O/S - fully patched. Applications - current and up to date. Data files - from your backups (from before you were compromised, if possible). If that is not possible, you need to scan the content and permissions of EVERYTHING.
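The answer does not spell out what the scan should check, so the following is only one plausible and partial reading of the permissions half: walk the restored data and flag regular files that are setuid, setgid or world-writable. The `risky_permissions` helper is hypothetical, and a clean result from it proves nothing about the content of the files.

```python
import os
import stat


def risky_permissions(root):
    """Yield (path, reason) for setuid, setgid or world-writable regular files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                info = os.lstat(full)  # lstat: do not follow symlinks
            except OSError:
                continue  # vanished or unreadable; skip rather than abort the scan
            if not stat.S_ISREG(info.st_mode):
                continue
            if info.st_mode & stat.S_ISUID:
                yield full, "setuid"
            if info.st_mode & stat.S_ISGID:
                yield full, "setgid"
            if info.st_mode & stat.S_IWOTH:
                yield full, "world-writable"


if __name__ == "__main__":
    # Scans the current directory as a harmless default.
    for path, reason in risky_permissions("."):
        print(f"{reason}: {path}")
```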

Then conduct a thorough review of how the hackers got in, and ensure it does not happen again.

When you find out what has happened - take steps to ensure it cannot occur again, and publish (internally and/or externally) what you have found (you may choose to omit the countermeasures).

Unless you learn from this - you will suffer the same fate again.

Tim Seed
  • 2
    Wouldn't it be better to image the drives, rather than backing up select files? Or, by "backing it all up," do you mean "image the drives"? – Mark Buffalo Mar 09 '16 at 16:01
  • 1
    And then you need expertise in performing a forensic investigation (not a typical skillset for a server operator). "Ensure it does not happen again" is a very vague suggestion. This answer is a little light on actionable advice ... – schroeder Mar 09 '16 at 17:15
  • Imaging the drives would be the best option - sorry I did not explicitly say this. Forensics is, I agree, a specialist skill - but if you do not take the time to find out how the hackers got in - and simply restore the system from yesterday's backup - guess what? They will be back - as you did not plug the gap. I was deliberately vague - as the reasons for the breach will vary so much - un-patched code, passwords, social eng, brute force, bad security model - the list is endless. "The taste of defeat has a richness of experience all its own." Bill Brady – Tim Seed Mar 10 '16 at 04:17
  • 2
    I don't understand why this has so many downvotes. This IS the best course of action. Just removing the malware does nothing. The malware wasn't there before and they still got in. Also expire all passwords on the network immediately, and monitor closely which users are changing them. – Drunken Code Monkey Sep 17 '16 at 16:56
  • @DrunkenCodeMonkey thank you !! Having had to deal with 3 servers that have been hacked in a 30 year IT career, the advice I gave was based on experience not some 10-minute "Anti Hazker" flash/powerpoint demo. Once a hacker is in - you have no idea at all what they have done to your system - added new ssh port, new accounts, new shares, changed some subtle apache rules. Try finding something like that!!! Let them carry on with the possibly compromised server.... It will be their job not mine. – Tim Seed Sep 19 '16 at 14:50
  • Upvoting to offset downvotes - this might not explain all steps fully or be easy for a layperson to follow, but it's succinct and basically correct. – Stilez May 15 '18 at 09:14
1

The methodology depends on the exact real-world scenario, but the following is one possible approach.

What are my first steps? When I arrive on site should I disconnect the server, preserve "evidence", are there other initial considerations?

Ans: Of course, you must disconnect the server from the network, but you should not power it down or shut it down: you may need to perform forensics to understand the situation and the impact of the incident, and you must preserve evidence (data held in memory may be lost if you shut down the server).

Crisis Management and Communication – If you have a crisis management and DR/BCP policy, follow the procedures it lays out. Again, this depends on the scenario; in this case it could be a virus propagating, so you may have to follow the process defined for that.

How do I go about getting services back online?

Ans: As stated above, follow your crisis management/DR/BCP instructions. These may vary depending on the situation.

For example, if the virus was a time bomb or was triggered by some action on the server, or if this is a ransomware attack, it’s better not to bring your DR site up immediately (doing so might trigger another spread of the malware from your DR server). The best approach is to assess the impact of the incident on your network and then take the actions required to bring your services back.

How do I prevent the same thing from happening immediately again?

Ans: Determine the root cause of the incident as soon as possible to prevent the same incident from happening again. As indicated in the answer above, in a ransomware scenario you may have to ensure the malware has not propagated to your DR/backup systems.

If the incident happened because of a vulnerability in your network (for example, a firewall port was opened by mistake and the attack came through that port), take immediate action to close the known/identified vulnerability so that the incident cannot happen again.

Are there best practices or methodologies for learning from this incident?

Ans: Yes. Every incident can be unique in nature, so update your crisis management/DR/BCP procedures to reflect what you learn from each one.

It is always recommended to have proactive monitoring and incident identification in place to avoid such incidents or detect them early. For example, you may deploy a SOC (Security Operations Center) or SIEM (Security Information and Event Management) tools.

If I wanted to put an Incident Response Plan together, where would I start? Should this be part of my Disaster Recovery or Business Continuity Planning?

Ans: This should be part of your BCP/DR plan; it is usually covered under crisis management (part of the BCP).

Sayan
0

Here's a very important note.

Not only do you have to wipe your disks clean whenever you start anew, you also have to make sure that the various ROMs the attacker could have modified are in a pristine factory state. On real hardware, that means at the very least reflashing the UEFI firmware and probably resetting its settings. This is not an issue for virtualized environments.

HDDs, SSDs, RAID controllers, NICs and GPUs also contain ROMs, but these are very unlikely to have been modified/compromised unless you are targeted by three-letter agencies.

Artem S. Tashkinov