13

I have an unusual problem I've been trying to diagnose for a while:

  • It's about a Debian server running a custom compile of Apache 2.2 with PHP, Red5, MySQL 5.5 (standard binary), sendmail (distro version), and CrashPlan.

  • Every other day I see a large number of HTTP requests to random files, mostly images - we're talking upwards of a thousand concurrent connections.

  • These requests come from the server's own IP address (!).

  • It's usually a limited set of files that's requested over and over again. I see no real pattern, but it doesn't look like someone scraping information, it looks like a DoS attempt.

  • Cron runs a script that temporarily bans IPs with more than 200 connections, so this is usually curbed before it can get really problematic. After one to three 10-minute bans, the attack usually stops.

  • This has been going on for months. Since the attacks are caught and curbed, I completely fail to see the point.

  • It's happening at random times and intervals, but usually around morning times UTC.

  • No referer or agent is being sent with these requests.

I've checked the web server and Red5 logs for related requests around the same time, in case a script on the server is being abused to send queries to itself, but couldn't find anything. There's nothing in the Apache error logs or syslog around that time. Rkhunter didn't find anything out of the ordinary either. The server doesn't source route packets, so spoofing shouldn't be an option either.

I'm at a complete loss as to the method and the reason. Any ideas what to check would be greatly appreciated. :)

UPDATE: Following LSerni's advice, I've prepared a mechanism to catch some information on the next occurrence. This is (a slightly generalized version of) the method: http://pastebin.com/6uSUKVbh

ANSWER: This is a social media site allowing MySpace-type profiles using FCKeditor. Since that is a bit of a security nightmare, the profiles posted by users undergo extensive checks, one of which probes the links and images posted. First, I did not exclude the site's own domain from these checks, and second, due to a bug related to redirects, every link or image was hit 10 times instead of once. So when a user with a profile containing extensive linkage to the site itself hit the save button, the site would DoS itself. :P In particular this concerned one user who has a bazillion items in her profile and tends to save often.

Thanks to LSerni for the right idea on how to diagnose this issue!

ANSWER EDIT: I was wrong about the bug. She actually does have some images 10 times or more inside the profile. More specifically, there are nearly 1000 links and images to be checked on each save. I didn't see that coming. :P
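For illustration only, the two safeguards that were missing here (skipping the site's own domain and probing each distinct URL once per save) could be sketched in shell roughly like this. The function name `urls_to_probe` and the example host are hypothetical, and the real checker is presumably part of the site's own code, which isn't shown in this thread. The host comparison is deliberately naive: it does not treat `www.example.com` and `example.com` as the same site.

```shell
#!/bin/sh
# Hypothetical sketch: read one URL per line on stdin and print only the
# URLs worth probing: links to OWN_HOST are skipped, duplicates are dropped.
urls_to_probe() {
    own_host="$1"
    awk -v host="$own_host" '
        {
            h = $0
            sub(/^[a-zA-Z]+:\/\//, "", h)   # strip the scheme (http://, https://)
            sub(/[\/?#].*$/, "", h)         # strip path, query and fragment
            sub(/:[0-9]+$/, "", h)          # strip an explicit port
            if (tolower(h) == tolower(host)) next   # link to the site itself
            if (!seen[$0]++) print                  # emit each URL only once
        }'
}

# Example (the input file name is a placeholder):
#   urls_to_probe example.com < profile_links.txt
```

With around 1000 links per save, filtering like this before any request goes out keeps the check from turning into a flood against the server itself.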

Mantriur
  • I am highly intrigued by this... – Lex Feb 12 '13 at 18:37
  • Wait, so you ban IPs with a connection limit, but the attack comes from the server's own IP? So what IP are you banning? The fact that the attacker is spoofing the server's IP isn't that mind-boggling. Probably an attempt to bypass dumb firewalls. Since it's a DoS, they don't care if they get a response. – Cory J Feb 12 '13 at 18:54
  • Yeah, I'm banning the server's own IP address. :) The important communication happens through localhost or sockets. It's no biggie if the server can't talk to its own interface. (Banning in this case means it doesn't accept incoming packets from its own IP.) If the address was spoofed and the other end didn't receive a response, there would be no TCP handshake, much less a subsequent HTTP request. – Mantriur Feb 12 '13 at 19:04
  • @Mantriur You are right, good point. Couldn't you use a firewall to block anything on the INPUT chain/interface with that source IP permanently? That would at least rule out an external source. – Cory J Feb 12 '13 at 19:33
  • Unfortunately it's a rented server without any firewall, so I can't do that. :( I would assume though that the ISP's routers would block that, but I will inquire. – Mantriur Feb 12 '13 at 19:50
  • But you *can* ban that IP via `.htaccess` (or `Deny From` in `httpd.conf`), can't you? – LSerni Feb 12 '13 at 22:13
  • Sure, but what's the difference if I use htaccess or iptables? – Mantriur Feb 12 '13 at 23:05
  • To elaborate on that last sloppy comment: I'm looking for both the cause and the intention. I'm not looking to cure the symptom, because that means there is still something on my box I don't understand. That would hurt my ego even more than having to ask for help here. :-D Well that, and it would still pose a security risk. – Mantriur Feb 13 '13 at 00:08

5 Answers

5

First thing: find out where these requests come from. It has to be a local process; nothing else is likely to be able to spoof a TCP handshake on a modern Linux platform (nothing, that is, that would then proceed to waste such a feat on requesting random images).

If there are recurrent URLs, you can shadow them behind a RewriteRule so that any such request will actually trigger a script. In the script you can run additional checks to see whether the request is legit (and you will then output the proper headers just as if it was the image the legit client expects), or if it is one of the bogus requests. Against the bogus request you can log e.g. the incoming port. Armed with that, you can query netstat and find out the process. You can also run ps and inspect all active processes in the instant of the bogus request.
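The port-to-process lookup could be sketched like this. It assumes the Linux net-tools `netstat -nap` column layout (local address in column 4, PID/program in column 7) and enough privileges to see other users' processes. The function name `owner_of_local_port` is made up for illustration; the client port would come from the request itself, for example the `REMOTE_PORT` variable that Apache passes to scripts.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -nap`-style output on stdin and print
# the PID/program column of the TCP socket whose LOCAL address uses the
# given port, i.e. the client end of the suspicious connection.
owner_of_local_port() {
    port="$1"
    awk -v port="$port" '
        $1 ~ /^tcp/ {
            n = split($4, parts, ":")        # $4 = local address, e.g. 192.168.0.2:59875
            if (parts[n] == port) print $7   # $7 = PID/Program name
        }'
}

# Example, run at the moment of the bogus request:
#   netstat -nap 2>/dev/null | owner_of_local_port "$REMOTE_PORT"
```

The lookup has to happen while the connection is still open, so it is best triggered from the shadowing script itself rather than run by hand afterwards.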

I am quite sure that the culprit will prove to be Apache itself (I once had a "cache priming" script go rogue on me due to a vhost modification - I had forgotten that I had put the script in crontab - and got really weird symptoms, somewhat like yours, until it all came back to me; but your case feels different).

To further refine the scene while containing costs, you can add PID/TID to Apache's CustomLog. Then you will be able to cross-check the requests received from the Apache child gone rogue.
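Once the PID is in the log, the cross-check can be a simple filter. This is only a sketch under an assumption: the access log format ends with the child PID (Apache's `%P` appended as the last field). If your format puts it elsewhere, the field reference needs adjusting. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the access-log lines handled by one Apache
# child. Assumes the LAST field of each log line is the child PID (%P).
requests_of_child() {
    pid="$1"
    awk -v pid="$pid" '$NF == pid'
}

# Example (log path is a placeholder; 5281 is whatever PID netstat reported):
#   requests_of_child 5281 < /path/to/access_log | tail
```

If the suspicious client turns out to be an Apache child, the last "normal" request that child logged before the flood is the one to look at.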

Another possibility is to determine exactly how these requests are made. If through Apache, this means fopen_wrappers, cURL, socket functions, or maybe shell utilities (these should both appear in ps output and result in a much more massive server overload, though). You can prepare a series of scripts that will:

  • gracefully restart Apache without any changes
  • gracefully restart Apache with one of those functions temporarily disabled
  • gracefully restart Apache with that function re-enabled

After verifying (just to be sure) that a restart does not fix the problem (if it did, it would be quite a different can of worms), you can proceed to temporarily disable - a couple dozen seconds each, no more - one function after another. Suppose that disabling cURL results in the system's immediate return to normal: then you could restrict investigations to scripts using cURL, and maybe even wrap the cURL function with a logging wrapper.

In case the guilty party turns out not to be Apache, you will still be able to determine what is doing this; then either reinstall the affected program (even if I find it unlikely for any random anomaly to turn a program into a repeat-HTTP-GET-requestor) or inspect its configuration, ancillary data files, scripts, and so on and so forth, looking for any difference from a known clean installation. Since I don't usually believe in gremlins, I expect some difference to stand out in the end.

LSerni
  • Great advice, thanks! There's a few images that are almost always part of the attack. I'll replace one of those with a script and see what I can find. :-) I agree that anything beyond a local process seems way too sophisticated for such a non-rewarding goal. AJ Henderson has a point when he says it might just be someone who uses a tool he doesn't understand, but there's only so much a script kiddie tool can do. I'm not really up to speed, but TCP sequence prediction used to be higher magic. :P Will report back! :-) – Mantriur Feb 12 '13 at 23:18
  • I'm using FilesMatch to give a specific png file the php type. Temporarily gave netstat an suid root to allow the webserver user to also log process IDs for all connections. That, plus the logged client port should identify the culprit. Assuming that it will be an apache process, I also activated mod_status and am appending that output as well. Anxiously waiting for the next occurrence. :-D – Mantriur Feb 13 '13 at 14:43
  • So... how did it go? Any news? I admit to being *really* curious. – LSerni Feb 14 '13 at 20:19
  • I've updated the question with the solution. Unfortunately it's not spectacular at all, just a case of a complex system plus Alzheimer's. :/ Thanks for your help! :-) – Mantriur Feb 14 '13 at 21:37
  • Heh. Not *so* different from what happened to me, except that I did it all by myself. I'm the one who ought to have plead Alzheimer's :-) – LSerni Feb 15 '13 at 11:06
3

Unix (and Linux) has a wealth of tools for analysing stuff like this. My first stop would be to grab the output of netstat -nap e.g. on my local machine...

Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
...
tcp        0      0 192.168.0.2:80              192.168.0.2:59875           ESTABLISHED 5281/httpd
...
tcp        0      0 192.168.0.2:59875           192.168.0.2:80              ESTABLISHED 32588/chrome
...

Here I can see that chrome (pid 32588) is connected to port 80 / httpd (pid 5281). Since this is a pre-fork installation of Apache, I can get more information about the httpd process by logging %P or by looking in /proc/5281/fd (the latter will probably require scripting to grab the data at the time of the request).

This will allow you to identify the client process.
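To get an overview rather than a single connection, the same output can be aggregated. The sketch below counts which local programs hold connections to the server's own address on port 80. It assumes the `netstat -nap` layout shown above (foreign address in column 5, PID/program in column 7); the function name `self_clients` is made up.

```shell
#!/bin/sh
# Hypothetical helper: read `netstat -nap`-style output on stdin and count,
# per PID/program, the TCP connections whose FOREIGN address is IP:80,
# i.e. local processes acting as HTTP clients of this very server.
self_clients() {
    ip="$1"
    awk -v target="$ip:80" '
        $1 ~ /^tcp/ && $5 == target { count[$7]++ }
        END { for (p in count) print count[p], p }' | sort -rn
}

# Example, run as root during an incident:
#   netstat -nap 2>/dev/null | self_clients 192.168.0.2
```

During a flood of a thousand connections, whatever is responsible should stand out at the top of that list.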

The most likely candidates are a badly configured proxy or buggy code.

symcbean
2

If this were my server, I'd be running an strace on Apache. Running this on every child process in prefork mode can be quite disk intensive, especially when your server is already being overloaded. You do have to keep an eye on your disk space as well, because if it runs out, Apache stops serving requests.

Make sure you use a snaplength long enough to capture the entire request: -s 400 should do.

If Apache is making requests to itself, any GET string will appear in the strace dumps for two different PIDs: one that made the request and one that received it. In the one that made the request, you want to find the request that it received and was processing when it made the request to itself.

I normally do something like this:

for x in $(ps -ef | grep '[a]pache' | awk '{print $2}'); do strace -s 2000 -p "$x" -o "trace.$x" & done

If you want to limit yourself to a subset of Apache children for performance reasons, add a head in there:

for x in $(ps -ef | grep '[a]pache' | head | awk '{print $2}'); do strace -s 2000 -p "$x" -o "trace.$x" & done

But be aware that this makes it less likely for you to capture what's happening.

Make sure you have two SSH sessions open as all those backgrounded tasks can still write to your session. When you want to stop stracing, either restart Apache or run this in the other one:

 for x in $(ps -ef | grep '[s]trace' | awk '{print $2}'); do kill "$x"; done
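To find the two PIDs afterwards, something along these lines might work. It assumes the trace files are named trace.<pid> as in the loops above and all sit in one directory; the helper name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the PIDs whose trace.<pid> file contains the
# given request string (fixed-string match, not a regex).
pids_seeing_request() {
    request="$1"        # e.g. 'GET /images/foo.png'
    dir="${2:-.}"       # directory holding the trace.<pid> files
    grep -lF -- "$request" "$dir"/trace.* 2>/dev/null |
        sed 's/.*trace\.//' |
        sort -n
}

# Example:
#   pids_seeing_request 'GET /images/foo.png'
```

If the same GET string shows up for two PIDs, the one where it appears in a write or send call is likely the sender. That trace is where to scroll back and look for the request it was serving at the time.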

My gut feeling on this one is a "static" module written in PHP that pre-processes images (resizing them for instance) before sending them to the client and it does this with include($image). If $image happens to contain an image URL from your own site rather than a file path from the local filesystem, recursive requests are the result.

It could be using the curl() functions rather than include().

Ladadadada
  • That's a pretty busy server. I don't think I have the luxury of tracing every apache child. :( The reason I was ruling out a vulnerable/buggy PHP script is that I would expect to see a request to another script around the same time, most notably always the same script, but there's nothing out of the ordinary in the logs. :( – Mantriur Feb 12 '13 at 20:00
  • But yeah, some sort of include, curl, file_get_contents was my first thought, too. But all that's running on that machine is my own code and I always work with the file system directly ... There is one script that acts as a proxy (intentionally) that could cause this, but there are no calls to it in the logs at that time. – Mantriur Feb 12 '13 at 20:02
  • My second thought was some apache misconfiguration (I just recently switched to the 2.x line, I'm a 1.3 guy :), maybe allowing some sort of proxying that isn't logged or is logged in odd places. But none of the proxy modules is being loaded. :( – Mantriur Feb 12 '13 at 20:23
  • Guessing ends up being counter-productive if you do it for long enough. Run the strace (or at least a limited strace) and see what one of these requests is actually doing. If you have suspicions about some of your PHP code, you could always stick some "live debugging" code inline and have it write out to a log file instead of running strace. – Ladadadada Feb 12 '13 at 20:58
  • If another script causes this, wouldn't that script be called at the same time these requests happen and show up in the access log? We're talking about 100 log entries per second with very few other and completely random requests happening at the same time. – Mantriur Feb 12 '13 at 21:02
  • If you're using `mod_rewrite`, the request and the log entry can both look like `GET /images/foo.jpg HTTP/1.1` but the script that is processed sees the URL as `/image.php?image=foo.jpg` because it has been rewritten. If it saw `/image.php?image=http://example.com/images/foo.jpg` it would still try to `include()` that URL instead of a file. – Ladadadada Feb 12 '13 at 21:07
  • I'll need to add that trace to the script that blocks the address when it happens. Do you think it's actually realistic to run this with several thousand busy apache children? :( – Mantriur Feb 12 '13 at 21:07
  • I don't think rewrite is involved there. They hit both PHP files and images which are physically present and not being rewritten. I haven't confirmed this, but it is likely that the image urls are taken from the php script that's requested. They fit the folders .... – Mantriur Feb 12 '13 at 21:12
  • Wait, even if this was mod rewrite, there'd still be two requests, no? One from the original client, one causing the request from the server IP. – Mantriur Feb 12 '13 at 21:23
1

It sounds like a typical DoS attack. They are probably hoping to get the server to respond to a request from itself and hoping to get a loop like the "ping of death". It's also a convenient way to spoof to get around some firewall rules and cause general headaches. Blocking the external IP at the firewall is probably the best bet so that they can't get the requests in the door.

AJ Henderson
  • How do you generate a valid HTTP request to a machine using its own IP address without source routing? – Mantriur Feb 12 '13 at 20:05
  • You form a packet that says it is from the machine. The Internet trusts you to not do "bad stuff". Since it relays packets freely, you can simply say that you are passing along a packet from whoever you feel like and it will trust that you actually got the packet from there. It doesn't matter if you just created a new packet with the IP of your victim. This is how a lot of DoS amplification attacks work as well, since any service that takes a small request and produces a large response can have the response sent to the victim's connection. – AJ Henderson Feb 12 '13 at 20:07
  • Yes, but to finish the TCP handshake and the HTTP request, the other end actually needs to receive the responses. But the initial SYN would already be routed to the server's own interface (no source routing). I'm not really up to speed with modern methods there, but to my knowledge without source routing there is no way to receive a response to a packet with a forged sender address. – Mantriur Feb 12 '13 at 20:12
  • @Mantriur - my understanding of it is that the attack doesn't really care about getting the response, but rather the attempt to respond is what they are looking for to occur. TCP makes it a bit harder since they won't get the response, but if the sequence number can be guessed, they can simply send a response to the handshake without needing the response. (At least that is my understanding. I could be wrong on that as I've only done brief reading on the details of the subject.) – AJ Henderson Feb 12 '13 at 20:45
  • Sequence prediction would be a possibility. But it seems weird that someone would go through that trouble just to spoof the server's own address and then stop when that address gets blocked. Feels too professional for someone who is then stopped so easily. :) – Mantriur Feb 12 '13 at 21:06
  • @Mantriur - Fair point, but it could also just be someone using a tool they don't understand. Hard to tell. The better question is what else makes sense for the situation? – AJ Henderson Feb 12 '13 at 21:29
  • Good point, but that's really a lot for someone who just happened to find a tool on the internet to do his bidding. Now that server runs a social media site that has its share of problems with weirdos and the ban mechanism is quite sophisticated, so I'm sure there's no lack of people with more motivation than knowledge. I'm going to implement LSerni's idea and report back. :-) – Mantriur Feb 12 '13 at 23:22
0

It sounds like a DDoS attack which is spoofing the IP of the server. The best action would be to put a packet filter on the external router rather than using firewall rules, as using the router will reduce load on the firewall. On a Cisco router the simple solution would be to write an access list with the source being your public block and the destination being any, then apply it inbound on the external interface with "ip access-group ... in".

Rate limiting ICMP would be a good idea too, as they may try to ping flood you.

You should give some thought to rate limiting valid traffic as well. The next DDoS attack won't be spoofing your own IPs; it will use valid IP addresses that you cannot filter without filtering out your customers. A well-chosen rate limit will keep your server from running out of resources.

GdD
  • Can someone really spoof a TCP connection by using *the victim's source IP*? The server would try to ACK itself and promptly RST the connection as bogus. – LSerni Feb 12 '13 at 22:10