There are a couple of things I want to establish before I give you general advice for spotting common characteristics of cross-site scripting (XSS) probing attempts in your logs:
- I am assuming you will be manually inspecting your logs;
- Without loss of generality, let's also assume you are able to keep up with the number of entries in your logs, i.e. there are not thousands of requests hitting your server at any given moment.
These points should make it easier for me to describe the process by which you can spot XSS payloads in your logs.
The first issue I would like to tackle is the attempt to reduce the common characteristics down to a small handful of payloads. XSS is such an immensely large group of attack vectors and scenarios that even renowned security researchers often struggle to agree on what term is appropriate to describe a particular type of XSS vulnerability. This can actually be seen in your question: you are trying to reduce XSS down to three specific payloads (<script>, <img>, and <iframe>). If you pick any of the many XSS payload lists that you can find out there, you will quickly notice that there is a vast number of payloads (e.g., this payload.txt list).
On top of that, you only listed two categories of XSS vulnerabilities:
"There are two types of XSS i.e. reflected and stored."
There are way more than just two types. For example, self-XSS and DOM-based XSS are not included in your list. These two missing types will be a particularly interesting challenge and I will go into detail later on as to why this is the case.
Types of payloads and contexts
So now that we have established that it is very difficult to reduce XSS vectors down to a small list, let's take a look at some common contexts where XSS vulnerabilities arise. The goal here is to demonstrate how from a very small list of contexts one can construct a variety of payloads with very little to no common characteristics.
The most notable contexts are the HTML context and the JavaScript context (once again, I am simplifying things here; there are many more contexts). Assuming there is no firewall or filter interfering with the payload, these contexts will usually require certain characters in the payload to break out of the context or run client-side code within it. To better explain some of these cases, I will refer to @Filedescriptor's XSS polyglot challenge list which can be found here. Do not worry, I won't go through every single one; I plan to just pick out a couple to demonstrate how one constructs XSS payloads based on the context.
<div class="{{payload}}"></div>
<div class='{{payload}}'></div>
<title>{{payload}}</title>
<textarea>{{payload}}</textarea>
<style>{{payload}}</style>
<noscript>{{payload}}</noscript>
<noembed>{{payload}}</noembed>
<template>{{payload}}</template>
<frameset>{{payload}}</frameset>
<select><option>{{payload}}</option></select>
<script type="text/template">{{payload}}</script>
<!--{{payload}}-->
<iframe src="{{payload}}"></iframe> (" → &quot;)
<iframe srcdoc="{{payload}}"></iframe> (" → &quot;, < → &lt;)
<script>"{{payload}}"</script> (</script → <\/script)
<script>'{{payload}}'</script> (</script → <\/script)
<script>`{{payload}}`</script> (</script → <\/script)
<script>//{{payload}}</script> (</script → <\/script)
<script>/*{{payload}}*/</script> (</script → <\/script)
<script>"{{payload}}"</script> (</script → <\/script, " → \")
My resulting payload which covers all the contexts above was:
javascript:"/*\"/*`/*' /*</template></textarea></noembed></noscript></title></style></script>--><svg onload=/*<html/*/onmouseover=alert()//>
This payload is known as a polyglot, i.e. a payload that covers multiple contexts at once. To break out of the first context (<div class="{{payload}}"></div>), I had to use double quotes; the payload "><svg onload=alert(1)> alone would have worked in this context.
Next, let's pick the second-to-last case: <script>/*{{payload}}*/</script>. To keep things simple, we will ignore the filter that was implemented in the challenge. This is a JavaScript context, and */alert(1)/* would break out of the multi-line comment.
For the last case that we will look at, I would like to include the filter. <iframe srcdoc="{{payload}}"></iframe> (" → &quot;, < → &lt;) replaces double quotes and the < character. To bypass this filter, one can simply HTML encode the < character: &lt;img/src=x onerror=alert(1)>. This results in <iframe srcdoc="&lt;img/src=x onerror=alert(1)>"></iframe>. The browser decodes the entity when it reads the srcdoc attribute, so the <img> tag is still parsed inside the frame. This case did not require breaking out of the context.
Notice how with just three contexts alone I was able to show three very different payloads. So before attempting to use your logs to find XSS payloads, make sure to familiarise yourself with a few common contexts.
+---------------------------------------------------------------+----------------------------------+
| Context                                                       | Example payload                  |
+---------------------------------------------------------------+----------------------------------+
| <div class="{{payload}}"></div>                               | "><svg onload=alert(1)>          |
+---------------------------------------------------------------+----------------------------------+
| <script>/*{{payload}}*/</script>                              | */alert(1)/*                     |
+---------------------------------------------------------------+----------------------------------+
| <iframe srcdoc="{{payload}}"></iframe> (" → &quot;, < → &lt;) | &lt;img/src=x onerror=alert(1)>  |
+---------------------------------------------------------------+----------------------------------+
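To make the breakout idea a bit more concrete, here is a minimal JavaScript sketch that drops a payload into a context without any escaping, the way a vulnerable template would. The render helper and the breakoutCases array are names I made up for illustration, and I only use the two contexts that we looked at without a filter.

```javascript
// Illustrative helper (not part of the challenge): substitute the payload
// into the context template without any escaping.
function render(template, payload) {
  // A replacer function avoids special handling of "$" sequences in the payload.
  return template.replace('{{payload}}', () => payload);
}

// The two unfiltered contexts discussed above, with their payloads.
const breakoutCases = [
  { context: '<div class="{{payload}}"></div>', payload: '"><svg onload=alert(1)>' },
  { context: '<script>/*{{payload}}*/</script>', payload: '*/alert(1)/*' },
];

for (const { context, payload } of breakoutCases) {
  console.log(render(context, payload));
}
```

In the first result the double quote closes the class attribute and the > closes the tag, so the <svg> element ends up as markup. In the second, the leading */ closes the comment and the trailing /* reopens one, so alert(1) sits in executable position.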
Probing from an attacker's perspective
This section covers how I as an adversary (more precisely, my personal experience as a bug bounty hunter) might go about probing for XSS vulnerabilities in your application and what this would look like in your logs.
Most notable bug bounty hunters that I can think of use a very basic probing vector for manually determining whether user input is reflected anywhere. So we might use something along the lines of '">foobar or '"><u>foobar to quickly gather various endpoints that reflect these payloads. Note that the foobar bit is actually quite important. We want to be able to quickly search for our payload in the source code, so hunters like to use unique strings in their payload. From your perspective, this means we leave a trail of fingerprints that you can follow to see where we are testing for XSS vulnerabilities.
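If you want to follow that trail with a script rather than by eye, something along these lines could be a starting point. This is only a sketch: I am assuming access logs in Common Log Format, the sample lines and IP addresses are made up, and findMarkerProbes, probeTarget and safeDecode are hypothetical helper names.

```javascript
// Made-up sample lines in Common Log Format (an assumption about your setup).
const markerSampleLog = [
  '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /search?q=%27%22%3Efoobar HTTP/1.1" 200 512',
  '203.0.113.7 - - [10/Oct/2023:13:55:41 +0000] "GET /profile?name=%27%22%3E%3Cu%3Efoobar HTTP/1.1" 200 734',
  '198.51.100.2 - - [10/Oct/2023:13:56:02 +0000] "GET /search?q=shoes HTTP/1.1" 200 498',
];

// Pull the request target out of the quoted request line, or null if absent.
function probeTarget(line) {
  const match = line.match(/"[A-Z]+ (\S+) HTTP\/[\d.]+"/);
  return match ? match[1] : null;
}

// Percent-decode a value; fall back to the raw value if it is malformed.
function safeDecode(value) {
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

// Return the decoded targets that contain the quote-breaking prefix '">.
function findMarkerProbes(lines) {
  return lines
    .map(probeTarget)
    .filter((target) => target !== null)
    .map(safeDecode)
    .filter((decoded) => decoded.includes('\'">'));
}

console.log(findMarkerProbes(markerSampleLog));
```

Once you have spotted the unique string a particular tester uses (foobar in my example), searching for that string directly is usually the quickest way to list every endpoint they touched.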
This brings me to the next probing characteristic: you will very rarely see someone testing a single endpoint once and then giving up. If an adversary is scanning or manually testing for XSS vulnerabilities, they will usually be very persistent and you should see a long series of consecutive XSS payloads popping up in your logs.
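One rough way to surface that persistence is to count suspicious-looking requests per client. The sketch below again assumes Common Log Format with the client address as the first field; the looksLikeProbe check (any <, > or double quote after decoding) is a deliberately crude assumption of mine, and the threshold of 3 is arbitrary.

```javascript
// Percent-decode a value; fall back to the raw value if it is malformed.
function decodeOrRaw(value) {
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

// Crude, assumed heuristic: markup characters in the decoded request target.
function looksLikeProbe(target) {
  return /[<>"]/.test(decodeOrRaw(target));
}

// Return the clients with at least `threshold` suspicious requests.
function persistentClients(lines, threshold = 3) {
  const counts = new Map();
  for (const line of lines) {
    // First field is the client address; the quoted part is the request line.
    const match = line.match(/^(\S+) .*"[A-Z]+ (\S+) HTTP\/[\d.]+"/);
    if (!match) continue;
    const [, client, target] = match;
    if (looksLikeProbe(target)) {
      counts.set(client, (counts.get(client) || 0) + 1);
    }
  }
  return [...counts]
    .filter(([, count]) => count >= threshold)
    .map(([client]) => client);
}

// Made-up sample data.
const persistenceSampleLog = [
  '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /search?q=%27%22%3Efoobar HTTP/1.1" 200 512',
  '203.0.113.7 - - [10/Oct/2023:13:55:41 +0000] "GET /profile?name=%27%22%3Efoobar HTTP/1.1" 200 734',
  '203.0.113.7 - - [10/Oct/2023:13:55:49 +0000] "GET /help?topic=%27%22%3E%3Cu%3Efoobar HTTP/1.1" 200 611',
  '198.51.100.2 - - [10/Oct/2023:13:56:02 +0000] "GET /search?q=%3Cb%3Ehi HTTP/1.1" 200 498',
  '198.51.100.2 - - [10/Oct/2023:13:56:10 +0000] "GET /search?q=shoes HTTP/1.1" 200 498',
];

console.log(persistentClients(persistenceSampleLog));
```

Keep in mind that a tester behind a proxy or a rotating set of addresses will not group this neatly, so treat the per-client count as a hint rather than proof.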
In addition to all of the probing characteristics listed above, there is one last and very important one that must be mentioned: the use of JavaScript methods such as alert, prompt, and confirm. I am sure you have come across these before while reading up about XSS. When testing for XSS vulnerabilities, one might want to get an immediate indication of a vulnerability, and the easiest way of doing this is to have a big prompt fire right in front of your face. It becomes immediately obvious that you have an XSS vulnerability when the modal shows up. Also, the rush you get from the payload firing in that way never gets old. :)
You could grep your logs for keywords such as alert, prompt, and confirm. That being said, this is definitely not foolproof since, depending on the context, it may be possible to adjust the payload and mask the keyword; this is commonly seen when attempting to evade web-application firewalls and filters.
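Here is a small sketch of that keyword search, together with one masked payload that slips past it. The mentionsDialogCall name and the regular expression are my own choices; I am requiring an opening parenthesis or backtick after the keyword so that ordinary paths containing the word "prompt" do not match.

```javascript
// Percent-decode a request target; fall back to the raw value if malformed.
function decodeTarget(value) {
  try {
    return decodeURIComponent(value);
  } catch {
    return value;
  }
}

// True if the decoded target appears to call alert, prompt or confirm.
function mentionsDialogCall(target) {
  return /\b(?:alert|prompt|confirm)\s*[(`]/i.test(decodeTarget(target));
}

// A plain probe: decodes to <script>alert(1)</script> and is caught.
console.log(mentionsDialogCall('/?q=%3Cscript%3Ealert(1)%3C%2Fscript%3E'));

// A masked probe: decodes to <svg onload=top["al"+"ert"](1)> and is missed,
// because the keyword is only assembled at run time in the browser.
console.log(mentionsDialogCall('/?q=%3Csvg%20onload%3Dtop%5B%22al%22%2B%22ert%22%5D(1)%3E'));
```

The second request never contains the literal word alert, which is exactly why a keyword search can only ever be one signal among several.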
The problems with spotting self-XSS and DOM-based payloads
The issue can be quite simply summarised as: The payloads do not always show up in your logs. Take this example of a DOM-based XSS vulnerability:
<!-- test.html -->
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width">
    <title>DOM-based XSS example</title>
  </head>
  <body>
    <script>
      // Fetch the redirect parameter from the URL fragment
      let redirect = window.location.hash.substring(1);
      // URL-decode the value
      redirect = decodeURIComponent(redirect);
      if (redirect !== 'UNDEFINED' && redirect !== "") {
        // Redirect to the value (deliberately vulnerable: the value is not validated)
        location.href = redirect;
      }
    </script>
  </body>
</html>
When navigating to http://example.com/test.html#javascript:alert(1), the payload should display an alert box, but when I check my logs, all I see is the path, because browsers never send the fragment (the part after the #) to the server:
GET /test.html HTTP/1.1
Hopefully, this example better illustrates some of the issues you might face when trying to spot certain XSS payloads in your logs.
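You can see the same split without a browser by parsing the URL with the standard URL class. This is just a sketch to show which part of the address ends up on the request line; the variable names are mine.

```javascript
// Parse the address from the example above.
const visited = new URL('http://example.com/test.html#javascript:alert(1)');

// The request line carries the path and query string only.
const sentToServer = visited.pathname + visited.search;

// The fragment is kept by the browser and is only visible to client-side code.
const keptInBrowser = visited.hash;

console.log(sentToServer);
console.log(keptInBrowser);
```

Everything in keptInBrowser, which is where the payload lives here, is invisible to your access logs.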
The best advice I can give you for tackling this problem is to implement a Content Security Policy (CSP) with the report-to directive. Now whenever the browser detects a violation of the policy, it will notify you at the report-to endpoint (see https://report-uri.com/ for a service that logs these errors via the report-to directive). You can also just implement the Content-Security-Policy-Report-Only header if you are only interested in logging violations without blocking anything.
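As a rough illustration, the response headers for such a setup could be built like this. Treat it as a sketch under assumptions: the endpoint URL, the csp-endpoint name, the cspReportingHeaders helper and the policy itself (default-src and script-src limited to 'self') are placeholders of mine, not a recommendation for your application.

```javascript
// Hypothetical reporting endpoint; replace with a URL you control.
const CSP_REPORT_URL = 'https://example.com/csp-reports';

// Build the headers for a reporting CSP. With reportOnly = true, violations
// are reported but nothing is blocked.
function cspReportingHeaders(reportOnly = true) {
  const policy = [
    "default-src 'self'",
    "script-src 'self'",
    // Refers to the endpoint name declared in Reporting-Endpoints below.
    'report-to csp-endpoint',
  ].join('; ');

  const policyHeader = reportOnly
    ? 'Content-Security-Policy-Report-Only'
    : 'Content-Security-Policy';

  return {
    'Reporting-Endpoints': `csp-endpoint="${CSP_REPORT_URL}"`,
    [policyHeader]: policy,
  };
}

console.log(cspReportingHeaders());
```

Browser support for report-to varies, so some deployments also send the older report-uri directive alongside it. Either way, start in report-only mode and read the reports for a while before you enforce anything.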
A small idea for getting a feel for manually spotting XSS payloads in your logs
Here is a fun idea that might help you see all of what I have described above in action. Build a small XSS Capture the Flag competition (CTF) and get a group of friends that are security-oriented to try to find the XSS vulnerability in your application. During this time, take a look at your logs. Then do the same thing but with a scanner such as Burp Suite. After a while, you should get a feel for what basic XSS probing (for a certain number of XSS vectors) looks like in your logs.