
Recently I used a tool to download a website, and as part of the tool one could adjust the number of parallel connections. So I found myself asking: starting at how many requests could a provider rate it as a denial of service? I googled around but didn't find specific numbers, or even hints about what order of magnitude we are talking about. Is there any definition, e.g. 100 requests a second?

So my question is: How many requests are needed to state that a denial of service is in progress?

Update: the technical background is definitely of interest. I understand that a single malicious packet can be enough to cause a denial of service, and that the Slashdot effect is another way one can happen. But what I wanted to know was more of a firewall-style rule: some servers / service providers block users who send too many requests in a certain time frame. What order of magnitude are we talking about here? Or is that too specific? If so, what would your rule look like?

The question also has a legal component - let me illustrate with a highly(!) theoretical scenario:

A provider of a service checks its logs and sees that there has been high traffic from a single IP. Now the provider goes to court (for whatever reason) and labels this as an attempted denial of service. The judge would probably ask for their definition of a DoS. "Anything beyond normal usage" would be their answer. So where is the threshold between normal usage and non-normal usage (which could be interpreted as an attempted DoS even if the server remains totally unimpressed)? This is probably a highly contrived scenario ;-)

Lonzak
  • This will ultimately be a function of the available bandwidth - if N connections are hogging bandwidth for malicious requests at the expense of being able to serve real requests, it's a DoS. Since DoS is often DDoS, the trick is knowing whether you're under attack, or just unexpectedly popular. – Phil Lello Mar 23 '16 at 14:23
  • It's worth noting that not all denial of service is an attack. Servers crashing under unusually heavy load after being linked to from a popular site is a common enough phenomenon that it has [its own Wikipedia page.](https://en.wikipedia.org/wiki/Slashdot_effect) – Mason Wheeler Mar 23 '16 at 14:44
  • Also, see [what happened to Twitter and Google the day Michael Jackson died](http://edition.cnn.com/2009/TECH/06/26/michael.jackson.internet/). – Zenadix Mar 23 '16 at 15:04
  • @MasonWheeler DoS implies deliberate attack though, rather than a service being unable to cope with demand. – Phil Lello Mar 23 '16 at 15:48
  • Are you asking this from a technical or a legal standpoint? The answers might be different. – Anders Mar 23 '16 at 16:15
  • To emphasize bandwidth and resources, I was at a store recently and the customer in front of me knew the checkout person and proceeded to have a 5-minute conversation. So just one person caused a denial of service from my perspective. – Chris Haas Mar 23 '16 at 16:21
  • @cgcampbell It's not uncommon for words and phrases to locally change meaning once users overhear engineers. There's usually little benefit trying to reclaim a term at that point. – Phil Lello Mar 23 '16 at 19:03
  • Note that even if a provider doesn't consider it a denial of service they may still consider it to be abuse. IMO it's at the very least highly rude to use more than a handful of connections to a given server at a time without explicit permission. – Peter Green Mar 23 '16 at 20:28
  • Exactly! And I want to know where the threshold between 'everything is fine' and 'highly rude' is. Maybe 'abuse' is a better term than 'DoS', which implies malicious intent. – Lonzak Mar 23 '16 at 21:27
  • @CGCampbell - well in that case you are completely misusing the term in your company. DoS is a type of an *attack*. Just because you have a shit server that can't handle its traffic doesn't mean it's being DoSed. – Davor Mar 23 '16 at 21:54
  • @Davor DoS is an observable effect. A DoS attack is an attack that attempts to cause that observable effect. That is, an "attack" has intent, unless you want to introduce such terms as "accidental attack by neglect" etc. – Eugene Ryabtsev Mar 24 '16 at 05:13

4 Answers


Enough to cause the service to be denied to someone. Might be 1 unexpected malicious request, which causes excessive load on the server. Might be several million expected requests, from a TV advert with a really good response.

There isn't a specific value, since all servers will fail at different levels. Serving static content is a lot easier on the server than generating highly customised content for each user, so authenticated services will generally have a lower "problem" threshold than unauthenticated ones. Servers sending the same file to multiple users may be able to handle more traffic than servers sending distinct files to multiple users, since they can hold the file in memory. A server with a fast connection to the internet will usually be able to handle more traffic than one with a slow connection, although that matters less if the workload is CPU-bound rather than bandwidth-bound.

I've seen systems that fail at 3 requests per second. I've also seen systems which handle everything up to 30,000 requests per second without breaking a sweat. What would be a DoS to the first, would be a low traffic period to the second...

Updated to respond to update

How do firewall providers determine when traffic is causing a denial of service?

Usually, they watch the response times from the server, and throttle traffic if those go above a pre-set limit (this limit can be decided on a technical basis, or on a marketing basis - waiting x seconds causes people to leave), or if the server responses change from successful (200) to server failure (5xx).
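
As a rough illustration, a throttling check along those lines might look like the following sketch. The window size, latency limit, and error-rate threshold are made-up values for illustration, not anything a specific firewall product uses:

```python
from collections import deque

WINDOW = 100            # consider the last 100 responses (assumed value)
MAX_AVG_LATENCY = 2.0   # seconds - the "people start leaving" limit (assumed)
MAX_ERROR_RATE = 0.05   # throttle if more than 5% of responses are 5xx (assumed)

recent = deque(maxlen=WINDOW)  # (latency_seconds, status_code) pairs

def record_response(latency: float, status: int) -> None:
    recent.append((latency, status))

def should_throttle() -> bool:
    """Return True when the server looks like it is struggling."""
    if not recent:
        return False
    avg_latency = sum(lat for lat, _ in recent) / len(recent)
    error_rate = sum(1 for _, st in recent if 500 <= st < 600) / len(recent)
    return avg_latency > MAX_AVG_LATENCY or error_rate > MAX_ERROR_RATE
```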

What is a legal definition of "denial of service"?

Same as the original one I gave - it's not denial of service if service has not been denied. It might be abusive, but that wouldn't be quite the same thing.

Matthew
  • Is there a general rule of thumb for how much traffic defines a DoS, relative to "normal" busy traffic levels? E.g. twice the usual level of traffic, 5x, etc? I think that's what @Lonzak was asking. – Philip Rowlands Mar 23 '16 at 13:56
  • @PhilipRowlands Well, no, not really. If a system can handle twice normal load, but everyone who is trying to use it is still able to, at a reasonable speed, with reasonable responsiveness, it's not a denial of service. By definition, service has not been denied! You can generally avoid suffering DoS by spending more money on infrastructure, but then you may well be overscoped for "normal" traffic. – Matthew Mar 23 '16 at 14:05
  • The 1 request example is good. A server can be killed by triggering a huge DB query, for example. That generally reflects poor application architecture, but could just be not keeping up with growing or poorly indexed datasets. – Phil Lello Mar 23 '16 at 19:08
  • @PhilLello Your proof: at a former job, users could run reports that resulted in what we called "brownouts": *one* running report would result in the DB server queuing all queries that needed those same tables until the report was finished (which would take 3-5 minutes *if* we were lucky). Other aspects of the site, including database-backed content that didn't need the same tables as the report, would continue to be highly responsive. The term "brownout" was coined by a manager who didn't want to admit the reality: a **single** (totally legitimate) request would result in a DoS. – Kromey Mar 23 '16 at 23:53

When you download (or scrape) a website, you are basically sending a GET request for each URL in the target website.

This is an example of a GET request from the World Wide Web Consortium website:

```
GET /pub/WWW/TheProject.html HTTP/1.1
Host: www.w3.org
```

As you can see, the main issue is not the request but the response of the webserver, which sends you the whole resource identified by the given URL.

Therefore, we can say that

Max number of requests per second = (Factor of Safety * Bandwidth) / Max size of a webpage

Judging from a quick Google search, the average size of a webpage is about 2 Mb, and the bandwidth of a web server can range from a few Mbps to a few Tbps.

The factor of safety is related to the fact that, in order to cause a DoS attack, you may not need to send a number of requests corresponding to the 100% of the bandwidth. For example, if the webserver has a 100 Mbps bandwidth, and 50% of it is used at a given instant for other users, it is enough to send a number of requests corresponding to 50%, or even a smaller percentage, of the bandwidth.

50% of 100 Mbps = 50 Mbps, which corresponds to 25 average GET requests per second.

On the other hand, if no one else is visiting the website, you would need to use at least 80% of the bandwidth in order to cause a DoS, and 80% of 100 Mbps = 80 Mbps, which corresponds to 40 GET requests per second.

Clearly, in order to (unintentionally) DoS a huge website having a bandwidth of 1 Tbps, you would need to send at least (80% of 1 Tbps) / 2 Mb = 400,000 GET requests per second. And so on.

In order to have a more accurate measurement, you would need to find the maximum size of a webpage in the target website and its bandwidth.

Warning: since you could potentially get in trouble for causing a denial of service, it is better to round down the number of requests per second you obtain through the previous formula.
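
If it helps, here is the formula as a small calculator, plugging in the example numbers from this answer (the bandwidth and page-size figures are illustrative, not measured):

```python
def max_requests_per_second(bandwidth_mbps: float,
                            page_size_mb: float,
                            safety_factor: float) -> float:
    # (Factor of Safety * Bandwidth) / Max size of a webpage, from above
    return (safety_factor * bandwidth_mbps) / page_size_mb

# 100 Mbps link with 50% already in use by other visitors:
print(max_requests_per_second(100, 2, 0.5))        # 25.0 requests/s
# Idle 100 Mbps link, targeting 80% of capacity:
print(max_requests_per_second(100, 2, 0.8))        # 40.0 requests/s
# 1 Tbps site: (80% of 1,000,000 Mbps) / 2 Mb:
print(max_requests_per_second(1_000_000, 2, 0.8))  # 400000.0 requests/s
```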

A. Darwin

I debated about making this an answer; it might be better as a comment.

Let's take a look at your question from both angles.

From the host's POV

Something becomes a DoS when the amount of traffic, or what that traffic is doing, causes the server to be unavailable for others. A few examples:

  • Running a long-running report 500 times
  • Smashing refresh really fast on a website that can't handle it
  • Using your larger bandwidth to fill their upload pipe so full that others lose speed
  • Scraping the website in a way that causes their host to be unresponsive to others

All these examples are possible, but not likely. When we talk about a DoS attack, we are talking about one person/client doing all this, and most web servers are set up to handle hundreds or thousands of requests at the same time. That's why DDoS is so popular: it takes more than one client to overload a normal server (under normal circumstances).

To add complication, many clients may start using your site for the first time after some marketing. Sometimes it's not even your marketing that triggers it; for example, a popular cell phone release may cause a spike in traffic on your how-to site. It can be very difficult to tell DDoS traffic from legitimate traffic.

There are a few ground rules, though. What you're basically looking for is abnormal usage:

  • Are there users that are downloading way more than others?
  • Are there users that are staying connected way longer than others?
  • Are there users that are re-connecting way more than others?

These guidelines, and others like them, can help you figure out what traffic is part of a DDoS attack and apply some kind of filter, as sketched below.
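
To make that concrete, here is a rough sketch of such a filter. The per-client counters and the 10x multiplier are assumptions for illustration; as noted in the comments, real filters are designed case by case:

```python
from dataclasses import dataclass

@dataclass
class ClientStats:
    bytes_downloaded: int        # total bytes served to this client
    connection_seconds: float    # cumulative time connected
    reconnects: int              # number of new connections opened

def is_abnormal(client: ClientStats, typical: ClientStats,
                factor: float = 10.0) -> bool:
    """Flag clients that exceed a typical user by a wide margin."""
    return (client.bytes_downloaded > factor * typical.bytes_downloaded
            or client.connection_seconds > factor * typical.connection_seconds
            or client.reconnects > factor * typical.reconnects)

# Example: a typical user vs. one downloading 180x more than usual
typical = ClientStats(bytes_downloaded=5_000_000, connection_seconds=300, reconnects=3)
heavy = ClientStats(bytes_downloaded=900_000_000, connection_seconds=320, reconnects=4)
print(is_abnormal(heavy, typical))  # True
```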

From the User's POV

When deciding to scrape a website you should check first and see if they have a policy. Some sites do, and some do not. Some sites will consider it theft, and others not. If a site does not have a policy then you have to make your own call.

Your goal, if they do not have a stated policy, is to clearly state that you're scraping (don't mask the user agent or headers that your tool might be using), and to have as little impact as possible. Depending on your need for scraping, can you scrape just a few pages, or do you really need the entire site? Can you scrape at a "normal user" rate, maybe 1 page every 5 seconds or so (including media content)? If you want to capture the data fast, can you capture just the text files and not the images and other media? Can you exclude long-running queries and larger media files?

Your overall goal here is to be respectful of the host's cost of hosting, and of the other users of the site. Slower is usually better in this case. If possible, contact the website owner and ask them. And no matter what, follow the rules in the robots.txt file; it can contain a rate limit and page limits that you should follow.
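
As a sketch of what "respectful" might look like in practice, the following honors robots.txt (including any Crawl-delay), identifies itself, and paces its requests. The user agent string, base URL, and 5-second default delay are placeholders:

```python
import time
import urllib.request
import urllib.robotparser

BASE = "https://example.com"  # placeholder target site
USER_AGENT = "MyArchiver/1.0 (+mailto:me@example.com)"  # identify yourself honestly

robots = urllib.robotparser.RobotFileParser(BASE + "/robots.txt")
robots.read()
# Honor the site's Crawl-delay if it sets one; otherwise default to 5 s/page
delay = robots.crawl_delay(USER_AGENT) or 5

def fetch(path: str):
    """Fetch one page politely, or return None if robots.txt forbids it."""
    url = BASE + path
    if not robots.can_fetch(USER_AGENT, url):
        return None  # the site asked crawlers to stay out of this path
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    time.sleep(delay)  # pace requests so we don't hog the host's bandwidth
    return body

page = fetch("/index.html")
```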

coteyr
  • Thank you for your great answer and for covering the technical and legal aspects. "apply some kind of filter." This is what I am looking for - where would a threshold of that filter be? And on the legal side - what is respectful and what is misuse of a service? – Lonzak Mar 23 '16 at 21:37
  • There are no rules for "some kind of filter"; they are designed as needed, on a case-by-case basis. – coteyr Mar 23 '16 at 21:48

To build on Matthew's answer with respect to Philip Rowlands' comment:

The general rule of thumb for defining DoS traffic is context.

If you just ran a TV ad during the Super Bowl, you can assume the ensuing flood of traffic is contextually non-malicious (whether it causes a service outage is irrelevant).

Whereas if it's just another Tuesday morning and your site is flooded with requests for no identifiable reason, it would be safe to assume the traffic is malicious (or at least suspicious, e.g. an unnoticed Reddit post as opposed to a targeted attack).

Monty Harder
WorseDoughnut
  • No, it's not about context, it's about intent. Your examples use context to deduce intent. – symcbean Mar 23 '16 at 21:37
  • @symcbean No, my examples use context to generate a proper reaction. There's no way to figure out the intent of the clients connecting to your site en masse; the best you can do is figure out whether it's something that should be happening or not, and react accordingly. – WorseDoughnut Mar 24 '16 at 01:01