
I'm maintaining some web crawlers. I want to improve our load/throttling system to be more intelligent.

Of course I look at response codes and throttle up or down based on those. I would, though, like the system to be better at dynamically adjusting the rate based on the behaviour of the server being requested. Let's say it's a very busy time of day and the target web server is experiencing an unusually high amount of traffic, or something else is going on. Then I would like to detect that, throttle down requests from my side to be polite, and throttle back up once the server is OK again.
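
For reference, here is a minimal sketch of what such response-code-driven throttling could look like. The class name, the chosen status codes, and the backoff/recovery factors are just illustrative assumptions, not what I actually run:

```python
import time


class CodeBasedThrottle:
    """Illustrative throttle driven purely by response codes.
    All thresholds and factors here are placeholders."""

    def __init__(self, min_delay=0.5, max_delay=60.0):
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.delay = min_delay  # current pause between requests, in seconds

    def record(self, status_code):
        if status_code in (429, 503):
            # Server explicitly signals overload: back off aggressively.
            self.delay = min(self.delay * 2, self.max_delay)
        elif 200 <= status_code < 300:
            # Healthy response: creep back down towards the minimum delay.
            self.delay = max(self.delay * 0.9, self.min_delay)

    def wait(self):
        time.sleep(self.delay)
```

That covers the explicit signals; what I'm missing are the implicit ones, where the server never returns an error code but is clearly struggling.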

What symptoms should I treat as indicators to throttle down? And what would be my indicators to throttle back up again?

I've been thinking about recording the response time of each request for, e.g., the last hour. The problem is that it's extremely hard to find a reasonable average/median request time to benchmark against, because all servers are different and even resources within the same website respond at very different speeds. Another thing I've been thinking about is looking for fluctuations in response time, but I don't know whether that's a common symptom or whether it's more common that all requests simply take longer.
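
As an assumption-laden sketch of the "compare against the server's own history" idea: keep one monitor per host, maintain a short window of recent response times plus a slow-moving baseline for that host, and only react to the ratio between the two. All names, window sizes and thresholds below are placeholders:

```python
from collections import deque
from statistics import median


class ResponseTimeMonitor:
    """Sketch: detect CHANGES in a host's response time rather than
    judging absolute values. One instance per host."""

    def __init__(self, window=30, alpha=0.05,
                 slow_factor=2.0, recover_factor=1.2):
        self.recent = deque(maxlen=window)   # last N response times (seconds)
        self.baseline = None                 # slow-moving EWMA of response times
        self.alpha = alpha
        self.slow_factor = slow_factor       # recent/baseline ratio meaning "back off"
        self.recover_factor = recover_factor # ratio below which to speed up again

    def record(self, elapsed):
        self.recent.append(elapsed)
        if self.baseline is None:
            self.baseline = elapsed
        else:
            self.baseline = (1 - self.alpha) * self.baseline + self.alpha * elapsed

    def verdict(self):
        if self.baseline is None or len(self.recent) < self.recent.maxlen:
            return "ok"                      # not enough data yet
        ratio = median(self.recent) / self.baseline
        if ratio > self.slow_factor:
            return "throttle_down"           # noticeably slower than this host's own norm
        if ratio < self.recover_factor:
            return "throttle_up"             # back to normal, ease the rate up
        return "ok"
```

The point of this shape is that each server is only ever compared against its own history, so the fact that different servers (and different resources) have very different absolute response times stops mattering; only the relative drift does.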

Niels Kristian
    Look for CHANGES in the response time. In real-life scenarios you don't know whether it was you, but slowing down even if it wasn't is probably a good idea. – Skaperen Mar 05 '15 at 11:25
