I'm 100% sure that I'm not the first one to consider benchmarking a whole infrastructure, yet I haven't found any relevant information on how to approach this challenge.

As I guess we all know what we are talking about, I will try to describe the scenario as briefly as possible.

I've been working for years in the hosting business and we've always done some kind of testing for new stuff. Back in the Wonder Years, we ran "ab", "dd", "bonnie", etc. to test disk, CPU and so on.

Then we grew up and we needed to benchmark our commercial website or some big customer's website. We tried a lot of tools and we ended up with things like JMeter, which was a great help for quite a while. Recently, we've been using Locust, which is a great tool: not so easy to set up, but very powerful.

But now we can say "we are mature": we sell cloud, and much more than "a bunch of websites on a webserver" is at stake.

As a cloud engineer, you design a storage solution that will be able to host thousands of VMs (the same scenario and needs apply to other things, such as big database clusters). You spend days doing the maths and propose a number of VMs for the budget you are assigned. We all know what comes next... someone comes and says: no way, we have to fit at least twice as many in there, we have to make money!

So... you know that there is no possible way to fit twice the number of VMs you calculated, but you have to open a negotiation and give the managers enough information to reach an agreement somewhere in between.

And here's where the problem lies... how am I supposed to test a whole infrastructure that can host thousands of VMs? We've already worked with distributed load-testing tools, such as JMeter or Locust. They are great, but have one big issue: they were designed to test one IP address, not thousands of VMs.

So... I guess that many have come to this situation only to realise that there is no way to test that effectively. However, I'm sure you guys have at some point found a way to test an infrastructure like this in a more realistic way than performing tests the old way. I would appreciate any idea you can give me.

Obviously you need a proper architecture and setup, nice hardware, daily maintenance, and much more. Everything we can possibly do to get our system clean and updated is already being done, but... at what point should we stop putting data in?

What we are doing right now when preparing a new system is:

  • create a Nagios/Munin system to monitor the main stuff: network, disk latencies, etc.
  • create hundreds/thousands of VMs, depending on the TBs available.
  • launch all or most of those VMs (some are only used to occupy the space).
  • ssh into most of them and run, all at once or intermittently, some type of disk test like dd, bonnie or iozone.
  • start browsing "manually" some websites hosted on those VMs and decide if they are slow. Obviously this is a very subjective matter. Despite that, we can say that most people feel "happy" if the page loads in less than a second.
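The last step in the list above (manual browsing) is the most subjective one, so here is a minimal sketch of how it could be turned into numbers. It fetches one page per VM in parallel and reports the median, the 95th percentile and the share of pages slower than the one-second "happy" limit mentioned above. The function names, the worker count and the timeout are my own assumptions for illustration, not part of our current setup, and a plain fetch only measures the HTML download, not the full rendering a browser does.

```python
import http.client
import math
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def time_page_load(url, timeout=10.0):
    """Return the seconds needed to download `url`, or None if the fetch failed."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()  # read the whole body, not just the headers
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None
    return time.perf_counter() - start


def probe_all(urls, probe=time_page_load, max_workers=50):
    """Run `probe` against every URL concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(probe, urls))


def summarize(timings, slow_after=1.0):
    """Condense raw timings into a few figures that can be compared across runs."""
    ok = sorted(t for t in timings if t is not None)
    summary = {"probed": len(timings), "failed": len(timings) - len(ok)}
    if not ok:
        return summary
    # Nearest-rank 95th percentile: the value below which 95% of samples fall.
    p95_index = max(0, math.ceil(0.95 * len(ok)) - 1)
    summary.update(
        median_s=statistics.median(ok),
        p95_s=ok[p95_index],
        slow_fraction=sum(t > slow_after for t in ok) / len(ok),
    )
    return summary
```

Calling `summarize(probe_all(urls))` with one URL per test VM, once for each load level (for example with 25%, 50% and 75% of the VMs running their disk tests), gives figures that can be compared between runs instead of an impression. The `probe` argument can be swapped for any other callable that returns a number, such as a wrapper that runs a disk test over ssh.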

Sometimes, just by looking at the Munin graphs, you can see some possible bottlenecks, but we've had service degradations with far fewer active VMs than the warning threshold we managed to identify during the tests.

So, to sum up, I know that if someone had come up with a solution for this issue, it would be very easy to find on the first page of Google, but let's see if someone has strategies to properly benchmark at least some small parts of the system.


  • Good question, but I want to note that you can't just calculate that X number of VMs are going to use X number of IOPS each. A single VM (without resource control) could easily eat up all your IOPS. – pauska Feb 03 '14 at 18:16

1 Answer


I'm not an expert, but I have been reading about cloud benchmarking today, especially a report from the SPEC Open Systems Group: "Report on Cloud Computing to the OSG Steering Committee". The open-source CBTOOL from IBM seems like it would be useful for you. "Cloud Rapid Experimentation and Analysis Tool (aka CBTOOL) is a framework that automates IaaS cloud benchmarking through the running of controlled experiments." It looks like it specifically tests VM provisioning on various cloud platforms.

Steve Koch