
I have an architecture question. In a clustered web app environment, I can think of three ways to deal with background jobs:

  1. have a dedicated machine run all the jobs, thus freeing the web servers from having to do so
  2. have each web server also run background jobs, using a mechanism to make sure no two machines kick off the same job (see the sketch below)
  3. have one of the web servers double up as jobs-runner

What's the preferred approach?
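
To make option 2 concrete, by "a mechanism" I mean something like an atomic claim on a shared jobs table. A rough sketch, assuming ActiveRecord and made-up table/column names (run! stands in for whatever actually executes the job):

    class Job < ActiveRecord::Base
      # Atomically claim one pending job for this worker. The conditional
      # update_all (UPDATE ... WHERE locked_by IS NULL) is what stops two
      # machines from picking up the same row.
      def self.claim_next(worker_name)
        job = where(locked_by: nil).order(:created_at).first
        return nil unless job
        rows = where(id: job.id, locked_by: nil).
                 update_all(locked_by: worker_name, locked_at: Time.now)
        rows == 1 ? job.reload : nil   # another machine won the race
      end
    end

    # Each web server runs a separate worker process that drains the queue:
    while job = Job.claim_next("web-#{Process.pid}")
      job.run!
    end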

6 Answers


IANAExpert, but I would imagine that option 1 would be preferable. The reasoning behind this is a simple separation of concerns. If jobs have their own dedicated machine, you can manage growth better. If you use option 2, your job-processing capacity grows with your web tier rather than with the jobs' actual requirements. While the resources used should be roughly the same whether one machine or many are running the jobs, I imagine whatever queuing system you're using has some overhead. Also, if something goes wrong with the queue or the web server, one won't bring the other down. You've silo'd each part of your application, so you can grow as necessary, not as your architecture demands.

Paddy

Each option has pros and cons, and (imho) picking the preferred one needs a bit more information. For example, what sort of background jobs are they? This is a crucial question, because if they are, for example, business processes, it might be interesting to take advantage of the cluster that is already in place.

If they are, for example, maintenance processes not directly related to the business (or to user needs), it may make more sense to run them on separate hardware (physical or virtual).

In my experience we are all sometimes a bit reluctant to make full use of the cluster, but the cluster is there to be used!


If you have the resources, and it doesn't matter to the background tasks where they're run from, I'd go for option 1.

There's no deeper reasoning to it, other than: why burden your web servers if you don't have to?

Jak S

It depends upon your budget, but the most desirable approach is to run jobs on separate machines from those serving web content. It gives good separation of concerns and you don't have to worry about the web experience being affected by a heavy job being run.

jamie

Peldi, consider using an approach that allows you to have a single job queue (preferably in the database) and one or more job runners. This way you can run one or several job workers, on the same machine or on different ones, which keeps your configuration flexible.

I don't know what kind of tasks you're going to run or which technologies you will use, but in the Ruby/Rails world such a task can be solved using delayed_job.
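
A minimal sketch of what that can look like with the delayed_job_active_record backend (ReportGenerator, Newsletter and their methods are made-up placeholders):

    # Gemfile: gem 'delayed_job_active_record'
    # one-off setup, creates the delayed_jobs table that acts as the shared queue:
    #   rails generate delayed_job:active_record && rake db:migrate

    # enqueue from any web node -- the job becomes a row in the shared database
    ReportGenerator.new(account_id).delay.generate

    # or mark a slow method so it always runs in the background
    class Newsletter
      def deliver_to_all_subscribers
        # ... long-running work ...
      end
      handle_asynchronously :deliver_to_all_subscribers
    end

You then start one or more workers with rake jobs:work (or the delayed_job daemon) on whichever machines you like; since the queue lives in the database, workers lock each job row, so no two of them run the same job.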

Some other information about background processing is available at http://en.wikipedia.org/wiki/Job_scheduler

Personally, in my project, I run background jobs on the same machine where the database resides, but I can add more workers/machines later if the need arises.

Hope this helps :)

KIR

Faced with the same situation you're in:

I wouldn't go for option 1, as you'd have a single point of failure, or additional architecture work to de-risk that failure.

I wouldn't go for option 3 as you'll end up with non-identical webnodes and that will hinder any future automation.

I'd use option 2 and have a central queue service, preferably a cloud-based one, as it'll already be clustered, saving you the burden of failover, scaling, etc.

I'm assuming you've already dealt with failover and load by using a cluster for your webnodes, so I would simply add to the workload of those horses, but run the jobs in a separate process.
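
A rough sketch of what such a worker process could look like, assuming Amazon SQS via the aws-sdk-sqs gem (the queue URL env var and perform_work are placeholders):

    require 'aws-sdk-sqs'
    require 'json'

    def perform_work(job)
      # placeholder: dispatch on job['type'] to your real job code
      puts "running #{job['type']}"
    end

    sqs = Aws::SQS::Client.new(region: 'us-east-1')
    queue_url = ENV['JOBS_QUEUE_URL']

    # each webnode runs this loop in its own process; SQS hands a message to
    # only one consumer at a time, so no two nodes pick up the same job
    loop do
      resp = sqs.receive_message(queue_url: queue_url,
                                 max_number_of_messages: 1,
                                 wait_time_seconds: 20)   # long polling
      resp.messages.each do |msg|
        perform_work(JSON.parse(msg.body))
        # delete only after the work succeeds, so a crashed worker's job reappears
        sqs.delete_message(queue_url: queue_url, receipt_handle: msg.receipt_handle)
      end
    end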

Hope that helps

Xolv.io