I am developing several simple applications in the Django framework (think of them as more complex blogs with various features). All of the applications share many features, mainly a customized CMS. I could build them as one Django project with multihost middleware that handles requests from the various domains. I expect between 10 and 1000 apps, each with a hit rate of roughly 100 to 10k per day.

My idea is to run this project (all applications) on AWS or GCP.

Django is pretty flexible: I can keep the setup I use now (multihost middleware), or I can customize the settings a little for every app, so that the code shared by all apps lives in one place but every app has its own instance. Both options are possible and are about equally difficult from a coding point of view.
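To make the first option concrete, here is a minimal sketch of what such multihost middleware might look like. It follows Django's usual middleware shape (a callable built around `get_response`) and uses the request's `get_host()` method; the `SITES` mapping, the domain names, and the `request.site` attribute are assumptions made up for illustration, not my actual code.

```python
# Hypothetical mapping from domain to per-site configuration (illustrative names).
SITES = {
    "blog-one.example": {"site_id": 1, "theme": "light"},
    "blog-two.example": {"site_id": 2, "theme": "dark"},
}


class MultiHostMiddleware:
    """Attach the site that matches the request's Host header to the request."""

    def __init__(self, get_response):
        # Django passes the next handler in the chain as get_response.
        self.get_response = get_response

    def __call__(self, request):
        # Drop an optional port and normalize case (IPv6 literals are not handled here).
        host = request.get_host().split(":")[0].lower()
        # request.site is None when the domain is unknown.
        request.site = SITES.get(host)
        return self.get_response(request)
```

Views would then read `request.site` to pick the right content and templates for the domain.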

So my question is, should I:

  1. create one instance of the project with all the apps (the main app will detect the domain and handle the request),
  2. or should I create one instance for every app and handle the domain redirects at the "Apache level",
  3. or should I create a new "hosting" instance (a virtual machine, or whatever they call it) for every app?

Which scales better? Which is safer? Which will be cheaper? Which is generally better?

matousc
  • That depends on your environment, how many resources your app needs etc. You have to test this yourself. – Sven Jun 10 '18 at 08:20
  • @Sven The only thing which is opinion based about this question is the last sentence asking which approach is better. The questions about scalability, safety, and price aren't opinion based and can be answered objectively. – kasperd Jun 10 '18 at 08:43
  • @kasperd: I kind of disagree, but I also see this as a "capacity planning" duplicate. Anyway, I've reopened it. – Sven Jun 10 '18 at 08:47

1 Answer

In terms of scalability the question boils down to whether the individual application can be scaled horizontally. If a Django application isn't keeping any state which needs to be preserved between requests, then you can simply create as many replicas of your VM as you need to serve the users.

If none of your Django applications is keeping such state inside of the application then you can scale up by adding more VMs regardless of whether the VMs themselves host one or many applications.
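As a rough illustration of what "keeping state inside of the application" means, the sketch below contrasts a handler that counts visits in process memory with one that keeps the count in a store shared by all replicas. Both classes are hypothetical, and the plain dict only stands in for a real database or cache, where the increment would have to be atomic.

```python
class InProcessCounter:
    """State lives in one process, so each replica only sees its own visits."""

    def __init__(self):
        self.visits = 0

    def handle(self):
        self.visits += 1
        return self.visits


class SharedStoreCounter:
    """State lives in a store shared by every replica (a dict stands in for it)."""

    def __init__(self, store):
        self.store = store

    def handle(self):
        # A real backend would need an atomic increment here.
        self.store["visits"] = self.store.get("visits", 0) + 1
        return self.store["visits"]


# Two replicas of each kind, each receiving one request.
replica_a, replica_b = InProcessCounter(), InProcessCounter()
in_process_results = (replica_a.handle(), replica_b.handle())

shared = {}
replica_c, replica_d = SharedStoreCounter(shared), SharedStoreCounter(shared)
shared_results = (replica_c.handle(), replica_d.handle())
```

The in-process replicas disagree about the total, which is the kind of behavior that prevents simply adding more VMs; the shared-store replicas agree because the application itself holds no state.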

If any of the applications do need to keep data across requests and cannot easily be scaled horizontally across more than a single VM, then the best you can do initially is to provision those specific applications with a single VM per application. That way they won't jeopardize the scalability of the others.

It could be that some of your Django applications use a database as a backend for storing data, in which case the same question applies to the database as well. Horizontal scaling at the database layer is going to be harder, because at that layer you definitely have persistent data that needs to be replicated across your instances.

I would recommend a separate database per application unless you have some specific data that needs to be shared between the applications. You should also investigate whether the cloud provider offers a database service that covers your needs, so that you won't have to administer your own database.
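One way a database per application could look in Django settings is sketched below. The helper name, the naming scheme, the host, and the choice of PostgreSQL are all assumptions for illustration; only the `DATABASES` layout and the `django.db.backends.postgresql` engine string come from Django itself.

```python
def database_settings(app_name):
    """Build a Django-style DATABASES dict with one database per application.

    Hypothetical helper: the naming scheme and host are made up for illustration.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": f"{app_name}_db",
            "USER": f"{app_name}_user",
            # The password is deliberately left out; supply it from the environment.
            "HOST": "db.internal.example",
            "PORT": "5432",
        }
    }


# Each application's settings module would call this with its own name.
DATABASES = database_settings("blog_one")
```

With a managed database service, the host would typically be the endpoint the provider gives you rather than a machine you run yourself.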

In terms of security it is generally considered better to have more isolation. So from that perspective you should choose separate VMs per application.

In terms of flexibility there is also an advantage to separate VMs per application. Maybe you'll find you need a specific configuration of your VMs to support one of the applications and another configuration for another application. If you use separate VMs you have the flexibility to do that if you need to.

All of the arguments so far favor separate VMs for each application. Now let's take a look at what that costs.

For redundancy I'd recommend that you have at least 3 VMs in each pool. With a pool for each of your 1000 applications that's 3000 VMs. On top of that you need to provision enough VMs for any of the applications which actually receive enough traffic to need more than a single VM to handle all of the traffic.

If instead you went with a single pool for all of your applications you just need to provision enough VMs to handle all of the traffic plus a small number for redundancy.

What this means is that you are going to pay the cost of 2000-3000 additional VMs for the benefits listed above. Whether that's a reasonable price is for you to decide. Before you make that decision, I recommend you test how many VMs you would need to handle the load in the first place. Whether that number turns out to be 100 or 10000 is going to make a lot of difference to whether you want to spend another 3000 VMs.
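The comparison can be put into a small back-of-the-envelope calculation. The capacity of one VM (1,000,000 hits per day) and the two spare VMs for the shared pool are invented numbers for the sake of the example; you would replace them with what your own load tests show.

```python
import math


def vms_per_app_pools(daily_hits, capacity_per_vm, min_pool=3):
    """One pool per application, each with at least min_pool VMs for redundancy."""
    return sum(max(min_pool, math.ceil(hits / capacity_per_vm)) for hits in daily_hits)


def vms_shared_pool(daily_hits, capacity_per_vm, spare=2):
    """A single pool sized for the total traffic plus a few spare VMs."""
    return math.ceil(sum(daily_hits) / capacity_per_vm) + spare


# 1000 applications at 10k hits per day each (the upper end given in the question).
hits = [10_000] * 1000
assumed_capacity = 1_000_000  # hits per day one VM can serve; an assumption, measure it

separate = vms_per_app_pools(hits, assumed_capacity)  # 3000
shared = vms_shared_pool(hits, assumed_capacity)      # 12
```

Under these assumed numbers almost all of the 3000 VMs in the per-application layout exist for isolation and redundancy rather than for capacity.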

kasperd