8

I've been running a Debian server since 2018, and recently I've noticed that its functions have expanded to the point that it has become quite "monolithic".

This server runs an instance of Nextcloud, Gitea, Roundcube Webmail, YOURLS and two WordPress sites - plus some other things that don't use SQL.

I also have a separate (but much smaller) server on my network that's not accessible from the internet (LAN only) and I've been considering offloading the SQL server to it.

Are there any advantages and/or disadvantages to this?

Server Fault: This is not an opinion question. Latency, SQL caching behaviour, cybersecurity implications and many other factors are measurable metrics.

  • How comfortable are you with added complexity? The packaging of each of those tools may assume the DB is local, and need a lot of twiddling to push updates. – Criggie Sep 17 '22 at 22:30
  • Not an expert in performance tuning of databases, but have you considered deploying those services as containers? Generally, each service would have its own database container as part of the compose/swarm/kubernetes configuration. I would consider this isolation and separation of higher importance here. But others can weigh in. – GCon Sep 18 '22 at 05:10
  • @GCon Unfortunately, that's not going to work in this circumstance. I tried Docker before, but it binds itself to ports 80 and 443, meaning each different site has to use other ports, and `https://example.com:3443` doesn't look particularly professional. – Brandon Powell Sep 18 '22 at 11:28
  • @Criggie Luckily, WordPress, Nextcloud, YOURLS and I think Gitea all have configuration options for external SQL server hosts - so the added complexity wouldn't be very severe. (A sketch of what that looks like for WordPress follows after these comments.) – Brandon Powell Sep 18 '22 at 11:30
  • @BrandonPowell - point taken. But you should be using a reverse proxy (e.g. nginx) in front of the individual services. This will allow you to host as many services as you want, listening on ports 80 and 443, possibly with SSL termination for each service all in one proxy. You will need to differentiate between the services by using a server name/subdomain (e.g. gitea.example.com or example.com/gitea - you can do both with nginx). As an added bonus, you get logging and other advantages with this approach. (A minimal sketch follows below.) – GCon Sep 18 '22 at 17:52
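
To illustrate the external-host option mentioned in the comments: for WordPress this is a one-line change in `wp-config.php`. A minimal sketch, assuming a hypothetical database server at `192.168.1.20` (the credentials are placeholders):

```php
// wp-config.php - point WordPress at a remote MySQL/MariaDB server
// instead of the local socket. 192.168.1.20 is a hypothetical LAN address.
define( 'DB_NAME',     'wordpress' );
define( 'DB_USER',     'wp_user' );
define( 'DB_PASSWORD', 'change-me' );
define( 'DB_HOST',     '192.168.1.20' ); // was 'localhost'
```

Nextcloud, Gitea and YOURLS expose equivalent host settings in their own configuration files.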
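
And a minimal sketch of the reverse-proxy setup GCon describes, assuming a hypothetical Gitea backend listening on `127.0.0.1:3000` (server names, ports and certificate paths are placeholders, not the OP's setup):

```nginx
# One nginx instance owns ports 80/443 and routes by server name.
server {
    listen 443 ssl;
    server_name gitea.example.com;

    ssl_certificate     /etc/letsencrypt/live/gitea.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/gitea.example.com/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3000;      # backend bound to localhost only
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
# Repeat a similar server block per service (nextcloud.example.com, etc.),
# so no backend ever needs a public port of its own.
```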

2 Answers

9

As usual, the only way to see exactly how your environment behaves is to try the change within its constraints. In short, latencies will increase when queries have to cross a network link rather than a local socket. This may or may not be compensated for by having more resources available to the database engine, depending on the rest of the server's load.

In a small business environment with low transaction rates and few concurrent users it would surprise me if you’d see a lot of improvement by separating the database to its own host unless the machine where it currently resides is struggling. Moving it to a significantly weaker server may even be counterproductive.

Mikael H
  • Thanks for responding. I don't know that there'll be a significant latency addition by offloading SQL, because I have a gigabit-capable network and both servers have a direct wired connection straight to the router. The only caveat is that, interestingly, my bigger server (PowerEdge T40) only has a 100 Mbit/s Ethernet port - but that can be solved easily enough. The T40 doesn't struggle in most cases, but Nextcloud in particular makes a lot of DB queries per page loaded, so it spends a noticeable amount of time waiting on query results. – Brandon Powell Sep 17 '22 at 12:45
  • @BrandonPowell gig NICs are cheap and common - aim for an Intel chipset and avoid anything Realtek or Broadcom if you can. If you have a managed switch, consider adding two NICs to the server to allow for a bonded interface later. – Criggie Sep 17 '22 at 22:33
  • @BrandonPowell: I don't know if you're confusing bandwidth and latency. Round-trip ping time on localhost is an order of magnitude or two lower than to another host on a LAN one hop away on an unmanaged gigabit switch - like 0.013 ms vs. 0.26 ms for me, a factor of 20. The absolute latency might still be low enough not to be a factor in human-perceptible terms, though. If a page load takes 100 queries, and they're dependent (the code making the query waits for the result of one before it figures out what to ask next, or in practice before it *does* ask again), that's 26 ms of network round trips. (A quick way to measure this on your own network is sketched after these comments.) – Peter Cordes Sep 18 '22 at 05:36
  • @BrandonPowell: Of course, if the code making queries can fire off multiple queries before waiting to collect the result of some of them, multiple can be in the pipeline. Then it's mostly a bandwidth and throughput concern, not latency. But serial dependencies in the logic, and/or synchronous code in the implementation, can make latency matter by serializing the round-trip latencies, not overlapping them. – Peter Cordes Sep 18 '22 at 05:38
  • @PeterCordes Sorry for the confusion. My intention was to convey that bandwidth is not a problem on this network, and that latency is as low as it can practically be (direct to the router, no *external* switch). A few ms of added latency probably wouldn't make much difference, especially if something like an SQL cache on the separate server can reduce the time taken for an actual query to be answered. I tried SQL caching on my Dell server, but for some reason that I admittedly don't fully understand, it didn't play nicely with the multiple SQL-reliant sites. – Brandon Powell Sep 18 '22 at 11:26
  • Latency on a local network is not a few ms - it is at the fraction-of-a-ms level, and a pro-level switch is below 100 USD. Are you SURE that the T40 only has 100 Mbit? Because seriously, I cannot REMEMBER seeing a 100 Mbit port on even a consumer motherboard - and that means going back to the time the EURO was put into existence. Either Dell really sells crap these days or this must be such an ancient machine your phone would run circles around it. The 100 Mbit port sounds AWFULLY off; those just have not existed for a LONG time. Also, bandwidth: 10 Gbit is CHEAP. – TomTom Sep 18 '22 at 16:33
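
For anyone who wants to put numbers on the latency point raised in the comments above, plain `ping` gives a quick comparison of local vs. LAN round-trip time (the LAN address below is hypothetical; compare the `avg` figures in the summary lines):

```sh
# Round trip to the local loopback vs. to the would-be database host.
# Multiply the difference by the number of dependent queries per page
# to estimate the added wait per page load.
ping -c 20 -q localhost
ping -c 20 -q 192.168.1.20
```
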
4

This response is not specific to WordPress, but applies equally to all web applications.

The common benefits of offloading the database (or any component) are that you can manage the scale, redundancy and security of that resource independently of the other resources.

In most implementations we do not offload the database; instead we offload the web or application host. This is because it is traditionally harder to manage horizontal scaling and data synchronization across a cluster of database servers, so we scale database servers vertically first (by increasing resources), whereas web applications and web hosting infrastructure are designed to scale out horizontally first and are generally more economical to scale that way.

Many enterprise RDBMS engines achieve their performance by aggressively pre-allocating or reserving memory, which reduces the memory available to the web hosting infrastructure on a shared machine. A single-instance database can usually handle a significantly larger number of users than a single-instance web front-end (WFE). If you only scale up (by increasing resources), you may eventually find that the database engine is underutilized (due to the pre-allocation) while you stop noticing improvements in WFE performance. At that point you have no choice but to scale out horizontally.
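
To make the pre-allocation point concrete with the engine behind the OP's stack: MySQL/MariaDB's InnoDB buffer pool is claimed up front and competes directly with the web server and PHP for RAM on a combined host. A minimal sketch, with a hypothetical size:

```ini
# /etc/mysql/conf.d/buffer-pool.cnf (path varies by distro; size is hypothetical)
# InnoDB claims this memory at startup, whether or not the workload needs it yet.
[mysqld]
innodb_buffer_pool_size = 4G
```

On a dedicated database host you can size this close to the available RAM; on a combined host you have to leave headroom for everything else.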

There are often licensing concerns around scaling, too: many enterprise database licenses are charged per CPU, sometimes per core. Even disregarding setup and maintenance costs, increasing the database instance or core count can be cost-prohibitive on a licensing basis alone if that capacity ends up underutilized.

The licensing argument goes the other way too: if your workload is quite small, then you can reduce costs by offloading your database to an existing database server, or at least by pooling multiple databases on the same server.

There is also the security factor: you gain an extra layer of security by not hosting the database server at the edge of the network with the web front-end. Instead, the database server can sit fully protected behind the firewall.
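
A minimal sketch of what that looks like for the MariaDB/MySQL engines in the OP's stack (addresses are hypothetical): bind the daemon to the internal interface only, so it is never reachable from outside the LAN.

```ini
# /etc/mysql/mariadb.conf.d/50-server.cnf (path varies by distro)
[mysqld]
bind-address = 192.168.1.20   # the LAN-only interface, not 0.0.0.0
```

Combined with an account grant restricted to the web host (e.g. `GRANT ALL ON wordpress.* TO 'wp_user'@'192.168.1.10';`), a compromise of the public-facing server does not automatically expose the database machine itself.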

Although it is possible to run all processes on the same server, there is considerable RISK in doing so. I would not recommend single-server deployments unless they are for non-critical, internal usage scenarios with a low number of active concurrent users.

Same-server deployments are also harder to reinstall and reconfigure after a failure: you can restore the entire server image (if you have one), or you can try to reinstall and configure everything manually, but this can be a tedious process and may require a lot of individual components to be fine-tuned.

As your user base grows, at some point you will need to scale. You may not need to make that decision now, but the time will hopefully come. If you separate your database and WFE components earlier in your project, you will be more cloud-ready, and in a position to individually troubleshoot or improve the performance of the specific processes that need it, when it is needed, with little or no effect on the rest of the system.

In the modern cloud-first era, a single all-in-one combined server deployment is reserved for early-stage proof-of-concept implementations. I can't think of any commercial scenario to host the database and web host on the same server instance. The risk that your entire database is compromised or corrupted is simply too high: for the WFE to work you must expose it to the world, but you do not want your database available to everyone, and that exposure could easily lead to regulatory compliance issues.

If your workload is not of commercial scale, or you are utilising resources that do not actually cost you anything in time or money, then perhaps you don't need to change, and a single combined server can work for you. The OP asked what the benefits are, seeking insight into why you would do this at all.

In terms of disadvantages, the cost structure is obviously a bit more complicated when there are multiple resources that you must pay for, but that is really it. In general it is cheaper, in both raw cost and time, to maintain a WFE independently of a database server; having the two workloads tightly coupled on the same server introduces a lot of OS-level dependency issues that get harder to manage over time. The point of splitting the resources so that they can be independently deployed is that it gives you the freedom to manage them to find a balance between performance and cost. If hosting separately costs you more and you are not happy with the performance, then scale back - or perhaps you are doing it wrong. Splitting resources doesn't instantly mean that you will save money, only that you can.

  • 3
    _"I can't think of any commercial scenario to host the database and web host on the same server instance"_ - the entire shared hosting community (i.e. WordPress and other small websites) would want a word. Not every site becomes the next Stack Overflow, not everybody needs to scale, not every customer wants to spend more than $5 a month. I'm not defending the practice, but in practice, it happens a lot. – CodeCaster Sep 18 '22 at 21:05
  • Often this is driven by the high per-core SQL Server licensing costs, making it better to reduce cores on the database server even if it increases overall hardware costs. – Ian Ringrose Sep 18 '22 at 22:22
  • The reason to scale down is also a big reason to split a database off from the web host, so you can pool resources too; it goes both ways. You would want to pool resources that have similar workloads: share multiple databases on a single server, host multiple sites on a single web server, and deploy your code in a way that keeps you ready for these things. Both your runtime costs and maintenance costs can be reduced in the longer term. – Chris Schaller Sep 19 '22 at 00:35
  • @CodeCaster I've updated the post; I didn't mention that deploying separately makes it easier to reduce costs for small or micro workloads too, or to share the resources across multiple sites - that is how you can get to $5/month TCO or less. It is about being able to manage cost vs. performance. If you're not at a scale where it matters, then we're not really talking about a commercial workload of user traffic. All-in-one is generally more expensive than it would be to split the workloads for the same performance, perhaps even at the $5 end of the market. – Chris Schaller Sep 19 '22 at 01:01