11

We design websites for Realty companies. The websites are used only to display the information and all the websites share a common template. We have around 150 websites for different customers. Some third party data providers, provide us all the updates about each listing on the website on hourly basis. The updates for each customer happens during separate hours. On average we have 1000 listings per website. And on every hourly update 80% of the data is changed or updated.

Right now we have a single database in Sql server 2008, for all the customers(Designed initially to cater 10-20 websites). The tables in the database are shared by all. The problem is whenever an update happens, it slows down other customers websites also, that are not at all related to the update. Also deleting a customers data, slows down all the sites.

I am planning to remodel the database by creating a separate schema for each customer, but I am not sure if it is the best way to handle our problem. Having a separate database creates many problems for maintenance(Backups, mirroring etc). Can anyone suggest me a better way to handle this issue. I am not sure how it affects the performance, if I create a separate schema for each customer and isolate their tables form others. Or is there any better solution?

John Gardeniers
  • 27,262
  • 12
  • 53
  • 108
kishore
  • 852
  • 3
  • 13
  • 27

5 Answers5

12

My personal preference would be for multiple databases as it allows you to see which databases use the most IO, and RAM, with a variety of DMVs - and it allows you to easily migrate the biggest databases off elsewhere down the line. Additionally, there's a security separation which is always reassuring to clients.

Brent Ozar covered this very issue in a recent blog post - definitely worth a read as he's very good: http://www.brentozar.com/archive/2011/06/how-design-multiclient-databases/

Peter Schofield
  • 1,639
  • 9
  • 11
  • 1
    +1 - These are separate clients, so they should have separate databases and separate sets of DB users (DB Owner, DB read/write, DB read-only) for security reasons if nothing else – voretaq7 Aug 18 '11 at 20:44
  • Thanks @Peter. None of our clients update the data. All the websites run with same server login with minimum permissions to read. The third party data providers, provide us the data, in csv or some other format and we run DTS packages to update the database. Clients donot have access to sql server at all. We have around 150 customers and if I create a database for each customer, then I will end up with 150 databases, that adds a lot of maintenance overhead. Also all the websites share a common template, so database queries do not change for each client. – kishore Aug 18 '11 at 21:08
  • Thanks for sharing the article @Peter. Its is very good. But in our case security is the least concern. We are only worried about performance. – kishore Aug 18 '11 at 21:26
  • 3
    @kishore the maintenance of 150 databases is not a big deal. Scripting and automation should allow you to treat 2 databases and 150 databases about the same. You gain the ability to put "hot" databases on their own disks or even move them to another server. The issue I've come across is trying to accommodate a backup schedule across a large number of databases, and it is a potential issue for you if you need the databases to be transactionally consistent. Otherwise, do not be scared of maintenance. I would argue that a lot of your maintenance will be easier, not harder. – Aaron Bertrand Aug 20 '11 at 17:20
  • I would like to know any performance issue there by using this much schema for intercommunicating the tables in one MSSQL database? That means issue in memory, CPU, or any other resources. Thank you. – felix Sep 17 '14 at 09:50
6

Although I'd probably recommend going for a database per customer, if you want to stick to one database then it looks like schemas will be suitable given your requirements.

However, in addition to a schema per customer I would recommend that you also create a filegroup per customer, then place all the database objects for each customer into that filegroup. There are several advantages to this:

  1. Better database availability. If there is corruption in a data file belonging to one customer, it is still possible to bring the rest of the database online. The corrupt data file can then be restored from a backup even whilst other transactions are taking place.
  2. (Potentially) better performance. As the system grows, you could place filegroups onto different storage devices and spread out your I/O (of course you can do something similar with one filegroup and multiple files, but this will let you split IO by customer).
  3. Ease of backups. Database backups can occur at the filegroup level, giving you more flexibility on when to back up the tables for each customer.
  4. Better restore granularity. Similar to #1, when restoring the database from scratch, once the primary filegroup is online you can begin to restore the filegroups individually, starting with your most important customer. As subsequent restores will not have any impact on already restored filegroups, you can bring your database online piece by piece rather than in one foul swoop. This is called an online piecemeal restore and is an enterprise edition only feature (although it is possible to do something similar in standard edition, but it requires the database to be offline).

I believe the maximum number of filegroups you can have is 32,000-ish so this strategy should last until you really do need to think about splitting into different databases.

For more information I recommend the books online article 'Files and Filegroups Architecture' and the SQLCAT whitepaper on partial database availability: http://sqlcat.com/sqlcat/b/whitepapers/archive/2007/11/21/partial-database-availability.aspx

Rob Watkins
  • 191
  • 5
3

If all customers share the same database, and all websites use the same template, this is more like a multi-tenant cms, and as such having a single database actually makes perfect sense here.

I would not consider the problem to be one customers updates affecting all the others, but rather a lower level of simply slow updates and queries in general. Assuming the hardware and load doesn't change then 150 databases is likely to perform similarly to the single database solution (possibly slower under certain circumstances)

I assume your major table is the 'listings' table that has approx 150*1000 live rows (presumably + historical records).

This is not a massive amount of data by a long shot, and a reasonably specced machine should have no issue.

I'd be looking at ensuring historical records are moved to their own tables. On the read query, I'd add 'read uncommitted snapshot'

On the updates, there are a few possible fixes for this 1. Use a merge rather than multiple update statements 2. Use a cursor to prevent lock esculation 3. Add new records for the updated listings, and just update a 'is current' bit flag on the old ones

Given that this was 2 years ago, what was the eventual solution?

stevenrcfox
  • 131
  • 3
1

It sounds like you're more concerned with keeping everything in one database whatever the cost (though you did bring it up), since Peter's main point for breaking into separate databases was related to performance and being able to scale in the long run. If that's the case then you would want to either split tables out into different schemas for customers OR switch to an Enterprise Edition license so that you can use table partitioning and partition the shared tables based on the customer number.

Jason Cumberland
  • 1,559
  • 10
  • 13
1

I asked myself the same question when i planned to restructure my database infrastructure. This article helped me very much.

[https://msdn.microsoft.com/en-us/library/aa479086.aspx][1]