11

I'm a developer at heart - but every now and then, a customer doesn't have a decent DBA to deal with these issues, so I'm called in to decide....

What are your strategies / best practices when it comes to dealing with a reasonably sized SQL Server database (anything larger than Northwind or AdventureWorks; roughly 2-4GB of data plus indexes etc.) - do you use multiple files / filegroups?

If so: how many? And why?

What are your criteria to decide when to move away from the "one filegroup for everything" approach:

* database size?
* database complexity?
* availability / reliability requirements?
* what else?

If you use multiple file groups, how many do you use? One for data, one for index, one for log? Several (how many) for data? What are your reasons for your choice - why do you use that exact number of filegroups :-)

Thanks for any hints, pointers, thoughts!

Cheers, Marc

marc_s
  • 1,132
  • 4
  • 21
  • 35

2 Answers2

16

The basic rule of thumb is to separate files onto different volumes to avoid contention, however the anmount of performance gain you get varies wildly by I/O subsystem and workload. For instance, multiple files on a single physical spindle is going to suck as far as performance goes, but the same arrangement with the volume being on a SAN LUN with several hundred drives from RAID 10 arrays may be just fine. Disk queue length counters are your friend as the simplest way to tell if you've got an I/O bottleneck.

You're looking at the I/O patterns on the databases - read-only, read-mostly, read-write, write-mostly, write-only - and basing things on that. You also need to choose the right RAID level and make sure your disk partition offsets, RAID stripe size, and NTFS allocation unit size are correctly set. Some people like to separate nonclustered indexes into a separate filegroup , but performance gains here vary just as I've explained above.

As well as performance, you should consider manageability and recovererability. Having a single, monolithic data file for a 100GB database means that your unit of restore is that file. Having it split into 4 25GB filegroups means you can use partial database availability and piecemeal restore to only have to restore a single filegroup in the event it gets damaged. By partitioning tables and indexes in multiple filegroups you can also limit which parts of the database are affected by maintenance operations (e.g. index fragmentation removal).

Tempdb is a whole special case, and I'll point you at a blog post of mine that explains all about why and how to split up tempdb - there are lots of misconceptions out there.

Without giving you a 'sweeping generalization' recommendation here, I'll point you at a bunch of whitepapers and blog posts for you to read:

Hope this helps you!

Paul Randal
  • 7,184
  • 1
  • 35
  • 45
  • +1 thanks so much, Paul - great post, great links - excellent – marc_s Jun 21 '09 at 12:17
  • Great answer Paul -> I was trying to find some previously asked questions about SqlServer and hard disk design (eg. TempDB on Bus1_Disk1, My_DB on Bus2_Disk1, etc..).. Time to read.... – Pure.Krome Jun 30 '09 at 06:34
4

The decision to split up a database up in different filegroups should be taken after having analyzed the current size and future growth of your tables. In my opinion unless you have a large database or tables with millions of rows, you should carefully consider pros and cons, since you may end up creating more performance problems than you fix.

There are some scenarios which could be interesting under certain premises:

  • 2 filegroups: data and index
  • 3 filegroups: read-only tables, read-write tables, index
  • multiple filegroups: read-only, read-write, index, key table 1, key table 2, ...

You have to analyze your environment to decide if filegroups will help with your SQL Server growth, usage and performance needs.

Some key indicators to move to multiple filegroups (from this article):

  • When disk queuing is causing application and user experience issues
    • If this is the case, consider leveraging additional disk drives with new filegroups housing IO intensive tables
  • When particular tables are 10% or more of the database
    • If this is the case, consider moving these particularly large tables to separate filegroups on separate underlying disk drives
    • Depending on the table size in proportion to the remainder of the tables, consider building a filegroup for individual table(s)
  • When non clustered index and data space are equal on large tables
    • If this is the case, consider splitting the data and clustered index from the non-clustered indexes
  • When an almost equal percentage of read-only and read-write data exist in the database
    • If this is the case, consider splitting the read-only data in a separate filegroup as the read-write data
  • When insufficient time is available to perform database maintenance
    • If this is the case, consider splitting the large tables into separate filegroups on different underlying disks and perform maintenance in parallel
  • When the business or application will be changing significantly and the data is going to grow at a much higher rate
    • If this is the case, consider working with the users to understand the potential growth
  • When archived data resides in the same database as the production data
    • If this is the case, consider separate file groups or one or more of the techniques in this tip - Archiving Data in SQL Server

If you find that filegroups could be improve your database's performance, write the code and test the process in a staging environment before you implement the changes on your production servers. Prepare some measurements before you implement the changes and compare them before/after. Since these processes can be very resource intensive and time consuming, perform these procedures during a maintenance period.

Don't forget, when creating new objects (tables and indexes), be sure that the objects are created in the correct filegroup to ensure expected performance and periodically validate the database objects are in the correct filegroups and correct as needed.

splattne
  • 28,348
  • 19
  • 97
  • 147