30

There's a lot of contradictory information about Unix server partitioning out on the internet, so I need some advice on how to proceed.

So far, on the servers in our test environment, I didn't really care about partitioning and configured a single monolithic / plus a swap partition. This partitioning scheme doesn't seem like a good idea for our production servers. I have found a good starting point here, but it seems very vague on the details.


Basically I have a server on which I will be running a basic LAMP stack (Apache, PHP, and MySQL). It will have to handle file uploads (up to 2GB). The system has a 2TB RAID 1 array.

I plan to set up:

/         100GB
/var     1000GB (Apache and MySQL files will be here)
/tmp      800GB (handles the PHP temp files)
/home      96GB
swap        4GB

Does this sound sane, or am I over-complicating things?

voretaq7
Buzut
  • 1
    What is your end goal? What exactly are you trying to accomplish? – Scott Pack Nov 16 '12 at 12:59
  • 9
    However you decide to carve it up, I would suggest using LVM to define your partitions and then allocate space conservatively, leaving some disk space unallocated. Then when you decide you need more space somewhere, you can just extend the LV and filesystem. – ktower Nov 16 '12 at 16:08

5 Answers

34

One thing to keep in mind when laying out your partitions is failure modes. Typically that question is of the form: "What happens when partition x fills up?" Dearest voretaq7 brought up the situation with a full / causing any number of difficult-to-diagnose issues. Let's look at some more specific situations.

What happens if your partition storing logs is full? You lose auditing/reporting data, and attackers sometimes fill the logs deliberately to hide their activity. In some cases your system will not authenticate new users if it can't record their login event.

What happens on an RPM based system when /var is full? The package manager will not install or update packages and, depending on your configuration, may fail silently.

Filling up a partition is easy, especially when a user is capable of writing to it. For fun, run this command and see how quickly you can make a pretty big file: cat /dev/zero > zerofile.
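Since the failure modes above all start with a filesystem quietly reaching 100%, a small check like the following can help spot trouble early. This is only a minimal sketch: the `usage_pct` function name, the `THRESHOLD` variable, and the list of mount points are illustrative assumptions, not something from a particular hardening guide.

```shell
#!/bin/sh
# Warn about filesystems at or above a usage threshold.
# THRESHOLD and the mount point list are illustrative assumptions.
THRESHOLD=${THRESHOLD:-90}

# usage_pct PATH: print the integer percentage of space used on the
# filesystem that holds PATH, taken from the POSIX-format `df -P` output.
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

for mp in / /var /var/log /tmp /home; do
    [ -d "$mp" ] || continue              # skip paths that do not exist here
    pct=$(usage_pct "$mp")
    case "$pct" in
        ''|*[!0-9]*) continue ;;          # skip anything df could not report
    esac
    if [ "$pct" -ge "$THRESHOLD" ]; then
        echo "WARNING: $mp is ${pct}% full"
    else
        echo "ok: $mp is ${pct}% full"
    fi
done
```

If a directory is not a separate mount point, `df` reports the filesystem that contains it, so on a single-/ system every line shows the same figure.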

It goes beyond filling up partitions as well: when you place locations on different mount points you can also customize their mount options.

What happens when /dev is not mounted with noexec? Since /dev is typically assumed to be maintained by the OS and to contain only devices, it was frequently (and sometimes still is) used to hide malicious programs. Leaving off noexec allows you to launch binaries stored there.

For all these reasons, and more, many hardening guides will discuss partitioning as one of the first steps to be performed. In fact, if you are building a new server how to partition the disk is very nearly exactly the first thing you have to decide on, and often the most difficult to later change. There exists a group called the Center for Internet Security that produces gobs of easy to read configuration guides. You can likely find a guide for your specific Operating System and see any specifics they may say.

If we look at Red Hat Enterprise Linux 6, the recommended partitioning scheme is this:

# Mount point           Mount options
/tmp                    nodev,nosuid,noexec
/var                    
/var/tmp                bind (/tmp)
/var/log
/var/log/audit
/home                   nodev
/dev/shm                nodev,nosuid,noexec

The principle behind all of these changes is to prevent them from impacting each other and/or to limit what can be done on a specific partition. Take the options for /tmp for example. What that says is that no device nodes can be created there, no programs can be executed from there, and the set-uid bit can't be set on anything. By its very nature, /tmp is almost always world writable and is often a special type of filesystem that only exists in memory. This means that an attacker could use it as an easy staging point to drop and execute malicious code, then crashing (or simply rebooting) the system will wipe clean all the evidence. Since /tmp doesn't need any of those capabilities to do its job, we can easily disable the features and prevent that situation.
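To see whether a running Linux system actually has these options applied, something along these lines can help. It is a rough sketch that assumes a Linux `/proc/mounts`; the helper names `has_opt` and `mount_opts` are made up for this example.

```shell
#!/bin/sh
# has_opt OPTIONS OPT: succeed if OPT is one of the comma-separated OPTIONS.
has_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# mount_opts MOUNTPOINT: print the options of the filesystem mounted at
# MOUNTPOINT, read from /proc/mounts (Linux only). Prints an empty line
# if nothing is mounted exactly there.
mount_opts() {
    awk -v mp="$1" '$2 == mp { opts = $4 } END { print opts }' /proc/mounts
}

for mp in /tmp /dev/shm; do
    opts=$(mount_opts "$mp")
    if [ -z "$opts" ]; then
        echo "$mp: not a separate mount point"
        continue
    fi
    for opt in nodev nosuid noexec; do
        if has_opt "$opts" "$opt"; then
            echo "$mp: $opt set"
        else
            echo "$mp: $opt MISSING"
        fi
    done
done
```

Matching against the comma-delimited list rather than using a plain substring search avoids false positives on options that merely contain the name being looked for.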

The log storage places, /var/log and /var/log/audit, are carved off to help buffer them from resource exhaustion. Additionally, auditd can perform some special things (typically in higher security environments) when its log storage begins to fill up. By placing it on its own partition this resource detection performs better.

To be more verbose, and quote mount(8), this is exactly what the above used options are:

noexec Do not allow direct execution of any binaries on the mounted file system. (Until recently it was possible to run binaries anyway using a command like /lib/ld*.so /mnt/binary. This trick fails since Linux 2.4.25 / 2.6.0.)

nodev Do not interpret character or block special devices on the file system.

nosuid Do not allow set-user-identifier or set-group-identifier bits to take effect. (This seems safe, but is in fact rather unsafe if you have suidperl(1) installed.)

From a security perspective these are very good options to know since they'll allow you to put protections on the filesystem itself. In a highly secure environment you may even add the noexec option to /home. It'll make it harder for your standard user to write shell scripts for processing data, say analyzing log files, but it will also prevent them from executing a binary that will elevate privileges.

Also, keep in mind that the root user's default home directory is /root. This means it will be in the / filesystem, not in /home.

Exactly how much you give to each partition can vary greatly depending on the system's workload. A typical server that I've managed will rarely require human interaction, and as such the /home partition doesn't need to be very big at all. The same applies to /var since it tends to store rather ephemeral data that gets created and deleted frequently. However, a web server typically uses /var/www as its playground, meaning that either that needs to be on a separate partition as well or /var needs to be made big.

In the past I have recommended the following as baselines.

# Mount Point       Min Size (MB)    Max Size (MB)
/                   4000             8000
/home               1000             4000
/tmp                1000             2000
/var                2000             4000
swap                1000             2000
/var/log/audit       250

These need to be reviewed and adjusted according to the system's purpose, and how your environment operates. I would also recommend using LVM and against allocating the entire disk. This will allow you to easily grow, or add, partitions if such things are required.
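As a rough illustration of the LVM approach, the sketch below prints the commands one might use to create conservatively sized volumes and grow one later. It is a dry run by design. The volume group name `vg0`, the logical volume names, and the `run` wrapper are hypothetical, and the sizes are simply the upper baselines from the table above.

```shell
#!/bin/sh
# Dry-run sketch: prints LVM commands instead of executing them.
# vg0 and the LV names are hypothetical placeholders.
VG=vg0

run() {
    # Replace the body with "$@" to execute for real (requires root).
    echo "+ $*"
}

# Create logical volumes from the baseline sizes, leaving the rest of
# the volume group unallocated for later growth.
run lvcreate -L 8000M -n root  "$VG"
run lvcreate -L 4000M -n home  "$VG"
run lvcreate -L 2000M -n tmp   "$VG"
run lvcreate -L 4000M -n var   "$VG"
run lvcreate -L 2000M -n swap  "$VG"
run lvcreate -L 250M  -n audit "$VG"

# Later, when /var runs short: grow the LV by 2 GiB and resize the
# filesystem on it in the same step (-r is --resizefs).
run lvextend -r -L +2G "/dev/$VG/var"

# Show how much unallocated space is left in the volume group.
run vgs "$VG"
```

Each new volume still needs a filesystem (or `mkswap` for swap) and an `/etc/fstab` entry before use; those steps are left out here to keep the sketch short.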

nickgrim
Scott Pack
  • 1
    The `noexec` observation is an important one in general -- It's considered good practice to mount `/tmp` with the `noexec` flag to avoid malicious users uploading rootkits through browser security exploits. Similarly `/home` is often mounted `nosuid` since there's no reason for setuid binaries to be there. Re: `/dev` and `noexec`, on many (though not all) modern systems `/dev` is often a `devfs` filesystem and won't let users create/store regular files at all (On FreeBSD it returns "`Operation not supported`", on Ubuntu the `udev` filesystem mounted on `/dev` lets you create regular files.). – voretaq7 Nov 16 '12 at 20:31
  • 2
    @voretaq7: Yeah, using `/tmp` as a jump pad is great fun since it's always there and almost never locked down. – Scott Pack Nov 16 '12 at 20:39
  • Thank you for this advice. I'll look into noexec as it improves security! – Buzut Nov 18 '12 at 01:21
14

Ignoring the underlying RAID array (See this question for more details about RAID array levels and when you want to use them), let's concentrate on the core question you're asking:
"How should I lay out my Unix server's filesystems?"


What's wrong with one giant / partition?

As you noted in your question, a lot of Linux distributions (especially the "Desktop" distributions like Ubuntu) use a very simple filesystem layout: / and [swap].

This scheme has the advantage of simplicity -- it's great for DOS/Windows users who are used to their home PC with "the hard drive" as one big monolithic container (C:\) into which you dump stuff, and you don't have to worry about running out of space on filesystems -- just make sure you stay under the disk's capacity and everything is (at least theoretically) fine.

The single-filesystem scheme has several disadvantages though - the most often cited disadvantage is that Unix systems tend to react very badly when the root filesystem fills up (to the point of refusing to boot), and if everything is writing to / (the root) one wayward program or user can take down the whole system.
A single large filesystem is also prone to being a total loss in the event of a system crash and subsequent filesystem corruption.

The issues above, plus a strong sense of organization, are why Unix servers typically have multiple filesystems.


How do you break up the Unix filesystem?

So hopefully you're convinced that having multiple filesystems makes sense. The question now is how do you break the system up into logical chunks, and how do you decide how much space each gets?
The answer is that you need to know and understand what your operating system is going to put where. The starting point for that understanding is the hier man page, which most Unix systems ship (run man hier on a Linux system or on a BSD system). That, plus your local knowledge of what the code you are installing is going to do, will guide you in creating a sane partitioning layout.

I am going to describe a general partitioning scheme here, but this scheme should always be modified to meet your specific needs.

A General Unix Partitioning Scheme

/
    The "root partition", /, does not usually need to be very large.
    It holds the basic items needed to boot the system, mount other filesystems
    and get you to a running, usable, multi-user environment.  It's also what
    is available to you when you bring up the system in single-user ("recovery")
    mode.  
    The contents of / should not change or grow substantially over time.

    NOTE: Anything that doesn't go on one of the other partitions described
          below will wind up taking space on the root partition (/).

/var
    The /var filesystem holds variable data -- log files, email, and on some
    systems databases (like MySQL or Postgres) store their data files here.  
    /var should be "Big Enough" to hold all the data you intend to cram into
    it.  I generally advise 10GB for systems that won't have a database or email
    server (just logs).  If you are building a database or mail server you
    should obviously make /var larger, or carve out separate filesystems for
    the database/mail data.

/usr
    The /usr filesystem holds "userland" programs, data, manual pages, etc.
    This is where things like the Firefox browser binary live.  On systems that
    will have a lot of large user applications this filesystem may be very large
    (100GB or more), and on stripped-down servers it may be relatively small.  
    A good rule of thumb is that the /usr filesystem should be twice as large
    as you need it to be in order to fit your initial installation of programs.

/home
    The /home filesystem holds user home directories, and on desktop systems is
    the largest and most prone to filling up.  When you download files from the
    internet, create spreadsheets, store a music library, etc. that data is
    stored in your home directory, and it adds up fast.
    It's important to allow enough room under /home for the "accumulated junk"
    you will gather over time, even on servers -- ad-hoc tarball backups, 
    package files you copied over to install, and the like.

Special Filesystems

/tmp and /var/tmp
    The temporary scratch space (/tmp) is "special" -- on most Unix systems
    the contents of /tmp are cleared on reboot, and on many modern systems
    /tmp is a special "tmpfs" (RAM) filesystem for better performance.
    /var/tmp is usually "persistent temporary files" (like vi recovery
    files), and is not cleared on reboot.
    The same general rule applies as for all other filesystems: Make sure
    your temporary scratch filesystems are big enough to hold the stuff you
    want to put in them.

[swap]
    Swap Space is used by the kernel when you are running low on RAM --
    The old general rule of thumb was to have at least twice as much swap
    as you did RAM, however on modern systems it's usually sufficient to
    have "enough" swap -- 2GB is a practical lower limit, and an amount
    between half the installed RAM and the total installed RAM is usually
    adequate.
    On modern systems with relatively huge RAM pools (12G and up) it is
    probably not practical to use the system if it's swapping heavily
    enough to warrant the old "Twice the installed RAM" rule.
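The swap rule of thumb above (at least 2GB, otherwise somewhere between half the installed RAM and the full amount) can be written down as a small calculation. This is just a sketch of that heuristic. The function name `swap_range_mb` is invented for the example, and reading `/proc/meminfo` assumes Linux.

```shell
#!/bin/sh
# swap_range_mb RAM_MB: print "LOW HIGH", a suggested swap range in MB.
# Heuristic: between RAM/2 and RAM, but never less than 2048 MB.
swap_range_mb() {
    ram_mb=$1
    low=$(( ram_mb / 2 ))
    high=$ram_mb
    if [ "$low" -lt 2048 ]; then low=2048; fi
    if [ "$high" -lt "$low" ]; then high=$low; fi
    echo "$low $high"
}

swap_range_mb 8192     # prints: 4096 8192
swap_range_mb 1024     # prints: 2048 2048

# On Linux, the installed RAM can be read from /proc/meminfo (in kB).
if [ -r /proc/meminfo ]; then
    ram_mb=$(awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo)
    echo "This machine (${ram_mb} MB RAM): $(swap_range_mb "$ram_mb") MB swap"
fi
```

Treat the result as a starting point only; a system that backs a large tmpfs with swap, for example, would need more.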
voretaq7
  • 2
    The two reasons you listed are largely obsolete today. ext[234] reserves some space for root and won't allow user programs to use it all up so the system won't run into problems with being out of space, and all modern filesystems use journaling so they won't get corrupted after a crash. – psusi Nov 17 '12 at 00:09
  • 2
    @psusi The space reserved for the root user (usually 5-10% of the filesystem size) does not help you if the root user is the one writing the files that fill up the disk (as is often the case with log files). It is also incorrect to assume that just because a filesystem is journaled it will always be safe from corruption - journaling increases robustness, but it *does not* guarantee safety (particularly if you stumble across an undiscovered bug in the filesystem/journaling code and curdle the journal - The ReiserFS folks can tell some great stories about that from that filesystem's early days). – voretaq7 Nov 17 '12 at 03:10
  • 2
    Having an intact `/usr` or `/var` doesn't help if `/` is corrupted. Likewise having an intact `/` does not help ( much ) if `/home` is corrupted. You end up having to restore from backup either way. Not to mention such failures are one in a million unless you are running a new/unstable fs. – psusi Nov 17 '12 at 04:14
5

The practice of carving up the filesystem like that is from the days when there was no software RAID and disk drives were small, so you had to use several of them, and the only way to do that was to break the filesystem up and put different directories on different drives. The other historical reason for it was so that you could easily unmount a partition and dump it for backup, which you could not do with the root. That tool (dump) has largely fallen out of favor these days, and it can now be run against an LVM snapshot instead, even one of the root filesystem.

There is little to no reason to do this any more. About the only reason left to do this is if you want to, for instance, prevent /tmp from filling up the whole disk.

This reason is largely irrelevant these days because providing users with general shell access has gone by the wayside, and these days servers run dedicated services, such as web or mail servers. Since you don't have random users able to run arbitrary commands, you generally don't need to worry about them trying to fill up your filesystem (and even when you did, you had disk quotas to stop that).

As for what RAID level to use, you need to remember that the main purpose of RAID is not to protect data (that's what backups are for), but to maintain uptime. If you put /tmp on a RAID 0, then your server would still go down and you'd have to go repair it if one of the disks fails. You also might want to use RAID 10 instead of RAID 1 so you get better performance as well.

A very good reason NOT to break up the filesystem is that if you get the allocations wrong, you can end up with part of the filesystem being full despite there being plenty of free space elsewhere. Correcting this can be difficult, unless you used LVM and left some space unassigned.

psusi
  • 4
    There's lots of reasons for continuing to carve up a Unix filesystem in the traditional way. If there were no reason for doing this we would have stopped by now - sysadmins aren't *THAT* attached to arcane traditions :) – voretaq7 Nov 16 '12 at 20:10
  • 1
    @voretaq7, then name some. If you can't, then blindly assuming there must be is foolish. – psusi Nov 17 '12 at 00:04
  • 1
    It would behove those downvoting to actually provide a counter argument rather than blindly parrot conventional wisdom. – psusi Nov 17 '12 at 00:34
  • 2
    Keeps /var/log from taking everything by filling up. Limits corruption to a filesystem. Simplifies backups -- whether snapshots or mount traversal rules, one often wants to back things up on different schedules. Simplifies imaging / upgrades. Allows selection of filesystems based on performance related to the task. – Jeff Ferland Nov 17 '12 at 02:00
  • @psusi See the (rather long) answers Scott and I wrote detailing some of the reasons. Jeff also hit a few more good ones in his comment. – voretaq7 Nov 17 '12 at 03:03
  • 1
    @JeffFerland, at best that is a weak reason to put /var/log on its own partition, but not for the several other partitions. Unless you are still using `dump`, backing up different parts of the fs does not need those parts to be on different partitions. Upgrades do not care one way or the other. Imaging isn't a very good way to do things either. – psusi Nov 17 '12 at 04:07
3

A lot of the partitioning information was generated when disk space was in short supply. As a result you will see relatively small partitions for a number of cases. Required partition sizes vary depending on server usage. The most variable tend to be /tmp, /var, /home, /opt, and /srv. /usr tends to be of a reasonable and stable size. If any of these are not given their own partition, their space requirements fall on /. Sizing is really dependent on what you are doing with the system.

I would increase swap and mount /tmp on tmpfs. Your /tmp will then use swap as a backing store, but use memory as available. The size of your /tmp looks extremely high, but it will handle aborted uploads that aren't cleaned up.

I would consider moving the MySQL files to /srv. This is a relatively new level in the disk hierarchy.

If you don't know your ultimate requirements, consider using LVM and expanding your partitions as they fill up.

voretaq7
BillThor
  • Be careful increasing swap -- It's good to have "enough" swap, but if you have too much you'll never use it (because by the time you're swapping that heavily the system performance is just too painful). I would say the 4G proposed in the question is probably "enough" for a LAMP stack -- if you're using 4G of swap (and actually paging that data in and out) you're probably also on the phone being screamed at because the website is slow :) – voretaq7 Nov 17 '12 at 03:15
  • 1
    @voretaq7 It doesn't matter what size swap is if you are using it for active programs. Using it for tmpfs where large files get written out to disk but smaller files stay memory resident is a reasonable use of swap. It saves on writing every file to disk when the intent is to put it elsewhere. I suggested increasing swap space because it seemed a large `/tmp` space may be required. – BillThor Nov 17 '12 at 17:47
  • Why not use a regular file for swap? Haven't they been as fast as a dedicated swap partition for a long time? – Chris Smith Nov 21 '12 at 18:11
  • @ChrisSmith A regular file should be nearly as fast as a dedicated partition, but it may not be contiguous on disk, leading to split I/O requests. This may be made up for by striping. Also, it is relatively easy to accidentally delete a swap file. The deleted file will not be evident until the system is rebooted, when it won't have the swap space any more. – BillThor Nov 21 '12 at 22:14
  • @BillThor This is true - if you're using `tmpfs` and expect to hit swap as the backing store you should have "enough" swap to meet your tmpfs demands, plus an appropriate reserve for the system as well. (This isn't something I normally think of since the only system where I use `tmpfs` is configured not to hit swap, since it has a RAM surplus and I'm using the temporary space for tiny files that get created/deleted quickly :) – voretaq7 Nov 22 '12 at 07:49
2

Depending on your architecture - you may not want to actually use /tmp as it's cleared out after every reboot. If your site deals with eventual processing of uploads, changing this to another location (via php.ini) may be an idea; in which you can make it any mount point.

As suggested earlier, it's highly recommended to use LVM and increment as needed.

I'd also highly recommend a dedicated partition for MySQL data (you can still mount it under /var/lib/mysql).

thinice
  • It's generally a good idea to assume that files in `/tmp` may not be there later -- saves you from unpleasant surprises later :-) – voretaq7 Nov 22 '12 at 07:50