
I have a Linux server with a spare 500GB disk partition. I wanted to format it and use it for /tmp. The server occasionally runs some large data processing tasks, so it can happen that /tmp holds GBs of temporary data.

Then I got an idea: instead, I could add it as a swap partition and mount /tmp on tmpfs. Is this idea reasonable?

The server has 6GB of RAM, so in most cases the data on /tmp would live only in RAM, with the obvious speed advantage. The question is, what if there are, let's say, 10-20GB of data on /tmp, how will the system perform? How would the performance compare to simply having /tmp mounted on an ext4 partition? Thanks for the help.

Edit: It is clear that the system will start swapping out memory when the usage of tmpfs hits the RAM limit. But is Linux smart enough to swap out tmpfs data and keep "regular" data in RAM? If yes, then I suppose it could behave reasonably. If not, then the whole system will be severely affected.

Petr

3 Answers


This is NOT A Good Idea™.

You'll be fine with a large /tmp partition, mounted like this (in your /etc/fstab):

tmpfs  /tmp  tmpfs  defaults,nosuid,nodev,noexec,noatime,nodiratime,size=6000M  0 0
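
If /tmp was already mounted before you edited /etc/fstab, a remount should pick up the new entry; a small sketch of how one might apply and verify it (the exact invocation may vary by distribution):

mount -o remount /tmp    # or 'mount /tmp' if nothing is mounted there yet
df -h /tmp               # confirm the size= limit took effect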

And you could add your spare drive as a giant swap partition:

/dev/sdb1  swap  swap  defaults  0 0
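
For that entry to work, the partition has to be initialised as swap once; a minimal sketch, assuming the spare partition really is /dev/sdb1:

mkswap /dev/sdb1     # writes the swap signature (destroys any existing data on the partition)
swapon /dev/sdb1     # enable it immediately
swapon --show        # verify it shows up with the expected size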

When tmpfs usage approaches that limit, your machine will start swapping pages from RAM to disk, at which point load averages will go through the roof and the machine will grind to a halt.

It's a bad idea to rely on swap in any way; you'd be better off selling your 500GB drive and simply buying more RAM, since it's cheap.

In summary

If you really want to use your 500GB disk, you could mount it on /tmp with a non-journaled filesystem and atime/diratime updates disabled (e.g. ext2). That would be substantially faster than dealing with a machine that is swapping.
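
A minimal sketch of that alternative, assuming the spare partition is /dev/sdb1 (the device name is illustrative):

mkfs.ext2 /dev/sdb1   # non-journaled filesystem; everything in /tmp is disposable anyway
# /etc/fstab entry (noatime already implies nodiratime):
/dev/sdb1  /tmp  ext2  defaults,nosuid,nodev,noexec,noatime  0 0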

Fran
  • This is a TERRIBLE idea. Using tmpfs and relying on it to be pushed to swap sounds like a good idea... but the reality is that your system may push the wrong things to swap in favor of keeping tmpfs in physical RAM. You'd be better off simply mounting your 500GB partition to /tmp and being done with it. I'd suggest using an extremely lightweight filesystem like xfs or reiserfs... as you can quickly re-format it on startup or whenever the need arises... and you really don't need all the advanced features of ext2/3/4 and such. – TheCompWiz Sep 27 '12 at 14:05
  • Why the downvote? I wasn't condoning this action, but providing a means of doing it. The question was whether it was possible and how to go about it, not whether it was a good idea or not. I had explicitly stated that this is A Bad Idea, and that the actual solution should just be to buy more RAM if he needs faster `tmp` access. – Fran Sep 27 '12 at 14:15
  • This is a place for *good* ideas... not duct-tape & bubble-gum solutions. People come here for help... and not for ideas on how to make their lives hell. – TheCompWiz Sep 27 '12 at 14:20
  • I edited my answer to suit your comment, thanks for your input – Fran Sep 27 '12 at 14:25
  • @TheCompWiz I wasn't really asking how to do it, I was asking if it was a good idea or not (which you have answered in your comments). – Petr Sep 27 '12 at 14:49
  • Everybody here seems to agree that this is a bad idea. However, tmpfs.txt in the kernel documentation lists this as one of the use cases for tmpfs. @Fran, do you have data to support your scary assertion that it would cause thrashing? Obviously this depends on what will go on /tmp. – migle Jan 09 '15 at 16:00
  • noatime implies nodiratime, see [link](https://lwn.net/Articles/245002/) – jarno Jan 23 '16 at 22:08
  • proper answer: https://serverfault.com/a/871677/67675 – poige Jan 17 '19 at 19:47

This could be a reasonable idea.

Putting an actual filesystem on /tmp does incur overhead, because filesystems go to great lengths to make sure that the data on disk is not corrupted in case of system failure. For a /tmp that is cleaned at boot time, that effort is obviously wasted. Using tmpfs would avoid that overhead.

On the other hand, filesystems also make sure that files are organised on the disk in a way that optimises access time, i.e. they avoid fragmentation. Typical sequential file accesses will (mostly) result in sequential disk accesses, which are more efficient than random accesses. This effect is more pronounced on spinning hard disks than on SSDs. The swap+tmpfs combination can't easily do this, because swap is not aware of which piece of memory belongs to which file, and tmpfs isn't aware of how pages are mapped to physical memory or to the disk. For large files, however, it should work well, since both tmpfs and swap try to keep things contiguous in that case. At least, as long as there is a lot of free space on swap (otherwise fragmentation kicks in), and writes happen slowly enough that they get a chance to be swapped out.

So the bottom line is: it depends; you should try both options and see which one works best.
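
A crude way to compare the two setups is to time a write of roughly the size of your real workload against each /tmp configuration; the figures below are only placeholders, and running one of your actual processing jobs is a better test:

dd if=/dev/zero of=/tmp/testfile bs=1M count=10240 conv=fdatasync   # ~10GB streaming write
rm /tmp/testfile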

When you mount the tmpfs, do remember to set the size explicitly. The default is half the physical RAM, so just 3GB.

Arnout

This is actually a good idea when you usually don't have much data in /tmp, but occasionally consume endless gigabytes for a limited duration. The problem is that the Linux swap system doesn't know enough about your use case to do it right. It will generally prioritize dumping or swapping cache over program pages, but that doesn't really help. It may be possible to use cgroups to achieve your goal, as they do work when the scratch data is held in program memory, but I'm not sure how to configure cgroups for this case (I suppose you could use a FUSE tmpfs...). Fortunately, that's not required. You can get the desired behaviour with zram and a backing device.

zram-init is the program that automates setting up zram, which is a compressed RAM block device. There is usually an example in the zram-init config for mounting /tmp on zram. It will be something like the following:

type0=/tmp
flag0= 
size0=524288 # 500G of logical space
mlim0=2G # 2G of memory
back0=/dev/loop0 # (or /dev/sdxN, your large slow drive)
notr0= 
maxs0=4 # maximum number of parallel processes for this device
algo0=zstd 
labl0=tmp # the label name
uuid0= 
args0= 

This will compress and store in memory anything written to /tmp. Usual compression is somewhere around 50%. It will consume at most 2G of physical memory. If it runs low on physical memory, it will take the oldest files and push them into the backing device, still compressed. Note that it does incur some CPU overhead to compress and decompress the files, but this is usually offset by the reduced IO.
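
If you want to check how well this is working in practice, the device can be inspected at runtime; a small sketch, assuming the /tmp device ends up as /dev/zram0 (zramctl is part of util-linux):

zramctl /dev/zram0              # DATA vs COMPR columns show the effective compression ratio
cat /sys/block/zram0/bd_stat    # backing-device counters, present when writeback is enabled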

A similar setup can be used in conjunction with cgroups to let certain processes swap without adversely affecting overall system performance.
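
A rough sketch of that idea with cgroup v2, assuming a unified hierarchy mounted at /sys/fs/cgroup; the group name and the job are purely illustrative. memory.high makes the kernel reclaim (i.e. swap out) the group's pages before the rest of the system feels pressure:

mkdir /sys/fs/cgroup/scratch
echo 2G  > /sys/fs/cgroup/scratch/memory.high      # start reclaiming this group above 2G
echo max > /sys/fs/cgroup/scratch/memory.swap.max  # let the group swap freely
echo $$  > /sys/fs/cgroup/scratch/cgroup.procs     # move the current shell into the group
./large_data_job                                   # hypothetical placeholder for the workload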

Perkins