
I've been using SSDs long enough now to think about protecting them from premature failure due to excessive writes. So, I've already moved /var to other disks successfully (be they other SSDs or HDDs) and it works fine, but I'm a little hesitant to move /run or any other write areas. Perhaps my fears are unfounded.
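To be concrete, the kind of /etc/fstab entry I mean is roughly the following (the label name is made up and the filesystem and options are only illustrative):

# hypothetical fstab line placing /var on a separate disk
LABEL=VARDISK   /var   ext4   defaults,noatime   0 2

systemd generates a var.mount unit from a line like this and pulls it in via local-fs.target, so /var is normally in place before ordinary services start.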

I am not necessarily familiar with ALL the write areas of the system disk. Of course the most used is /var, and my perception is that /run is next, but this question is really asking about ALL of them. If you know of anything other than /var and /run that is written to, well, first I'd like to know about it, and second I'd like to know whether it can be safely moved or not.
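One way to see what actually gets written, rather than guessing, is to trace file events for a while. This is only a sketch and assumes the fatrace package is available (it is packaged for Fedora); its flags and output format may vary by version:

sudo dnf install fatrace                      # trace tool, if not already installed
sudo fatrace -s 60 > /tmp/write-trace.log     # record file events for 60 seconds
# tally the most frequently written paths ("W" marks write events)
awk '$2 ~ /W/ {print $3}' /tmp/write-trace.log | sort | uniq -c | sort -rn | head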

I just don't know enough about the boot environment in modern Linux (I'm presently mostly on Fedora Server 32). If the disk that these alternative write locations are on isn't already mounted by the time it's needed, then this obviously can't work. Looking through the system logs, it's not easy to tell when that transition happens.
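For whatever it's worth, systemd does expose the ordering, so it is possible to check when a given mount became available during boot and what is ordered around it (the unit names below assume the mount point is /var):

systemd-analyze critical-chain var.mount      # when var.mount became active during boot
systemctl show var.mount -p After -p Before   # its ordering dependencies
journalctl -b -u var.mount                    # log messages from that mount unit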

Comments received seem to not fully appreciate the main point:

Is it not true that an SSD with only reads will live a lot longer?

If that is true, and it seems a foregone conclusion given that SSDs are write-limited by their very nature, then isn't it a valuable exercise to move as much write activity off a system disk as is reasonable, to preserve its life?

I think some people just don't care about that. I'd guess that said people are well financed and can easily afford the man hours and whatever else. But in these trying times, especially during this pandemic when smaller organizations are stressed to their limits, this question is very pertinent. ... It's not the cost of the drive as much as it is the labor to replace it. People who do not fund such things themselves may not appreciate the challenge presented.

Richard T
  • How long have the SSDs been used for, and how often are they written to? – Christopher H Aug 15 '20 at 03:58
  • @ChristopherH I'm not sure I understand your question. I have been using SSDs as system disks for some time now and I have just recently had a number of them "age out." It occurred to me, as strongly implied above, that if you remove all writes, they might live very much longer. One might have a cheap disposable SSD as the other "disk" and simply replace it whenever needed, avoiding the hassle of system downtime for the main disk's replacement. ... On a system disk, some parts of them are written to an awful lot! – Richard T Aug 15 '20 at 04:01
  • `/run` is a tmpfs and isn't written to disk anyway. But it sounds like you need to buy higher quality SSDs. – Michael Hampton Aug 15 '20 at 04:09
  • @MichaelHampton I DO buy quality drives - how long are they supposed to live anyway?! And I mean as a system disk with all the log writes and so forth? ... Without taking the time to really look it up, I'd say I've gotten around 5 years out of a drive, but I have SOMETIMES managed to get triple that from an enterprise-class HDD from a major manufacturer - they just don't have the speed. What's YOUR experience - how long have YOU gotten from them, and do you run a data center? – Richard T Aug 15 '20 at 04:13
  • @MichaelHampton I should have added: Thanks for adding in the info about /run! VERY HELPFUL! – Richard T Aug 15 '20 at 04:14
  • @RichardT I believe my question was very straightforward: I asked how long the SSDs being used have been used for, and how often they are written to. Is the server being used as a web server? Database server? Application server? I'm trying to get an understanding of your use-case to provide the best advice, however after reading your comment reply to Michael Hampton, I don't know if I want to or should. And FWIW, I'm an IT Manager for a security company where we run database applications (both on-premises and in the Cloud), and work with a range of OSes. – Christopher H Aug 15 '20 at 04:17
  • There are some good papers on enterprise SSD reliability that may be better than anecdotal experiences, e.g. https://www.usenix.org/system/files/fast20-maneas.pdf – tater Aug 15 '20 at 04:19
  • @ChristopherH OK, well, I've just bought and installed a new round of disks, so I don't know how to answer you. There are several use-cases in the environment, from lazy developmental machines that see relatively little use to much higher-demand systems that are file servers, a few firewall / gateway systems and a few web servers. So, it's a mix, and it was a pretty generic question, no matter what the use case. BUT, your experience is valuable, so please share. – Richard T Aug 15 '20 at 04:22
  • @tetech That's both current and pertinent, though it doesn't answer my question. I'm glad you shared and though I don't have time to read it tonight, I surely will. Trouble is, economics don't always let us "exercise best practices", and sometimes "best practices" are, unfortunately, wrong - or maybe just not as well advised for everyone... Thank you for commenting. I still think knowing what we can shift off of the main system disk that's an SSD is a useful exercise. If you disagree, please say why. – Richard T Aug 15 '20 at 04:27
  • @RichardT As long as you are using enterprise-grade SSDs that are designed for servers and datacentres, you would get a minimum five-year warranty. Couple that with a well-planned backup solution and a suitable RAID configuration, and you wouldn't need to be concerned with data loss. If DC enterprise drives are failing under workloads like what you've described, I'd definitely be questioning the quality. Based on the workloads you mentioned, you don't need to be concerned with drive failures due to excessive writes (as long as you're using drives designed for your use-case). – Christopher H Aug 15 '20 at 04:30
  • The link and comment were provided in response to you asking about specific person's experiences, not in response to the original question. – tater Aug 15 '20 at 04:32
  • @tetech Thanks, tetech, I'll get to it as soon as I can. – Richard T Aug 15 '20 at 04:35
  • @ChristopherH In reviewing I should have added that if I were using SSDs in a database application, I'd be using Postgres and have it replicate to HDDs - no question about that! I've designed many such systems over the years since about version 9 and you're silly if you don't! ... I could add a lot to that, but this is not really the venue. Reach out to me otherwise if you wish, I don't mind helping people. – Richard T Aug 15 '20 at 04:42
  • My experience probably won't be much help to you since (1) I lease all my servers and have never had an SSD wear out during a 3-year term; (2) I mostly run hypervisors with mixed workloads and don't have anything doing particularly high writes that isn't balanced out by something else that rarely writes. I do have one machine running about 15 cryptocurrency daemons, which are all fairly write-happy. In the 528 days its SSDs (in RAID 1) have been online they've written 11298153 32MiB blocks and have a remaining lifetime of 88%. I don't expect them to wear out before I upgrade the server. – Michael Hampton Aug 15 '20 at 06:02

1 Answer


To sort-of answer the original question, we make embedded systems where we are interested in minimizing writes. The OS partitions are writeable flash, but we mount them read-only, and offload the following:

/etc/{adjtime|resolv.conf}
/etc/lvm/{cache|archive|backup}
/var/{gdm|log|cache|account|spool}
/var/db/nscd
/var/systemd/timers
/var/empty/sshd/etc/localtime
/var/lib/systemd/random-seed
/var/lib/{xkb|puppet|dbus|postfix|dav|dhcpd|dhclient|php|pulse|ups|arpwatch|NetworkManager|gdm|iscsi|logrotate.status|ntp|xen|samba|nfs}

This is for a RHEL-derived system, and in our case we put the writable areas in RAM (data loss is not an issue for us). Granted, we are not using SSDs and this is not your exact question; however, it should go a long way toward indicating what you should look to offload if you wish to reduce or eliminate writes. You may need more or fewer of these depending on your packages and config.
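As a rough illustration of what that kind of offloading can look like on a general-purpose system, here is a sketch of /etc/fstab entries putting the volatile areas on tmpfs; the sizes and options are assumptions, not the exact configuration described above, and anything kept in RAM (logs included) is lost on reboot:

# illustrative only: read-only root with volatile write areas in RAM
UUID=<root-uuid>   /            ext4    ro,noatime            0 1
tmpfs              /var/log     tmpfs   defaults,size=256m    0 0
tmpfs              /var/cache   tmpfs   defaults,size=256m    0 0
tmpfs              /var/spool   tmpfs   defaults,size=64m     0 0
tmpfs              /tmp         tmpfs   defaults,size=256m    0 0

Single files such as /etc/resolv.conf or /etc/adjtime cannot be tmpfs-mounted on their own; those are typically replaced with symlinks (or bind mounts) pointing into one of the RAM-backed directories.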

tater