61

I have been told that you can get a longer lifespan of an SSD if you buy a bigger capacity SSD. The reasoning goes that newer SSDs have wear leveling and thus should sustain the same amount of writing whether you spread this writing on the (logical) disk or not. And if you get an SSD that is twice the size of what you need, then you have twice the capacity to do wear leveling on.

Is there any truth to that?

Andrew T.
Ole Tange
  • 12
    One thing to remember, which is not explicitly mentioned in any of the answers but which you seem to be aware of, is that a larger SSD is not *in itself* less susceptible to wear, nor does it have better wear leveling. The important part is how much of the disk you actually use: if you use 75% of the disk, the controller has only 25% to use when performing wear leveling; if you use 50% of the disk, the controller has 50% of the disk to use for wear leveling. The more space that's available for wear leveling, the more effective the wear leveling will be. We call this "overprovisioning". – Micheal Johnson Nov 16 '17 at 16:39
  • 5
    Modern SSDs can use portions of the disk that are filled with data for wear leveling too. Punch [static wear leveling](https://en.wikipedia.org/wiki/Wear_leveling#Static_wear_leveling) into your favorite search engine. – David Schwartz Nov 16 '17 at 19:01
  • @MichealJohnson: That's not quite accurate - even if the disk is 100% full, the controller can still use all of the disk for wear levelling. That's because it can move pages around, so that even blocks that contain data that never changes (eg the base OS files) can share some of the wear. – psmears Nov 17 '17 at 11:58
  • @psmears Fair enough, but my point was that "larger SSD = better wear leveling" is not always true. The only benefit that one obtains *from using a larger SSD* with regards to wear leveling comes from the fact that less of the disk is used so more space is available for wear leveling. Whether or not the controller can use occupied space for wear leveling is irrelevant, and I'm well aware that controllers do typically do this; the point is that more free space leads to better wear leveling, so storing the same amount of data on a larger SSD leads to better wear leveling. – Micheal Johnson Nov 17 '17 at 12:47
  • 2
    Theoretically getting a larger drive and using less of it will wear out the drive slower because you are using a smaller portion of it. It also helps avoid the read/erase/rewrite cycle that can occur on a nearly full drive and so helps keep the write amplification down but I don't think drives dying due to flash wear is particularly common. A website did a test and every drive they tested lasted well beyond their specified write limit. http://techreport.com/review/27909/the-ssd-endurance-experiment-theyre-all-dead – Evan Steinbrenner Nov 17 '17 at 19:37

8 Answers

62

This is true, and it was one of the key motivations behind the switch from SLC (fast and durable flash cells, but small capacity) to MLC (slower and less durable flash cells, but bigger capacity). To give you some ballpark numbers (on old 34nm tech):

  • SLC drive: 100K P/E cycles (program-erase cycles), 100 GB in size, 10 DWPD (Drive Writes Per Day) x 5y, total 1825 TBW (TeraBytes Written);
  • MLC drive: 30K P/E cycles, 200 GB in size, 3 DWPD x 5y, total 1095 TBW.

As you can see, while the MLC drive has less than 1/3 the P/E endurance, thanks to its bigger size its total endurance (in terabytes written) is 60% of the SLC drive's (rather than the expected 30%). An even higher endurance can be achieved with sufficient overprovisioning, bringing relative parity between the two disks.
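The arithmetic behind those ballpark figures can be reproduced with a short sketch. The function name `tbw_from_dwpd` and the decimal conversion (1 TB = 1000 GB) are assumptions made for illustration; the inputs are the numbers quoted above.

```python
def tbw_from_dwpd(capacity_gb, dwpd, years=5):
    """Terabytes written over the rating period.

    TBW = capacity * drive writes per day * days in the period.
    Assumes decimal units (1 TB = 1000 GB) and 365-day years.
    """
    return capacity_gb * dwpd * 365 * years / 1000


slc_tbw = tbw_from_dwpd(capacity_gb=100, dwpd=10)  # SLC example from the list above
mlc_tbw = tbw_from_dwpd(capacity_gb=200, dwpd=3)   # MLC example from the list above

print(f"SLC: {slc_tbw:.0f} TBW, MLC: {mlc_tbw:.0f} TBW, ratio: {mlc_tbw / slc_tbw:.0%}")
```

Doubling the capacity doubles the TBW for a given DWPD rating, which is why the MLC drive ends up at 60% rather than 30% of the SLC figure.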

That said, SSDs rarely die due to NAND wear. Rather, controller and FTL (flash translation layer) bugs are what kill, or brick, flash-based solid state drives. When choosing an SSD, I would put a priority on these things:

  • capacity: as space is never enough, do not underestimate your needs. Bigger disks are (often) also faster than smaller ones, due to more NAND chips available;
  • power loss protection: if used for synchronous writes, be sure to buy a disk with power-loss-protected writeback caches;
  • vendor track record: if used for enterprise workloads, do not buy "no-name" SSDs or "game oriented" models. Rather, go with a known and reliable vendor, such as Intel, Samsung, or Micron/Crucial.
shodanshok
  • 6
    I second the note about avoiding no-name brands. I experienced this firsthand with a large-scale qualification. No-name drives experienced all sorts of failures, including periodic controller crashes and inexplicable bricking. Intel NAND was best, as were Samsung controllers (though I think the Intel drives started using the SandForce controller). – jorfus Nov 17 '17 at 17:41
  • Do you recommend Western Digital? – Chloe Nov 18 '17 at 19:06
  • 4
    For a client workload, sure. For a more write-intensive scenario, no. – shodanshok Nov 18 '17 at 19:07
  • Do you have a source for "SSDs rarely die due to NAND wear."? [And does that change in the case of top-vendor SSDs](https://serverfault.com/q/972288/147633)? – ispiro Jun 20 '19 at 20:02
  • @ispiro I've replied on your own question [here](https://serverfault.com/questions/972288/what-is-the-cause-of-most-top-vendor-ssd-crashes) – shodanshok Jun 20 '19 at 21:51
13

SSDs wear out when you use up their block erase cycles. Each block can only be erased so many times. Larger SSDs have more blocks, so that means more block erase cycles. All other things being equal, you can write twice as many TB to a 1TB SSD as you can to a 512GB SSD before it wears out.
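As a rough sketch of that proportionality: if every block tolerates the same number of erase cycles and wear leveling spreads writes evenly, the total data a drive can absorb scales linearly with capacity. The 3000-cycle rating and the constant write amplification factor below are hypothetical values chosen only for illustration, not figures for any real drive.

```python
def estimated_host_writes_tb(capacity_gb, pe_cycles, write_amplification=1.0):
    """Approximate host data (in TB) a drive can take before its blocks wear out.

    Assumes perfect wear leveling, so every block reaches its P/E limit at
    the same time, and a constant write amplification factor.
    """
    return capacity_gb * pe_cycles / write_amplification / 1000


# Hypothetical rating: 3000 P/E cycles per block on both drives.
small = estimated_host_writes_tb(capacity_gb=512, pe_cycles=3000)
large = estimated_host_writes_tb(capacity_gb=1024, pe_cycles=3000)

print(f"512 GB: {small:.0f} TB, 1024 GB: {large:.0f} TB, ratio: {large / small:.1f}x")
```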

Frankly, I wouldn't buy a bigger SSD to get a longer life though. A bigger SSD will cost more. And it's quite likely that you'd prefer to replace that SSD with a newer, bigger, faster, cheaper one when it wears out. Actually reaching the wear out point of a modern SSD takes a long time under most realistic use patterns.

David Schwartz
  • 1
    You wouldn't buy a bigger SSD because you might want to buy a bigger one? :-D – Phil Nov 19 '17 at 23:43
  • 4
    @Phil It's a common pattern when buying computer hardware. It doesn't usually make sense to buy for anticipated future need for three reasons. First, by the time you actually need what you paid extra for, it's probably obsolete. Second, as you go past the "sweet spot", you have to pay a lot more to get just a little more. Third, by the time you need it, it may cost less than even just the extra you'd have to pay to get it now. – David Schwartz Nov 20 '17 at 06:37
  • 1
    @DavidSchwartz One factor that is often forgotten is the cost of an external technician coming to upgrade the hardware. This alone can often push the sweet spot way higher. – Ole Tange Jun 20 '19 at 18:23
11

Yes, larger SSDs have higher endurance.

There are a couple of factors involved here, and it's not as simple as it appears:

  • Larger SSDs have more NAND inside them, and any half-decent SSD supports wear leveling so that all the writes are spread evenly over the NAND. As a result, regardless of how much data you put on the drive, the simple fact that there's more NAND inside means that it'll take longer for any single bit of NAND to wear out. If you look at most SSDs on the market, you'll notice that higher-capacity models tend to have higher endurance ratings, and a drive model rated to a given number of drive writes per day (DWPD) will naturally have higher endurance in higher capacities.
  • Another factor which comes into play especially with write-heavy enterprise workloads or when the drive is nearly full is the way NAND-based SSDs work. An important fact about NAND flash memory is that it can write data in small pages but can only erase them in large blocks. As such, it is often necessary to spread out writes across multiple pages, and mark pages as invalid as data is rewritten or deleted. The TRIM command tells the SSD which areas do not contain valid data. SSD controllers try to avoid erasing blocks until all pages in a block are marked invalid, since erasing a block containing valid data necessitates rewriting that data elsewhere, reducing performance and wasting write endurance in the process, a phenomenon called write amplification.
    • This carries the important implication that your data may be taking up more space on the NAND than its actual size. Also, random-write-heavy workloads that frequently replace small chunks of data will tend to cause the drive to use far more NAND than is actually necessary to hold the data, because writes are spread out where possible both to avoid unnecessary erases and rewrites and to ensure that writes are spread evenly over the NAND.
    • But this breaks down if the drive is low on space. Although the SSD may appear to have some small amount of capacity remaining from the standpoint of the OS, it is likely to have few or no empty blocks internally. This means that the SSD controller will have no choice but to erase blocks containing valid data and rewrite the data elsewhere, resulting in write amplification. This is why enterprise SSDs are often aggressively overprovisioned, meaning that the drive contains significantly more NAND than is exposed to the OS. This ensures that in the event the drive is logically full, there will still be some space left internally for the controller to rearrange data and avoid excessive write amplification. Simply using a larger drive to hold the same amount of data can achieve this overprovisioning effect. I have a more detailed explanation in this Super User answer.
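The effect of free space on write amplification can be illustrated with a toy simulation. This is a minimal sketch and not how any particular controller works: it assumes a single open block, a greedy garbage collector that always erases the full block with the fewest valid pages, and uniformly random overwrites. The function name `simulate_waf` and all parameters are made up for this example.

```python
import random


def simulate_waf(num_blocks, pages_per_block, fill, host_writes, seed=0):
    """Return write amplification (NAND page writes / host page writes)
    for a toy flash model under uniformly random overwrites."""
    logical_pages = int(num_blocks * pages_per_block * fill)
    if logical_pages > (num_blocks - 3) * pages_per_block:
        raise ValueError("fill too high: the toy model needs a few spare blocks")
    rng = random.Random(seed)

    valid = [set() for _ in range(num_blocks)]  # live logical pages per block
    used = [0] * num_blocks                     # pages programmed since last erase
    location = {}                               # logical page -> block holding it
    free = list(range(1, num_blocks))           # erased blocks
    open_block = 0                              # block currently being filled
    nand_writes = 0

    def program(lpn):
        nonlocal open_block, nand_writes
        if used[open_block] == pages_per_block:
            open_block = free.pop()             # open block is full, take an erased one
        old = location.get(lpn)
        if old is not None:
            valid[old].discard(lpn)             # the previous copy becomes invalid
        valid[open_block].add(lpn)
        used[open_block] += 1
        location[lpn] = open_block
        nand_writes += 1

    def collect():
        # Greedy policy: erase the full block with the fewest valid pages.
        full = [b for b in range(num_blocks)
                if b != open_block and used[b] == pages_per_block]
        victim = min(full, key=lambda b: len(valid[b]))
        for lpn in list(valid[victim]):
            program(lpn)                        # relocation costs an extra NAND write
        used[victim] = 0                        # erase
        free.append(victim)

    def host_write(lpn):
        while len(free) < 2:                    # keep a spare block for relocations
            collect()
        program(lpn)

    for lpn in range(logical_pages):            # initial fill, not counted
        host_write(lpn)
    nand_writes = 0
    for _ in range(host_writes):
        host_write(rng.randrange(logical_pages))
    return nand_writes / host_writes


half_full = simulate_waf(num_blocks=64, pages_per_block=32, fill=0.5, host_writes=20000)
nearly_full = simulate_waf(num_blocks=64, pages_per_block=32, fill=0.9, host_writes=20000)
print(f"WAF at 50% full: {half_full:.2f}, at 90% full: {nearly_full:.2f}")
```

In this model the same physical flash holding less data leaves more invalid pages per block at collection time, so fewer valid pages have to be relocated per erase. That is the overprovisioning effect described above; real controllers use more elaborate policies, so the absolute numbers should not be read as predictions.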

For most consumer or client workloads, endurance isn't generally something you need to worry about, unless you write lots of data to the drive on a daily basis. However, if you're buying a drive for datacenter workloads like OLTP or databases, then you'll need to pay attention to the endurance ratings, determine how much I/O you expect to put on the drive, and select drives that meet your requirements.

bwDraco
8

I did a rather large SSD qualification a few years ago for the database fleet of a video website you may have used today. Static wear leveling wasn't around at the time, so I overprovisioned (manually set the max LBA to 80% of the drive size). This avoided the pathological edge case where the drive filled up and could not perform wear leveling. People are now mentioning that static wear leveling can avoid that problem. I haven't dug into this, but I'd guess that even then you'll want to avoid filling up the drive.
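To make the 80% figure concrete, here is a small sketch of the sector arithmetic. The drive size of 1,000,000,000 sectors is a hypothetical round number and `overprovision_by_lba` is an illustrative helper; actually applying a max-LBA limit is tool- and drive-specific and is not shown.

```python
def overprovision_by_lba(total_sectors, visible_percent):
    """Return (visible_sectors, spare_sectors, spare_ratio).

    spare_ratio is the hidden space relative to the visible space, which is
    how overprovisioning is commonly quoted.
    """
    visible = total_sectors * visible_percent // 100
    spare = total_sectors - visible
    return visible, spare, spare / visible


# Hypothetical drive exposing 1,000,000,000 sectors before the limit is applied.
visible, spare, ratio = overprovision_by_lba(1_000_000_000, 80)
print(f"max LBA count: {visible}, hidden sectors: {spare}, overprovisioning: {ratio:.0%}")
```

Hiding 20% of the drive yields 25% spare area relative to the visible capacity, on top of whatever the manufacturer already reserves internally.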

If your choice is between

  1. Large drive from an unknown brand
  2. Smaller drive from one of the top three brands

Go with option 2. Buy from a known manufacturer and plan to not fill it up. I'd just go 20%-50% larger than I know I'll need.

In my qual, my no-name drives failed spectacularly and quite often (controller crashes, total controller failure, drive showing up as 1 MB instead of the real drive size). After deployment only one drive experienced noticeable NAND wear-out (in a high-write production environment with thousands of drives). Drives with the SandForce controller performed best. Drives with Intel NAND were the gold standard.

jorfus
0

This is definitely true. The reason is that bigger SSDs have more “area” to spread the wear over. Since bigger SSDs have more “blocks” to use, each block doesn’t get used as much. It's like having 10 cars instead of 1: if you drive a different car every day, each one takes longer to need oil changes and such.

0

That's definitely true.

Also be aware that those devices (typically) work better when they have enough free space (typically 10%; more is better): they are faster and have lower write amplification, which is the ratio of the amount of data actually written to the NAND to the amount of data you write.

As others suggested, the money you save buying what you really need will let you buy a bigger and faster SSD sooner since price per terabyte falls over time.

I say Reinstate Monica
0

This is true; however, to really maximize SSD durability, you must choose a professional series that allows you to explicitly reduce the available capacity to increase durability. That's why professional SSDs are listed with a range of FWPD values.

wazoox
  • 3
    You don't need some marketing BS "professional SSD" to do that: simply leave part of the disk unpartitioned to overprovision it. – psusi Nov 16 '17 at 23:13
  • 3
    @psusi pro-grade enterprise SSDs have other things going for them, like power loss protection, larger amounts of raw flash with more overprovisioning, different firmware for more consistent performance, more cooling for constant load, better flash binning (eMLC), etc – Richie Frame Nov 17 '17 at 10:33
  • @psusi no, that's not equivalent. If the firmware doesn't allow you to explicitly sacrifice capacity for durability, some of the flash will remain unused instead. – wazoox Nov 18 '17 at 09:31
  • @wazoox, leaving some flash unused is *how* you sacrifice capacity for durability. – psusi Nov 19 '17 at 00:54
  • @RichieFrame, all of the test reports I have read indicate that some drives behave properly on power loss, and some don't; and how expensive the drive is or how reputable the manufacturer is does not correlate with which is which. – psusi Nov 19 '17 at 00:56
  • @psusi having all unused flash used for error correction is how you maximize durability. On low end SSDs, unused flash will just be unused. If you declare a reduced capacity on a high-end SSD, the additional capacity will be used for wear-leveling. – wazoox Nov 20 '17 at 06:24
  • @wazoox, no. Unused space on any SSD is available for wear leveling. The whole TRIM thing was created to tell the drive where the unused space is without having to underprovision it. – psusi Nov 20 '17 at 23:15
0

The underlying value that you care about is not the disk size but rather its TBW (terabytes written). The vendor's guarantee is expressed either in TBW or in WPD (writes per day) over a period of time (usually 5 years). The two are interchangeable, as `TBW = DiskSizeInTB * WPD * 5 * 365`.

When a disk is specified with WPD, you can have a 1 TB disk with 0.3 WPD or a 0.1 TB disk with 10 WPD. The smaller disk has a TBW of 1825 and the larger disk has a TBW of about 547, so the smaller disk has more endurance.

You really want to estimate the worst case of your usage in terms of TBW and check that the disk covers it with some margin to spare.
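That check can be sketched as follows. The helper names, the 100 GB/day workload, and the 1.5x safety margin are assumptions for illustration; the two TBW ratings are the ones from the example above.

```python
def years_of_life(tbw_rating, daily_writes_gb):
    """Years until the rated TBW is reached at a steady daily write volume."""
    return tbw_rating * 1000 / daily_writes_gb / 365


def covers_workload(tbw_rating, daily_writes_gb, service_years, margin=1.5):
    """True if the rating covers the expected writes with the given safety margin."""
    expected_tb = daily_writes_gb * 365 * service_years / 1000
    return tbw_rating >= expected_tb * margin


# Assumed worst case: 100 GB written per day over a 5-year service life.
for name, tbw in (("1 TB @ 0.3 WPD", 547.5), ("0.1 TB @ 10 WPD", 1825)):
    print(name, f"{years_of_life(tbw, 100):.0f} years,",
          "ok" if covers_workload(tbw, 100, service_years=5) else "insufficient")
```

The margin is a judgment call; a higher value makes sense when the write volume is only a rough estimate or when write amplification on the workload is unknown.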

TL;DR: Disk size is not a full measure of endurance; look up or calculate the TBW and use that to judge endurance.

Baruch Even