Does wear leveling function normally without TRIM?


No matter whether it's about UEFI, phones, or SSDs, Samsung apparently isn't particularly good at implementing standards. Unfortunately, some years ago, I bought a Samsung SSD 840 PRO Series SSD for my laptop, which I have been using ever since (this was before all the info about their non-standard-conforming implementations was made public). It's a really nice SSD except for the asynchronous Trim, which doesn't work properly: it deletes data it isn't supposed to delete. Because of this, Linux doesn't use Trim on it, so that people's data isn't lost (which happened before Trim was disabled on certain Samsung SSD models).

Because I've been using this SSD quite a lot: How strongly does disabled Trim impact the SSD's ability to do wear leveling?

I've not been able to find particularly good and reliable information on what the different SMART attributes mean. Mostly I find people and articles guessing and contradicting themselves after a few sentences.

This Wikipedia article says:

Each drive manufacturer defines a set of attributes, and sets threshold values beyond which attributes should not pass under normal operation. Each attribute has a raw value, whose meaning is entirely up to the drive manufacturer (but often corresponds to counts or a physical unit, such as degrees Celsius or seconds), a normalized value, which ranges from 1 to 253 (with 1 representing the worst case and 253 representing the best) and a worst value, which represents the lowest recorded normalized value. The initial default value of attributes is 100 but can vary between manufacturer.

First off: How relevant is this given that its heading is "Known ATA S.M.A.R.T. attributes"? Does it apply to SSDs which are connected via SATA?

Why do the values range from 1 to 253? What's with 0, 254, and 255? Are values above 100 even used?

My SSD's SMART data looks like this (according to gnome-disks):

There are no values bigger than 100.

I have many external HDDs but only this one SSD (which is an internal one), so I can't compare its SMART data with that of other SSDs whose amount of usage I know. But I suppose the raw value of my SSD's wear level count being 245 means that the SSD's storage cells have been written to 245 times on average. Please tell me whether that's correct, whether reads count, too, and whether it's just 245 times the specified storage space (256 GB) or the specified storage space + the reserved storage space (to replace failed parts).

Does my SSD's normalized wear level count being 93 mean that it's almost two thirds through its life ({1, ..., 253}) or that it's pretty well off ({1, ..., 100})?

And one last question: Why does Linux disable Trim altogether with those SSDs if only asynchronous Trim causes data loss?

Output of $ sudo smartctl /dev/sda -a: http://pastebin.com/Prf7NzwN

Related question created by me in response to discussion in comments: https://unix.stackexchange.com/questions/333635/enabling-synchronous-trim-only

UTF-8

Posted 2016-12-26T16:31:54.300

Reputation: 620

Can you post the output of smartctl /dev/sda? (Sanitize serial numbers as needed.) – bwDraco – 2016-12-29T18:56:37.713

@bwDraco I added it to the question. – UTF-8 – 2016-12-29T18:59:37.133

Answers


To answer the core question, yes, but your drive is suffering from suboptimal performance. (I've reposted the SMART data to GitHub Gist.)

Your SMART stats indicate that you've written 21.5 TiB to the drive (46248065971 total LBAs written, 512 bytes each). This is 86 full drive writes' worth of host writes (at 256 GiB of raw NAND). However, you've mentioned that the drive's underlying NAND has been written over 245 times. In other words, the drive has written nearly three times the data you've actually sent to the drive. This is called write amplification.
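The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, and it assumes (as the figures above do) 512-byte LBAs, 256 GiB of raw NAND, and a wear leveling count that equals the average number of full NAND writes.

```python
SECTOR_BYTES = 512          # assumed LBA size, as in the paragraph above
GIB = 2**30
TIB = 2**40


def host_bytes_written(total_lbas, sector_bytes=SECTOR_BYTES):
    """Bytes the host has sent to the drive, from the Total_LBAs_Written raw value."""
    return total_lbas * sector_bytes


def write_amplification(total_lbas, wear_leveling_count, raw_nand_bytes):
    """NAND bytes written divided by host bytes written.

    Assumes the wear leveling count is the average number of full writes
    over all of the raw NAND.
    """
    nand_bytes = wear_leveling_count * raw_nand_bytes
    return nand_bytes / host_bytes_written(total_lbas)


lbas = 46248065971
raw_nand = 256 * GIB
host = host_bytes_written(lbas)

print(f"host writes:         {host / TIB:.1f} TiB")
print(f"full drive writes:   {host / raw_nand:.0f}")
print(f"write amplification: {write_amplification(lbas, 245, raw_nand):.2f}")
```

The last figure comes out a little under 3, which is where "nearly three times" comes from.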

To explain what's going on, I'm going to summarize some of the content in another answer I've written. NAND flash memory consists of a series of blocks, each with a number of pages. Data can be written to individual pages but must be erased in whole blocks, and pages containing data cannot be rewritten until erased. To avoid unnecessarily erasing blocks and rewriting data, SSDs spread out writes across different blocks and mark the old data as invalid; the drive tries to avoid erasing blocks until all pages within each block are marked invalid. This breaks down without sufficient free pages available, in which case the drive is forced to erase blocks containing valid data and rewrite that data elsewhere in order to free up space for new data. Because this means that the same data is written to the underlying NAND more than once, this undesirable behavior is known as write amplification.
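To make the effect of scarce free pages concrete, here is a toy simulation. It is not how any real controller works: the ToyFlash class, the greedy "fewest valid pages" victim choice, and the trick of rewriting surviving pages into the block that was just erased are all simplifying assumptions. It only shows that the same random overwrite workload costs more NAND writes when less of the flash is free.

```python
import random


class ToyFlash:
    """Pages are appended to an active block; whole blocks are erased at once."""

    def __init__(self, blocks, pages_per_block):
        self.ppb = pages_per_block
        # Each block lists the logical page stored in each programmed page;
        # None marks a page whose data has been invalidated.
        self.blocks = [[] for _ in range(blocks)]
        self.where = {}            # logical page -> (block index, page index)
        self.active = 0
        self.host_writes = 0
        self.nand_writes = 0

    def write(self, lpn):
        """Host write: invalidate the old copy, program a fresh page."""
        self.host_writes += 1
        if lpn in self.where:
            b, p = self.where[lpn]
            self.blocks[b][p] = None
        if len(self.blocks[self.active]) == self.ppb:
            self._open_block()
        self._program(lpn)

    def _program(self, lpn):
        blk = self.blocks[self.active]
        blk.append(lpn)
        self.where[lpn] = (self.active, len(blk) - 1)
        self.nand_writes += 1

    def _open_block(self):
        # Prefer a block that is already erased.
        for i, blk in enumerate(self.blocks):
            if not blk:
                self.active = i
                return
        # Otherwise garbage-collect the block with the fewest valid pages:
        # erase it and rewrite its surviving data (the amplified writes).
        victim = min(range(len(self.blocks)),
                     key=lambda i: sum(x is not None for x in self.blocks[i]))
        survivors = [x for x in self.blocks[victim] if x is not None]
        self.blocks[victim] = []
        self.active = victim
        for lpn in survivors:
            self._program(lpn)


def measure(blocks, ppb, logical_pages, writes, seed=0):
    """Write amplification of random overwrites after filling the logical space."""
    assert logical_pages < blocks * ppb, "toy model needs at least one spare page"
    rng = random.Random(seed)
    flash = ToyFlash(blocks, ppb)
    for lpn in range(logical_pages):
        flash.write(lpn)
    flash.host_writes = flash.nand_writes = 0
    for _ in range(writes):
        flash.write(rng.randrange(logical_pages))
    return flash.nand_writes / flash.host_writes


wa_full = measure(blocks=32, ppb=16, logical_pages=486, writes=10000)   # ~95% valid
wa_spare = measure(blocks=32, ppb=16, logical_pages=358, writes=10000)  # ~70% valid
print(f"~95% of pages valid: write amplification {wa_full:.2f}")
print(f"~70% of pages valid: write amplification {wa_spare:.2f}")
```

In this model, the mostly-full case has to copy far more surviving pages per erased block, so its write amplification is noticeably higher than in the case with more free pages.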

The lack of TRIM means that the drive will wind up treating deleted data as valid, reducing the amount of free space available to the drive. The drive ultimately winds up behaving as if it's completely full, even when it isn't.

To answer your question as written, the drive should continue to wear-level properly, distributing writes across the NAND to the extent possible. Knowing Samsung, I'd be very surprised if their algorithms didn't do this properly on a full drive. However, it's constantly rewriting data that it has already written, degrading performance and reducing endurance. Given that TRIM is not available, your best bet is to overprovision the drive, or allocate less than the full (256 GB) capacity for partitions (e.g. 200 GB). However, you'll need to secure-erase the drive for this to work, which means everything will have to be backed up first and restored afterwards. Given the limited capacity of the drive, I'm also not sure whether you're even able to reduce the partition size meaningfully without running yourself out of space.
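As a rough illustration of what the smaller partition buys you, the sketch below compares the share of NAND that is never addressed in the two cases. It assumes 256 GiB of raw NAND and decimal gigabytes for the partition sizes; spare_fraction is just an illustrative helper, and the unpartitioned space only counts as spare once the drive knows it is empty, which is why the secure erase is needed.

```python
GIB = 2**30   # binary gigabyte, used for the raw NAND
GB = 10**9    # decimal gigabyte, used for partition sizes


def spare_fraction(raw_nand_bytes, partitioned_bytes):
    """Share of the raw NAND that no partition ever addresses."""
    return (raw_nand_bytes - partitioned_bytes) / raw_nand_bytes


raw = 256 * GIB  # assumed raw NAND capacity
for allocated_gb in (256, 200):
    share = spare_fraction(raw, allocated_gb * GB)
    print(f"{allocated_gb} GB partitioned: {share:.1%} of the NAND left as spare area")
```

Under these assumptions, going from 256 GB to 200 GB of partitions raises the spare share from roughly 7% to roughly 27%.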


As for your interpretation of the SMART data:

  • Yes, on Samsung and many other SSDs, the raw value for the wear leveling count is the average number of writes over all of the NAND on the drive. This does mean your drive's NAND has seen an average of 245 full write cycles. See Samsung SSD "Wear_Leveling_Count" meaning.
    • This is based on the raw NAND capacity, including any spare space.
    • Reads do not count towards this number (unless the SSD has to rewrite data due to read disturb).
  • The exact range for normalized values varies with the drive manufacturer, but on most drives, the wear leveling count is on a 0 to 100 range. Your drive is estimated to have about 93% of its write endurance remaining. Most of the other normalized values on this drive are on a 0 to 100 scale as well. (The NAND on the 840 PRO is good for about 3500 write cycles, and 245 cycles is 7% of that figure.)
  • The normalized attribute values 254 and 255 are considered reserved and should not appear on any drive.
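The relationship between the raw and normalized wear leveling values can be sketched as below. The formula is an assumption that happens to match the figures above (about 3500 rated cycles, a 0 to 100 scale); how the firmware actually rounds is up to the manufacturer, and estimated_normalized is only an illustrative name.

```python
def estimated_normalized(raw_cycles, rated_cycles=3500):
    """Estimate the normalized wear leveling value from the raw cycle count.

    Assumption: the value starts at 100 and drops by one for every full
    percent of the rated write cycles used, bottoming out at 0.
    """
    used_percent = (raw_cycles * 100) // rated_cycles
    return max(0, 100 - used_percent)


print(estimated_normalized(245))  # 245 of 3500 cycles is 7% used, leaving 93
```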

bwDraco

Posted 2016-12-26T16:31:54.300

Reputation: 41 701

What's the motivation behind reposting data on Github Gist? Redundancy? I guessed that it has been overwritten 245 times from the SMART data on the screenshot. I have no idea whether that's correct. Is it possible to tell the kernel to enable Trim but not asynchronous Trim? – UTF-8 – 2016-12-29T21:09:53.573

Gist is generally considered a better place to post text samples than Pastebin. Yes, your interpretation of the SMART data is correct. I can't seem to find information on enabling only synchronous trimming, though. – bwDraco – 2016-12-29T21:26:03.053

I think newer Linux kernels enable synchronous trimming only (async or queued trim is disabled) on these SSDs, but not totally sure. What is your kernel version? – bwDraco – 2016-12-29T22:04:24.243

I created this other question after you wrote "I can't seem to find information on enabling only synchronous trimming, though." as I googled quite a bit beforehand, too, and solving the problem should probably be discussed separate from understanding SMART values. My laptop is running Linux 4.4.0-57-generic.

– UTF-8 – 2016-12-29T23:52:56.057

I noticed that trimming was on all along, because Ubuntu has a cron job which executes /sbin/fstrim --all || true once a week. I changed the cron job so it runs once every day and freed up some space on my SSD so I always have 25 to 35 GB of free storage space. However, the number of LBAs written has since only increased to 47'336'457'217, which means I wrote 557 GB, whereas the wear-leveling-count has since increased to 253, which means that 2'048 GB have since been written to the SSD. Write amplification is now at about 4 times, not 3. Is there something I can do about this? – UTF-8 – 2017-02-21T22:48:06.883

4 days ago, I changed the cron job so it writes the time before the trim command is executed and the time after it has finished to a text file. It takes (pretty consistently) 35 seconds to execute the command each time: https://paste.ubuntu.com/24042979/ I also tested how long the trim command takes to return if it's executed 2 times in a row. The second call returns immediately.

– UTF-8 – 2017-02-21T22:52:32.767

@UTF-8: Write amplification is typically at its worst with random I/O. Also, 25-35 GB isn't much free space; this space can be readily exhausted (from the standpoint of the SSD controller) during normal use of the SSD. If you can't free up more space, you'll want to do more frequent TRIMs to make sure the controller has spare space available; changing it to daily is a good first step. You may also want to run fstrim manually after write-intensive tasks as appropriate. – bwDraco – 2017-02-21T23:07:55.880

The consistent 35 seconds per TRIM is a sign that you're constantly running out of spare space on the drive. It may be best to have TRIM run hourly. – bwDraco – 2017-02-21T23:13:02.180

I now changed it to hourly and enabled the write cache for the SSD in gnome-disks. Roughly 60 days have passed since I wrote my question, and I apparently wrote roughly 600 GB during that time. This means about 10 GB per day, which is more than I assumed I had written on average per day but only a third of the space I keep free. What do you mean by "from the standpoint of the SSD controller"? That only an integer number of blocks can be written at a time and the LBA write count doesn't care about this? (It ends in 17, which is a bad sign.) – UTF-8 – 2017-02-21T23:33:01.460

What the controller sees as free is not the same as what the OS sees as free. Perhaps it would be best to discuss it in chat? http://chat.stackexchange.com/rooms/118/root-access

– bwDraco – 2017-02-21T23:36:34.297

For the benefit of readers, I've bookmarked the chat discussion here.

– bwDraco – 2017-02-22T00:10:02.250