3

We've developed an app for embedded Linux clients (similar to the Raspberry Pi), and we are using 64 GB MLC NAND for storage. On our test devices we see a significant failure rate of roughly 1/3. The drives reach their maximum R/W cycle count after 6-8 months (instead of the expected 3-5 years). Journaling has been enabled because power loss can happen in production, and it seems likely to be the culprit. Could the journaling be responsible? Our app does not write that much data each day. If we disable it, how do we deal with data corruption in case of power loss?

Gerry
  • 31
  • 2

1 Answer

2

Using the default mount options, ext4 journals only metadata updates, not user data. This means disabling the journal will only marginally decrease wear on your disk, while exposing the device to filesystem corruption in case of power loss (with the obligatory fsck to recover).
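One way to confirm what the journal is actually doing is to inspect the filesystem features and the active mount options. A minimal sketch, assuming the root filesystem lives on a hypothetical `/dev/mmcblk0p2` (substitute your actual partition):

```shell
# Hypothetical device name; adjust for your board.
DEV=/dev/mmcblk0p2

# Confirm the filesystem has a journal (look for "has_journal")
# and check its size.
sudo tune2fs -l "$DEV" | grep -iE 'features|journal'

# Check the active ext4 mount options. Note that data=ordered is the
# default and may not be shown explicitly on recent kernels.
grep ' ext4 ' /proc/mounts
```

If `data=journal` appears in the mount options, full data journaling is on and every write is done twice, which would roughly double the wear.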

I would investigate what is writing so much data, and why. Then I would consider whether something can be moved to a tmpfs mount (but remember that tmpfs is volatile!).
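To identify the writers, the block-device write counters can be sampled over time, and a tmpfs entry can then divert the hottest volatile paths to RAM. A sketch, assuming an eMMC/SD-style device name (`mmcblk0`) and `/var/log` as an example candidate:

```shell
# Sample the sectors-written counter (field 7 of the block stat file,
# in 512-byte sectors) twice, one minute apart, to estimate write rate.
S1=$(awk '{print $7}' /sys/block/mmcblk0/stat)
sleep 60
S2=$(awk '{print $7}' /sys/block/mmcblk0/stat)
echo "$(( (S2 - S1) * 512 / 1024 )) KiB written in the last minute"

# If iotop is installed, cumulative per-process writes can pinpoint the culprit:
#   sudo iotop -aoP

# Example /etc/fstab line moving logs to RAM (contents lost on power loss!):
#   tmpfs  /var/log  tmpfs  defaults,noatime,size=64m  0  0
```

Frequent small writes to logs or databases can be amplified heavily by the NAND controller, so the raw rate measured here may understate the actual cell wear.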

shodanshok
  • 44,038
  • 6
  • 98
  • 162
  • Your explanation makes sense but I got the information from a manufacturer that journaling is not recommended as it will wear the cells prematurely. This is perplexing. – Gerry Apr 25 '19 at 17:33
  • Quality solid state storage endurance is measured in full drive writes per day. Journaling metadata is a tiny fraction of that. Either your storage has poor endurance or you are writing at very high volumes. – John Mahowald Apr 25 '19 at 18:03
  • @Gerry the key point here is that yes, the journal *will* cause additional wear on the flash cells, but a *metadata-only* journal is light enough that I would not disable it. The only exception is if your NAND is so cheap that it lacks proper wear leveling (i.e., it only provides zone/sector-based wear leveling, or none at all): in this case, the preallocated, continuously-written metadata journal can wear the same cells quite quickly. Can you provide the NAND's exact specifications? – shodanshok Apr 25 '19 at 18:21
  • @shodanshok The controller includes wear leveling. The NAND is not cheap; it's from a reputable manufacturer, a member of the Open NAND Flash Interface Working Group. They checked some of our faulty NANDs and told us that they have reached end of life, some of them after only 7 months. I don't have the exact specs, just a Toshiba-based 64 GB MLC, commercial grade. We formatted using defaults, so I'm assuming it's (data=ordered) and your assumption would be correct regarding the metadata, but I believe both metadata and actual data are grouped as a transaction unit and journaled. – Gerry Apr 25 '19 at 19:32
  • @JohnMahowald Storage is good quality. We are not writing high volumes at all; that's what makes it so weird. We do perform high-volume updates every once in a while (every 4 to 6 weeks), when the client downloads roughly a few thousand files of 3-4 MB each, and that's it. So it's high volume relative to our otherwise very low average, but as you can see, not a big load of data overall. The average storage usage is between 50 and 80%. – Gerry Apr 25 '19 at 19:35
  • @Gerry ext4 with default mount options journals only metadata updates; to journal data also, you *must* mount with "data=journal". Moreover, MLC NAND is good for at least 3000 r/w cycles, and with a size of 64 GB this means a ~192 TB lifespan for your device. So, to wear off the NAND, one of the following should be true: a) your application is very write-heavy; b) wear leveling only happens at the zone/sector level (i.e., each segment only covers 16 MB); c) the NAND controller has an extremely high write amplification (>100x). – shodanshok Apr 26 '19 at 13:28
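The endurance arithmetic in the last comment can be sketched as follows (a rough estimate, assuming the stated 3000 P/E cycles and 64 GB capacity, with decimal units and no write amplification):

```shell
# Rated endurance: cycles x capacity
PE_CYCLES=3000
CAPACITY_GB=64
TOTAL_GB=$((PE_CYCLES * CAPACITY_GB))    # 192000 GB = ~192 TB raw endurance

# Implied sustained write rate if the device wore out in ~7 months
DAYS=210
GB_PER_DAY=$((TOTAL_GB / DAYS))          # ~914 GB/day
echo "$TOTAL_GB GB total endurance; ~$GB_PER_DAY GB/day to exhaust it in $DAYS days"
```

Roughly 900 GB/day is far beyond the workload described in the question, which is why poor wear leveling or extreme write amplification are the remaining suspects.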