Avoiding extreme fragmentation of compressed system images on NTFS

8

6

Problem explanation

I'm storing Windows disk images created with wbadmin on an NTFS drive, and I found that compressing them with NTFS compression gives 1.5-2× space savings while still keeping them fully usable for restoring.

But in the process of compressing, the file gets insanely fragmented, usually to more than 100,000 fragments for a system disk image.

With such fragmentation, defragmenting takes a very long time (multiple hours per image). Some defragmenters can't handle it at all; they just skip the file or crash.

The source of the problem, I think, is that the file is compressed in chunks which get saved separately.

The question

Is there a good (fast) way to get the image file defragmented while keeping it compressed (or to compress it without causing extreme fragmentation)? It could be a utility that quickly defragments a file into contiguous free space, or a utility (or method) to create a non-fragmented compressed file from an existing non-compressed one.

Remarks based on comments/answers:

  1. External (to the Windows kernel) compression tools are not an option in my case. They can't decompress the file on the fly (to decompress a 10 GB file I need 10 GB free, which isn't always at hand; it also takes a lot of time), and they aren't accessible when the system is booted from a DVD for recovery (which is exactly when I need the image available). Please stop suggesting them unless they create a transparently compressed file on NTFS, like compact.exe does.
  2. NTFS compression is not that bad for system images. It's rather good, except for the fragmentation. Decompression does not take much CPU time, yet it reduces the I/O bottleneck, which gives a performance boost in the right circumstances (a non-fragmented compressed file with a significant ratio).
  3. Defragmentation utilities defragment files with no regard to whether they are compressed. The only problem is the number of fragments, which causes the defragmentation to fail whether the fragmented file is compressed or not. If the number of fragments isn't too high (around 10,000 is already fine), a compressed file will be defragmented, and it stays compressed and intact.
  4. The NTFS compression ratio can be good, depending on the files. System images are usually compressed to at most 70% of their original size.

    A pair of screenshots for those who don't believe it; of course, you can run your own tests.

  5. I actually did restorations from NTFS-compressed images, both fragmented and non-fragmented, and it works; please either trust me or just check it yourself. Note: as I found about a year ago, it does not work in Windows 8.1. It still works in Windows 7, 8, and 10.

Expected answer:

a working method or a program for Windows that will either:

  1. compress a file (with NTFS compression, keeping it accessible to Windows Recovery) without creating a lot of fragments (maybe to another partition, or by making a compressed copy; it must be at least 3× faster on an HDD than the compact + defrag baseline sketched right after this list),

    or

  2. quickly (at least 3× faster than Windows defrag on an HDD) defragment a devastatingly fragmented file, such as one containing 100K+ fragments (it must stay compressed after the defrag).
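For reference, the compact + defrag baseline from point 1 looks roughly like this (a sketch only; the path is a placeholder, not my actual layout):

    rem In-place NTFS compression of an existing image (this is what produces the 100K+ fragments)
    compact /c "D:\Backups\WindowsImageBackup\image.vhd"

    rem ...followed by a defragmentation pass, e.g. with the built-in volume defragmenter
    defrag D: /U /V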

LogicDaemon

Posted 2013-12-26T12:24:05.843

Reputation: 1 681

@DoktoroReichard it depends on the content of the files. Text files and sparse files will have a very good compression ratio. Typically I avoid files that are already compressed, like ZIP files, images, and audio/video files... and after compressing I often see a 10-20% decrease in size – phuclv – 2016-11-28T14:34:49.767

I find it quite odd for NTFS to compress that much (as real-world tests show only a 2 to 5% decrease). Also, NTFS has some safeguards regarding file fragmentation (such as journaling). How big are the files (before and after)? Also, from the picture, it seems Defraggler can't defragment compressed files. – Doktoro Reichard – 2013-12-26T13:54:42.740

1

  • You can make a Windows image yourself and compress it. It really is easily compressed by at least 1.5× (to 60-70% of the original size).
  • Yes, Defraggler and other defragmenters CAN defragment compressed files. This is real-world experience.

  – LogicDaemon – 2013-12-26T20:39:56.180

    Also, compression ratio is off-topic, but here are real images of real freshly-installed Windows 7 Professional systems (mostly 32-bit, 3 or 4 64-bit) with a standard set of software: http://i.imgur.com/C4XnUUl.png

    – LogicDaemon – 2013-12-26T20:47:22.743

    Answers

    4

    Avoiding fragmentation

    The secret is to not write uncompressed files on the disk to begin with.

    Indeed, after you compress an already existing large file it will become horrendously fragmented due to the nature of the NTFS in-place compression algorithm.

    Instead, you can avoid this drawback altogether by making the OS compress the file's contents on the fly, before writing it to the disk. This way compressed files are written to the disk like any normal file - without unintentional gaps. For this purpose you need to create a compressed folder. (The same way you mark files to be compressed, you can mark folders to be compressed.) Afterwards, all files written to that folder will be compressed on the fly (i.e. written as streams of compressed blocks). Files compressed this way can still end up somewhat fragmented, but it will be a far cry from the mess that in-place NTFS compression creates.
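    For example, a minimal sketch of this approach from the command line (folder and file names are placeholders, not taken from the question):

        rem Create the target folder and mark it compressed BEFORE anything is written into it
        md D:\Backups\Compressed
        compact /c D:\Backups\Compressed

        rem Files copied (or written by a backup tool) into this folder are now compressed on the fly
        copy E:\Images\system-image.vhd D:\Backups\Compressed\

        rem Check the compression state and on-disk size of the result
        compact D:\Backups\Compressed\system-image.vhd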

    Example

    NTFS compressed a 232 MB system image to 125 MB:

    • In-place compression created a whopping 2,680 fragments!
    • On-the-fly compression created 19 fragments.

    Defragmentation

    It's true that NTFS-compressed files can pose a problem for some defragmentation tools. For example, a tool I normally use can't handle them efficiently - it slows down to a crawl. Fret not: the trusty old Contig from Sysinternals does the job of defragmenting NTFS-compressed files quickly and effortlessly!
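    For example, a sketch with a placeholder file name:

        rem Report how fragmented the compressed image currently is
        contig -a D:\Backups\Compressed\system-image.vhd

        rem Defragment just that one file; it remains NTFS-compressed afterwards
        contig -v D:\Backups\Compressed\system-image.vhd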

    Slider2k

    Posted 2013-12-26T12:24:05.843

    Reputation: 41

    2

    Reading the article on Wikipedia about NTFS compression:

    Files are compressed in 16-cluster chunks. With 4 kB clusters, files are compressed in 64 kB chunks. If the compression reduces 64 kB of data to 60 kB or less, NTFS treats the unneeded 4 kB pages like empty sparse file clusters—they are not written.

    This allows for reasonable random-access times - the OS just has to follow the chain of fragments.

    However, large compressible files become highly fragmented since every chunk < 64KB becomes a fragment.

    First things first: WBAdmin is in essence a backup utility that can restore a full system, so it's expected that its output file is large (> 4 GB). As shown by the quote, large files become fragmented quickly. This is due to the way NTFS compresses: not whole files, but individual chunks of clusters. (A 10 GB image, for instance, spans roughly 160,000 of those 64 kB chunks, which matches the 100,000+ fragments reported in the question.)

    A good analogy is a cake split across several boxes. Compression squeezes each piece of cake, leaving empty space in its box. Because the pieces are no longer packed back to back, the file they make up ends up fragmented.

    I am still skeptical about NTFS giving that kind of compression ratio. According to a test made by MaximumCompression on multiple files, NTFS gets the lowest score in compression ratio, a measly 40%. From personal experience I can tell you it's much lower than that; in fact, it's so low that I never bothered to use it, nor have I seen its effects.

    The best way to avoid fragmentation is to stop relying on NTFS compression. Most defraggers will fail to expand or move the compressed files. Even if they somehow did, NTFS might not be able to expand the files again; and even if it could, the defragmentation pass would have filled in the space that compression left over (those spare 4 kB pages), so the expansion would fragment the files once more, because the data would no longer fit into the previously contiguous clusters.

    That being said, if you don't need to read the file constantly, use one of the formats recommended in the link above. 7z and RAR are quite efficient (i.e. they compress to high ratios in a decent amount of time). If you care about space and not about time, choose a PAQ-type algorithm (although you will spend a very long time compressing and decompressing the files). There are also speedier algorithms available.
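    For instance, a rough sketch with 7-Zip (file names are placeholders; tune the options to taste):

        rem Maximum-ratio 7z archive of the image (slow, but far smaller than NTFS compression)
        7z a -t7z -mx=9 D:\Archives\system-image.7z E:\Images\system-image.vhd

        rem Verify the archive afterwards
        7z t D:\Archives\system-image.7z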

    If you do need to read the file constantly, don't compress it at all. NTFS is just too damn messy.

    Doktoro Reichard

    Posted 2013-12-26T12:24:05.843

    Reputation: 4 896

    NTFS can achieve quite a good compression ratio for files with many repeated patterns, like text files. Moreover, since Windows 10 you can increase the ratio by choosing a different algorithm.

    – phuclv – 2017-04-29T06:03:56.053
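    Presumably this refers to the extra algorithms that compact.exe exposes on Windows 10 (XPRESS4K/8K/16K and LZX). A hedged sketch with a placeholder file name - note that, as far as I know, files compressed this way are still read transparently but are written back uncompressed when modified:

        rem Recompress a file with the stronger LZX algorithm (Windows 10 and later)
        compact /c /exe:lzx D:\Backups\Compressed\system-image.vhd

        rem Undo it later if needed
        compact /u /exe D:\Backups\Compressed\system-image.vhd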

    This source (referenced in Wikipedia) actually contains good, albeit technical, information about the whole process. – Doktoro Reichard – 2013-12-26T16:28:48.833

    compression ratio is off-topic, but here are real images of real freshly-installed Windows 7 Professional systems (mostly 32-bit, 3 or 4 64-bit) with a standard set of software: http://i.imgur.com/C4XnUUl.png

    With compression I really do conserve disk space. And I can defragment files afterwards, but it takes too long.

    – LogicDaemon – 2013-12-26T20:48:11.330

    And About "avoiding" NTFS compression: I'm doing this for years, and it works almost flawlessly, except for the fragmentation. It really shouldn't be used for frequently-accessed files because of performance issues, but most executables and text is compressed very well. Also, sequentally-written files, like logs don't get fragmented that much still being compressed. And, of course, compressed system images are perfectly restorable, done that many times, and you don't need to explicitly "expand" them, this is obviously done automatically by NTFS driver by-block in-memory. – LogicDaemon – 2013-12-26T20:54:47.347

    1

    I can't really argue with results (and, for the record, I did state the actual test results, which were similar to yours, along with my personal experience, which seems to be dated). The link I gave in a comment does contain information about why it is not possible to avoid the fragmentation problem. The Wikipedia article also states that at boot time Windows has not yet loaded the NTFS compression library; I'm not sure about the recovery process either. This might provide some insight.

    – Doktoro Reichard – 2013-12-26T22:57:09.560

    Thanks then, but it's not the answer I wanted to get :) I was hoping there would be a way to get a compressed but non-fragmented file, by writing it sequentially or in blocks the same size as the compression chunk. Or that there is a really efficient defragmentation program which can quickly defragment single files when there is enough contiguous free space for them. – LogicDaemon – 2013-12-30T21:14:57.807

    0

    While not exactly what the OP asked for, I have had good experience with third-party software from Paragon. NTFS by design trashes your filesystem quite horribly when you compress (or sometimes even write) files. This extends to consuming multiple MFT entries, and... it's bad. Microsoft's NTFS driver doesn't even clean this up when a file gets defragmented; hence, third-party tools are required. Paragon lets you either boot it as an OS in its own right (from an ISO image) or install it into another Windows OS with access to the target filesystem. Then you can defragment both the MFT and the files. To my knowledge this is the only way to fix this deficiency in NTFS, short of reformatting the volume.

    (I have no relation to the tool or its creator other than it's the only thing I found to actually work)

    Today, two years after the question was asked, I'd rather recommend deduplication - it can give you upwards of 90% disk savings if the images are only "a little" different. A Windows Server 2016 Nano Server inside a VM works really well, but I suspect even FreeNAS or anything else using ZFS could handle it.

    Mike

    Posted 2013-12-26T12:24:05.843

    Reputation: 44

    Any URL, or a more precise name than just "Paragon"? Google does not understand me. I know a software company named Paragon, but I know nothing about any of its products that will defragment NTFS files (there is an MFT defrag tool, but I don't have MFT problems). And thanks for the ZFS suggestion, I'll look into it, but again, I need to be able to boot it in the field for quick in-place recovery. – LogicDaemon – 2016-12-08T18:50:23.827

    -1

    Windows has lately been treating ZIP files like folders. ZIP files can be compressed more tightly than NTFS-compressed files and are not inherently fragmented, unlike files under NTFS compression.

    Why not test one of your disk images by compressing it with 7-Zip in ZIP format and seeing whether it is directly usable for restoring?

    If not, play with the 7-Zip compression parameters to maximise compression using whatever format is best, e.g. 7z. You can increase the compression far beyond NTFS and therefore make much more space available on your disk - though it would be fastest to decompress to a second physical disk or to RAM, preferably on a different controller and I/O cable.
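    A rough sketch of the kind of test being suggested (paths are placeholders):

        rem Pack one image as a plain ZIP
        7z a -tzip -mx=9 D:\Archives\system-image.zip E:\Images\system-image.vhd

        rem Extract it again (ideally to a second disk or a RAM disk) before attempting a restore
        7z x D:\Archives\system-image.zip -oR:\Restore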

    FWIW, compression pays off for system disks and non-media files on non-SandForce SSDs - less wear and tear on the SSD, more space, and faster I/O for files that aren't already compressed. See http://www.tomshardware.com/reviews/ssd-ntfs-compression,3073-9.html

    Video, graphics, and other compressed data files (like .XLSX) are already very compressed, so no benefit to NTFS compression there. Nor for databases or Outlook mail with random updates. But executables, txt, html, etc., files benefit greatly.

    Compression is also always a win for small files: e.g., if a file is under 64 kB once compressed, it occupies only one fragment. The only hassle would be recovery if there are disk problems.
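    If you want to apply that selectively, a hedged example of compressing only file types that compress well (the directory is a placeholder):

        rem Compress only highly compressible file types under a given tree
        compact /c /s:C:\Tools *.exe *.dll *.txt *.html *.log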

    tOM

    Posted 2013-12-26T12:24:05.843

    Reputation: 1

    1 Man, you're wrong in so many ways… Mainly, Windows has never treated ZIP files like folders. There are technical reasons why this isn't ever going to happen (basically, only sequential access to the files' contents is possible). Explorer, although it allows managing ZIPs somewhat like folders (in a very limited fashion; it doesn't even extract the other files when I open an HTML file from a ZIP), isn't Windows. And in the question I explained why separate utilities, be it Explorer or 7-Zip, don't fit (see "Remarks based on comments/answers"). – LogicDaemon – 2014-12-27T08:48:46.143

    BTW, a note about "less wear & tear on the SSD": if the SSD doesn't have a large enough cache, it's twice the wear and tear, because when saving a compressed file Windows first writes the non-compressed clusters, then compresses them and writes the compressed ones (and then removes the non-compressed ones). That is what causes the fragmentation my question is about, after all. The Samsung SSDs at the link (which is broken, BTW; remove "les" from its tail) do have a large enough cache, though. – LogicDaemon – 2014-12-31T07:36:04.773

    No, Windows never treats ZIP and CAB files as folders. You can view their contents (file/folder names) directly in My Computer, but you can't access them transparently like a disk image or an NTFS-compressed file. You still have to extract a file from the archive somewhere to view or edit it. – phuclv – 2016-11-28T14:38:22.317