How does NTFS compression affect performance?

59

14

I've heard that NTFS compression can reduce performance due to extra CPU usage, but I've read reports that it may actually increase performance because of reduced disk reads. How exactly does NTFS compression affect system performance?

Notes:

  • I'm running a laptop with a 5400 RPM hard drive, and many of the things I do on it are I/O bound.
  • The processor is an AMD Phenom II with four cores running at 2.0 GHz.
  • The system is defragmented regularly using UltraDefrag.
  • The workload is mixed read-write, with reads occurring somewhat more often than writes.
  • The files to be compressed include a selected subset of personal documents (not the full home folder) and programs, including several (less demanding) games and Visual Studio (which tends to be I/O bound more often than not).

bwDraco

Posted 2012-04-12T17:12:10.953

Reputation: 41 701

12I think the only right answer is "measure it on your system". – user541686 – 2012-04-12T17:15:39.480

I think this should stay a generic question. The CPU is faster than memory nowadays; let's assume that. About the performance? No idea, but I'm curious too. – Apache – 2012-04-12T17:25:51.670

1What system is it? How many cores do you have? Compression being a CPU-intensive operation, will you have some extra CPU headroom relative to the hard drive speed for the operations you are going to be doing? Consider the effect on power consumption and temps, the compressibility of the data, and how much is read versus how much is written. Compressing it to begin with is slow, but reading it back (depending) should be faster by easily measurable amounts. – Psycogeek – 2012-04-12T17:30:50.450

Related (but with slightly different circumstances; specific to a folder with many icons): http://superuser.com/questions/38605/does-compressing-files-in-xp-slow-things-down

– bwDraco – 2012-04-12T17:34:37.370

File sizes: isn't the Windows compression done in 64K chunks? And what is the cluster size you're using now? (Which is sort of the chunk size you're talking in now.) What about recoverability? – Psycogeek – 2012-04-12T17:40:30.127

1Btw, one thing you can try is defrag. I heard wonders about UltimateDefrag, but I have never tried it so far. (Between Diskeeper and PerfectDisk, I use the latter, since Diskeeper stopped releasing new versions, etc.) – Apache – 2012-04-12T17:43:28.840

I doubt defragging a compressed system will improve performance, because the real data is compressed and stored by the driver in its own order. Defragging the high-level data will not organise the compressed data. AND, why do you want to compress it anyway? Usually people encrypt on the fly, as HDDs are massive. What size HDD have you got? 10 GB? I would suggest NOT compressing, as it's unnecessary processor time, especially if playing games. – Piotr Kula – 2012-04-16T11:47:19.953

@ppumkin compressed files are stored in the same way, albeit in slightly bigger "units". You can defragment a compressed file system, and it will improve performance. – Breakthrough – 2012-04-16T12:41:35.657

An important factor is the compression rate that you obtain; but I think that the speed gain is not significant compared to the other advantages/disadvantages of compression. And it may be more useful for preserving SSDs than for speeding up normal HDDs. – clabacchio – 2012-04-16T12:45:20.757

@Breakthrough you're right; I was saying that since SS memories (also Flash disks) have limited write cycles, it MAY make sense to compress data. In this case, I think that the speed gain (if any) is secondary compared to other factors. For instance, you may gain .5 ms but increase the load on the CPU during a tough task. It's hard to say if it's convenient, and you'll never gain minutes. That's my point. – clabacchio – 2012-04-16T12:51:20.237

@clabacchio I see what you're saying now, and I suppose it would theoretically increase the SSD lifespan. – Breakthrough – 2012-04-16T12:52:45.940

@Breakthrough it's also true that if you buy an SSD you want speed, and lifespan is secondary. But it was just worth saying imho :) – clabacchio – 2012-04-16T12:54:24.070

@Breakthrough So what you are saying is that a compressed file system is a file on the hard drive and can be physically accessed and defragmented? I don't think so. Grouping uncompressed data on top of a compressed file system has no effect on the compressed file system unless the driver of that file system actively groups data! For example, ZFS actively maintains data fragments even if they are spread across several HDDs. NTFS in itself does not, and neither will the compressed version of NTFS. – Piotr Kula – 2012-04-16T13:05:31.273

@ppumkin see this article. The 64 kB chunks themselves can get fragmented on-disk. Placing them sequentially will improve throughput by eliminating seek time between subsequent compression units.

– Breakthrough – 2012-04-16T13:16:22.370

Answers

36

I've heard that NTFS compression can reduce performance due to extra CPU usage, but I've read reports that it may actually increase performance because of reduced disk reads.

Correct. Suppose your CPU, using some compression algorithm, can compress at C MB/s and decompress at D MB/s, and your hard drive has write speed W and read speed R. So long as C > W, you get a performance gain when writing, and so long as D > R, you get a performance gain when reading. This is a drastic assumption in the write case, since the Lempel-Ziv algorithm (as implemented in software) has a non-deterministic compression rate (although it can be constrained with a limited dictionary size).
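
To make those inequalities concrete, here is a toy model (my own back-of-the-envelope sketch, not anything NTFS actually computes) that assumes compression and disk I/O are perfectly pipelined; the C, D, W, R and ratio figures below are made-up examples, not measurements:

```python
# Toy throughput model for a compressed volume, assuming perfectly
# pipelined CPU (de)compression and disk I/O. All figures are MB/s of
# *uncompressed* data; "ratio" is compressed size / original size.

def effective_write(C, W, ratio):
    # Limited by the compressor (C) and by how fast the disk can
    # absorb the smaller, compressed stream (W / ratio).
    return min(C, W / ratio)

def effective_read(D, R, ratio):
    # Limited by the decompressor (D) and by how fast the disk can
    # deliver the compressed stream (R / ratio).
    return min(D, R / ratio)

# Hypothetical numbers for a 5400 RPM laptop drive and a mid-range CPU:
C, D = 150.0, 400.0   # compress / decompress throughput, MB/s (assumed)
W, R = 60.0, 70.0     # raw disk write / read throughput, MB/s (assumed)
ratio = 0.6           # data shrinks to 60% of its original size (assumed)

print("write: %5.1f MB/s vs. %5.1f MB/s uncompressed" % (effective_write(C, W, ratio), W))
print("read:  %5.1f MB/s vs. %5.1f MB/s uncompressed" % (effective_read(D, R, ratio), R))
```

With those (assumed) numbers both inequalities hold, so reads and writes come out ahead; plug in your own measurements to see which side of the trade-off you land on.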

How exactly does NTFS compression affect system performance?

Well, exactly by relying on the above inequalities. So long as your CPU can sustain a compression/decompression rate above your HDD's read/write speed, you should experience a speed gain. However, this does have an effect on large files, which may experience heavy fragmentation (due to the algorithm) or not be compressed at all.

This may be due to the fact that the Lempel-Ziv algorithm slows down as the compression moves on (since the dictionary continues to grow, requiring more comparisons as bits come in). Decompression is almost always the same rate, regardless of the file size, in the Lempel-Ziv algorithm (since the dictionary can just be addressed using a base + offset scheme).

Compression also impacts how files are laid out on the disk. By default, a single "compression unit" is 16 times the cluster size (so most NTFS filesystems with 4 kB clusters use 64 kB chunks to store compressed files), and it does not grow past 64 kB. However, this can affect fragmentation and space requirements on-disk.
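
If you want to confirm the cluster size (and hence the compression-unit size) on your own volume, the stock fsutil tool reports it. The sketch below just shells out to fsutil and assumes the English "Bytes Per Cluster" label, which can vary between Windows versions and locales:

```python
# Query the NTFS cluster size via "fsutil fsinfo ntfsinfo" and derive the
# compression-unit size (16 clusters). Needs Windows and, on most systems,
# an elevated prompt; assumes the English "Bytes Per Cluster" field name.
import subprocess

def cluster_size(volume="C:"):
    out = subprocess.run(["fsutil", "fsinfo", "ntfsinfo", volume],
                         capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        if "Bytes Per Cluster" in line:
            value = line.split(":", 1)[1].split()[0]
            return int(value.replace(",", ""))
    raise RuntimeError("cluster size not found in fsutil output")

if __name__ == "__main__":
    cs = cluster_size("C:")
    print("cluster size: %d bytes -> compression unit: %d kB" % (cs, cs * 16 // 1024))
```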

As a final note, latency is another interesting point of discussion. While the actual time it takes to compress the data does introduce latency, when the CPU clock speed is in gigahertz (i.e. each clock cycle is less than 1 ns), the latency introduced is negligible compared to hard drive seek times (which are on the order of milliseconds, or millions of clock cycles).


To actually see if you'll experience a speed gain, there are a few things you can try. The first is to benchmark your system with a Lempel-Ziv based compression/decompression algorithm. If you get good results (i.e. C > W and D > R), then you should try enabling compression on your disk.
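
For example, a quick way to get rough C and D figures is to time zlib on a file that is representative of your workload. DEFLATE is an LZ77-family codec like NTFS's LZNT1, but not the same one, so treat the numbers strictly as a ballpark; the file name below is a placeholder:

```python
# Rough compression/decompression throughput benchmark using zlib
# (DEFLATE, an LZ77-family codec; not NTFS's LZNT1, so only a ballpark).
import time
import zlib

def throughput_mb_s(path, level=1):
    data = open(path, "rb").read()
    mb = len(data) / 1e6

    t0 = time.perf_counter()
    packed = zlib.compress(data, level)
    c = mb / (time.perf_counter() - t0)        # compression, MB/s

    t0 = time.perf_counter()
    zlib.decompress(packed)
    d = mb / (time.perf_counter() - t0)        # decompression, MB/s

    return c, d, len(packed) / len(data)       # plus the compression ratio

if __name__ == "__main__":
    c, d, ratio = throughput_mb_s("sample.bin")   # placeholder file name
    print("C = %.0f MB/s, D = %.0f MB/s, ratio = %.2f" % (c, d, ratio))
```

W and R can come from any disk benchmark you trust; compression is only worth trying if both inequalities hold with some margin.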

From there, you might want to do more benchmarks of actual hard drive performance. A truly important benchmark (in your case) would be to see how fast your games load and how fast your Visual Studio projects compile.

TL;DR: Compression might be viable for a filesystem holding many small files that need high throughput and low latency. Large files are (and should be) left uncompressed due to performance and latency concerns.

Breakthrough

Posted 2012-04-12T17:12:10.953

Reputation: 32 927

The fragmentation that NTFS compression creates (and that it adds to when a file is modified) will easily wipe out any performance increase. If you have a very compressible data set that won't be modified often and you defragment after compression, it can be a net gain. Modification after compression will cause nasty fragmentation. Re: benchmarks: even slow CPUs today are fast at LZ. The fragmentation issue is the biggest problem by far. It's a classic case where an optimization is only useful in limited contexts. Choose what to NTFS-compress very carefully and it will be an overall win. – Jody Lee Bruchon – 2015-12-04T12:19:37.190

1And what about SSDs? – Violet Giraffe – 2016-10-02T15:55:27.810

How could I measure C, D, W, and R? – Geremia – 2017-07-19T13:45:41.497

1I'd appreciate some practical, typical examples for those abstractions "C > W and D > R". Is it, for example, beneficial to compress "Program Files" and/or "Windows" on a 4-core laptop with an HDD? And with an SSD? Will battery consumption be affected significantly? – kxr – 2019-04-13T20:33:40.763

Can you link any good Lempel-Ziv based compression/decompression benchmarks? – user1075375 – 2013-08-08T21:20:30.290

7

You have quite a slow disk, so your question does have merit. NTFS compression is processor-intensive and is tuned for speed rather than compression efficiency.

I would expect that you would see a (very) small improvement for read operations. However, when accessing a file residing in the system cache you will have a performance hit, since it will have to be decompressed again on every access.

You will of course see that write operations will be slower because of the additional compression.

Copying files on this same NTFS disk requires decompression and compression, so these will suffer the most.

NTFS Compression can also increase fragmentation significantly, but this is not a problem for most 'typical' computers under 'typical' work loads.

Many types of files, such as JPEG images, video, or .zip files, are essentially incompressible, so these files will be slower to use without any space saved.

Files smaller than one disk cluster (typically 4K) are not compressed, as there is no gain. However, an even smaller cluster size is sometimes advised when compressing the entire volume.
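
If you want to pick candidate files before flipping the compression attribute, one quick heuristic is to test-compress a small sample of each file and skip anything tiny or already compressed. The sketch below uses zlib as a stand-in for NTFS's LZNT1, so the ratios are only approximate, and the folder path and threshold are just examples:

```python
# Estimate which files are worth NTFS-compressing by test-compressing a
# small sample with zlib (a stand-in for LZNT1; ratios are approximate).
import os
import zlib

CLUSTER = 4096          # typical NTFS cluster size (assumed)
SAMPLE = 256 * 1024     # bytes sampled from the start of each file

def worth_compressing(path, threshold=0.8):
    size = os.path.getsize(path)
    if size <= CLUSTER:                  # too small: NTFS won't compress it anyway
        return False
    with open(path, "rb") as f:
        sample = f.read(SAMPLE)
    ratio = len(zlib.compress(sample, 1)) / len(sample)
    return ratio < threshold             # only bother if it shrinks noticeably

if __name__ == "__main__":
    for root, _, files in os.walk(r"C:\Projects"):   # placeholder folder
        for name in files:
            path = os.path.join(root, name)
            print(("compress " if worth_compressing(path) else "skip     ") + path)
```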

NTFS compression is recommended for relatively static volumes or files. It is never recommended for system files or the Users folder.

But as hardware configuration varies from one computer model to another, depending on disk, bus, RAM and CPU, only testing will tell what the exact effect of compression will be on your computer model.

harrymc

Posted 2012-04-12T17:12:10.953

Reputation: 306 093

5

I explained it here in the Wikipedia entry for NTFS:


NTFS can compress files using the LZNT1 algorithm (a variant of LZ77[23]). Files are compressed in 16-cluster chunks. With 4 kB clusters, files are compressed in 64 kB chunks. If the compression reduces 64 kB of data to 60 kB or less, NTFS treats the unneeded 4 kB pages like empty sparse file clusters—they are not written. This allows reasonable random-access times. However, large compressible files become highly fragmented, as every 64 kB chunk then becomes a smaller fragment.[24][25] Compression is not recommended by Microsoft for files exceeding 30 MB because of the performance hit.[citation needed]

The best use of compression is for files that are repetitive, written seldom, usually accessed sequentially, and not themselves compressed. Log files are an ideal example. Compressing files that are less than 4 kB or already compressed (like .zip or .jpg or .avi) may make them bigger as well as slower.[citation needed] Users should avoid compressing executables like .exe and .dll (they may be paged in and out in 4 kB pages). Compressing system files used at bootup like drivers, NTLDR, winload.exe, or BOOTMGR may prevent the system from booting correctly.[26]

Although read–write access to compressed files is often, but not always [27] transparent, Microsoft recommends avoiding compression on server systems and/or network shares holding roaming profiles because it puts a considerable load on the processor.[28]

Single-user systems with limited hard disk space can benefit from NTFS compression for small files, from 4 kB to 64 kB or more, depending on compressibility. Files less than 900 bytes or so are stored with the directory entry in the MFT.[29]

The slowest link in a computer is not the CPU but the speed of the hard drive, so NTFS compression allows the limited, slow storage space to be better used, in terms of both space and (often) speed.[30] (This assumes that compressed file fragments are stored consecutively.)


I recommend compression only for files which compress to 64 KB or less (i.e. one compression unit). Otherwise, your file will consist of many fragments of 64 KB or less.
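
To get a feel for how many pieces a given file would end up in, you can emulate the 64 kB chunking yourself. The sketch below substitutes zlib for LZNT1, so the per-chunk savings are only an approximation, and the file name is a placeholder:

```python
# Emulate NTFS's 64 kB compression units for one file: each 64 kB of
# logical data is compressed independently and is only stored compressed
# if that saves at least one 4 kB cluster. zlib stands in for LZNT1 here,
# so the counts are approximate.
import zlib

UNIT = 64 * 1024
CLUSTER = 4096

def unit_report(path):
    total_units = compressed_units = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(UNIT)
            if not chunk:
                break
            total_units += 1
            packed = zlib.compress(chunk, 1)
            # NTFS only keeps the compressed form if it frees >= 1 cluster.
            if len(packed) <= len(chunk) - CLUSTER:
                compressed_units += 1
    return total_units, compressed_units

if __name__ == "__main__":
    total, saved = unit_report("example.log")   # placeholder file name
    print("%d compression units, %d of them stored compressed" % (total, saved))
```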

MyDefrag does a better job of defragging.

TomTrottier

Posted 2012-04-12T17:12:10.953

Reputation: 51

My experience with UltraDefrag is that it does a decent job, giving a more complete defrag than the Windows built-in defragmenter, but as far as I know, it isn't exactly as smart as MyDefrag. I'm using the version 6 beta, which has a few bugs and unimplemented features, but is much faster than previous versions. – bwDraco – 2012-12-23T01:08:13.477

1

It will make operations slower. Unfortunately, we cannot measure exactly how much or how little it will affect your system. When a compressed file gets opened, it takes processor power to decompress the file so the system can use it; when you are done with it and hit Save, it uses more processor power to compress it again. Only you can measure the performance, though.

Canadian Luke

Posted 2012-04-12T17:12:10.953

Reputation: 22 162

4I think you missed the entire point of the question. There's a trade-off between taking longer to compress/uncompress data, and taking less time to read data off the disk (by virtue of reading less data). So your assertion isn't guaranteed. An obvious example where compression can easily be a win is when you're reading off a network file system. With a local file system, it's less clear, but not guaranteed to go one way or the other. – jjlin – 2012-04-12T19:09:06.890

@jjlin Do you have any examples of when it's faster? – Canadian Luke – 2012-04-12T19:58:43.590

@Luke let's assume your CPU, using some compression algorithm, can compress at C MB/s and decompress at D MB/s, and your hard drive has write speed W and read speed R. So long as C > W, you get a performance gain when writing, and so long as D > R, you get a performance gain when reading. – Breakthrough – 2012-04-16T11:52:43.597

@Luke it is faster when using it on slow drives, like old IDE drives or USB 1.0 pen drives. – kurast – 2013-04-01T16:02:12.220

-2

Anybody who sees this today should be aware that, in the case of video games (yes, even regularly patched ones), enabling compression on the drive or folder can decrease load times, even on slower CPUs of today, and even on SSDs (other than the fastest ones most people don't have). You need to defrag regularly, though, and I strongly recommend buying PerfectDisk. Run its "Smart Aggressive" defrag AFTER compression and leave its automatic fragmentation-prevention feature enabled; it keeps an eye on activity and auto-optimizes to avoid fragmentation, at very little to no performance hit (I tested this all the way back to the old first-gen quad cores from both AMD and Intel, on modern Windows, recently in fact).

Many game files compress insanely well; some games have files that take up disk space despite being mostly blank. One game I compressed a while back went from 6 GB in one of its folders to under 16 MB (wish I was kidding; talk about wasted space and wasted I/O).

I compressed a buddy's Steam folder a while back. It took 4 days to compress (it's on a 4 TB drive and it started 3/4 full); when it was done, he was using around 1/3 of the drive total. The defrag took another day (but it started out horribly fragmented because he had never defragged it, ever, despite having multiple MMOs on it and loads of Steam/Uplay/Origin/etc. games).

DO NOT compress your pictures/images folders. It won't do any good and will just make accessing them slower on slow systems (you won't even notice on a half-decent rig, though).

I have compressed my drives on every system since NT4, BUT selectively: I will actually decompress folders where compression does more harm than good. It's the "best practices" we came up with way back in the day as gamers, geeks, and "IT" guys (before that was a term), and it has stayed true. Honestly, I wish there were a more fine-tuned way to compress drives/data; there used to be a tool, not free but affordable, that got much better compression results without compressing any data that shouldn't be compressed.

Anyway, even many older dual-core systems actually benefit overall if you: 1. run CCleaner; 2. run chkdsk /f from an elevated command prompt (type y, then restart and let the check run); 3. compress the drive; 4. defrag with either MyDefrag or, better, PerfectDisk (this will take time); 5. find any folders containing large files, pictures, or other content that doesn't compress well (or at all) and decompress those folders or just those files. My experience is that you rarely have to defrag again after this last step, but it's best to check.
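
If you want to script the selective part of step 5, the stock Windows compact tool can be driven from a script. The sketch below compresses a folder tree and then un-compresses file types that rarely benefit; the folder path and extension list are just examples:

```python
# Selectively apply NTFS compression with the stock "compact" tool:
# compress a folder tree, then undo compression for already-compressed
# formats. Run from an elevated prompt; folder and extensions are examples.
import subprocess

FOLDER = r"D:\Games"                                   # placeholder target
SKIP = ["*.zip", "*.rar", "*.jpg", "*.png", "*.mp4", "*.mkv"]

# Compress everything under FOLDER (/c), recursively (/s), quietly (/q),
# continuing past errors (/i).
subprocess.run(["compact", "/c", "/s:" + FOLDER, "/q", "/i"], check=True)

# Then un-compress the formats that are already compressed on their own.
subprocess.run(["compact", "/u", "/s:" + FOLDER, "/q", "/i"] + SKIP, check=True)
```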

I get why some people are against compression, but having tested it, when used properly on SSD or HDD, and especially on slow old HDDs and SSDs, compression can seriously help not only save space but also performance. Even most older dual cores can handle the average compress/decompress cycle faster than the drive in those systems can move the data. Having tested this, first-gen and cheaper older-design SSDs can benefit from compression too, though not as much as slower HDDs in most cases. A buddy has a netbook with a VERY slow, hard-to-replace SSD in it, plus a much easier-to-access SSD slot, but the stupid thing CANNOT boot from the added SSD without physically removing the other one (horrible BIOS, but for what the unit is, it's actually nice and more powerful than it looks, aside from the slow SSD being installed in such a way that you have to take the whole thing apart to get to it). Compressing that drive and keeping just Windows and the most basic apps (like Office) on the slow SSD actually sped it up, even in read/write, because its CPU otherwise ends up waiting for that SSD; it doesn't for the faster one he installed. I suggested just putting the boot loader on the internal SSD and the OS on the added one, but he's hoping to eventually kill the stupid thing by using most of it for the page file (it's 128 GB but ungodly slow; I have USB 3 flash drives with better write speeds that I got on sale at Newegg/Amazon).

I STRONGLY suggest compressing at least your games drive/folder. My god, the difference that can make on even fast systems!

Una Salus Victis

Posted 2012-04-12T17:12:10.953

Reputation: 9

Most games already compress their data, and that can't be compressed further – M.kazem Akhgary – 2019-05-31T11:53:14.130

-2

Windows compresses non-recently-used data in RAM, and with even SSDs being a fraction of the speed of RAM, I'd guess the performance hit is a non-issue. I'm more concerned about compressed blocks that develop a 1-2 bit error and can't recover some or all of the data, or a dictionary error in the worst case. Anything that produces a disk that's non-readable on alternate OSes and potentially lowers reliability isn't worth the extra speed it (might) bring, IMHO.

Video game texture pack files and such are usually already compressed, so I don't see how layering another set of compression on top will improve things. I'd like to see an OS that supports marking files for linear layout on the disk geometry so random r/w isn't used; it speeds things up even on SSDs for certain use cases.

My other problem with compression is that since images and movies are already compressed, as are MS Office docs and tons of other formats, you're stuck marking files as compressible and micromanaging it. For a Linux source tree or a large open source project it could help a lot, since compression is usually most effective on text files.

Guesty McGuesterson

Posted 2012-04-12T17:12:10.953

Reputation: 1

1To clarify the linear-on-geometry thing is primarily going to help HDDs, and files over 500MB or so. I've worked with 500GB TIFF files before that would benefit quite a bit on loading. Windows tries to do this but that relies on the drive being defragmented regularly and a proper cluster size that may not be optimal for other applications. To get around this mess, I use an SSD for boot, a 2TB drive for music / movie projects, another 2TB for RAW files from DSLRs, and a 4TB that stores huge amounts of source code that I could compress for a pretty big gain in space. – Guesty McGuesterson – 2018-01-10T00:30:48.903

1Ideally I'd like a second SSD just for "program files" to limit contention between system files and those, but windows tends to break attempts to set that up after a few weeks. I can make D:\Program Files or whatever, but too many programs are hardcoded to put crap on the system drive literally everywhere. – Guesty McGuesterson – 2018-01-10T00:33:41.323

-2

Microsoft Windows NTFS compression should not be used for anything other than log files or, generally speaking, text files or otherwise highly compressible files.

Consider this: historically, I have seen the performance of file compression stuck at 20-25 MiB/s. This is the normal speed for zipping a file with one 2.4-3.0 GHz processor. NTFS compression is not multithreaded. This is a huge problem!

Consider that a decent speed for a hard drive nowadays is 100 MiB/s. If you don't get around 4-5x compression, you are massively losing performance both in reading and in writing. This is what happens.

Alessandro Zigliani

Posted 2012-04-12T17:12:10.953

Reputation: 1