The server hosts a service called MultiCash-Datenbank. For each user it keeps two cache files (SPASD32.SRC and SPASD32Z.SRC), which grow by ~1MB/day; a number of small data files are also added daily. I have been observing our networked backups for three months and noticed that the vhdx image of the partition holding this data keeps growing by 300-900MB/day. On a 1TB partition, the 7GB of data eventually ballooned into a 30GB vhdx file, and I had to take action.
Temporary solutions I discovered, in chronological order, before I had the idea to run DiskView:
- recreate the partition (moving the files back and forth consolidates them)
- shrink the partition (performs a free space consolidation step)
- cap the partition size to 10GB (caps the image size to 10GB)
- run manual defragmentation (the default scheduled defrag does nothing on 2012 R2!); see the commands below
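For reference, the manual pass boils down to something like this (D: stands in for the data partition; run from an elevated prompt):

```
# analyze first, then defragment and consolidate free space
defrag D: /A /V    # analyze only, verbose report
defrag D: /D /V    # traditional file defragmentation
defrag D: /X /V    # free space consolidation, which seems to be the step that matters here
# PowerShell equivalent on 2012 R2:
Optimize-Volume -DriveLetter D -Defrag -Verbose
```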
So. For some unknown reason, the clusters of these files are laid out on disk in a very unusual way:
Each 4k cluster is separated from the next by around 256 clusters (1MB) of free space, and the two files are interleaved most of the time. This pattern continues until it covers all available free space; only then, as the files grow further, do runs of multiple contiguous clusters become more frequent.
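For anyone who wants to reproduce the observation without DiskView, the extent map can also be dumped with fsutil on 2012 R2 (the path below is a placeholder for wherever the service keeps its data):

```
# dump the VCN->LCN extent map of one cache file (run elevated);
# each output line is one extent, so the fragment count and the
# ~256-cluster (1MB) gaps between consecutive LCNs are directly visible
fsutil file queryextents D:\MultiCash\SPASD32.SRC
```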
I have no idea whether this fragmentation is caused by the service's own write pattern or by some NTFS optimization mechanism. Fsutil reports that the files are not flagged as sparse. Contig reports around 3000 such fragments on this 10GB partition holding 7GB of data; at ~1MB of separation each, that is roughly 3GB of space spanned. This would make sense if the disk imaging process allocated a 1MB block whenever any data fell inside it: 3000 isolated clusters × 1MB per block ≈ 3GB of image growth. I have read that the vhdx format contains performance optimizations, so this could be one of them; if so, this layout would unfortunately be a worst-case scenario for it.
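As for the 1MB-block hypothesis: I haven't been able to verify it directly, but if the Hyper-V PowerShell module is available on some machine, the block size of the backup vhdx can be read out (the path is a placeholder; if I read the VHDX spec correctly, dynamic disks default to 32MB blocks, with 1MB as the minimum):

```
# requires the Hyper-V PowerShell module; the path is a placeholder
Get-VHD -Path 'D:\Backups\backup.vhdx' |
    Select-Object VhdFormat, VhdType, BlockSize, FileSize, Size
# BlockSize is reported in bytes; a value of 1048576 (1MB) would support the theory
```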
I'm also open to the possibility that I'm completely wrong and my observations are unrelated to the actual cause. One warning sign is that the inflated backups do not compress down to the same size as the optimized ones: for a 100% inflation in raw size, the compressed result is about 25% larger, which suggests the extra space is not just zero padding.
So in the end, I'm left with a partial understanding of the situation and some ugly workarounds. My questions: What is causing this fragmentation, and how can I make it stop? Does Windows Server Backup's vhdx format really use 1MB blocks, and if so, can that be changed?