I'm trying to understand how Measure-DedupFileMetadata works so I can recursively go through some folders to report on how much space is actually used. I don't know how to interpret the output.

If I understand the documentation correctly, DedupDistinctSize should tell me how much space is freed if I delete those files (after garbage collection). These numbers appear accurate for most of my folders. But on folders where no deduplication has taken place yet, it shows 0.

I'm also not sure how to understand the SizeOnDisk info.

Here is the output for two folders:

Path                    : {E:\veeam\folder1}
Volume                  : E:
VolumeId                : \\?\Volume{77da8d6d-1416-4d2a-8c85-75c91f980398}
FilesCount              : 19
OptimizedFilesCount     : 3
Size                    : 2.38 TB
SizeOnDisk              : 1.81 TB
DedupSize               : 491.38 GB
DedupChunkCount         : 6786488
DedupDistinctSize       : 475.59 GB
DedupDistinctChunkCount : 6561011

Path                    : {E:\veeam\folder2}
Volume                  : E:
VolumeId                : \\?\Volume{77da8d6d-1416-4d2a-8c85-75c91f980398}
FilesCount              : 18
OptimizedFilesCount     : 0
Size                    : 332.7 GB
SizeOnDisk              : 332.7 GB
DedupSize               : 0 B
DedupChunkCount         : 0
DedupDistinctSize       : 0 B
DedupDistinctChunkCount : 0
Dan Buhler

1 Answer

I ran some tests deduplicating various types of data, and my conclusion is that the space a folder actually uses is SizeOnDisk + DedupDistinctSize.

To make it look pretty and show the total in GB, use a calculated property:

Measure-DedupFileMetadata -Path e:\folder1 | Select Path, @{label="TotalGB"; expression={[math]::Round(($_.SizeOnDisk + $_.DedupDistinctSize) / 1GB, 0)}}
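
As a quick sanity check, the same formula can be applied by hand to the two folders from the question. This is only a sketch: the sizes are copied from the rounded values the cmdlet displayed, and `Get-UsedGB` is a hypothetical helper name, not part of the Deduplication module.

```powershell
# Hypothetical helper: applies SizeOnDisk + DedupDistinctSize and returns whole GB.
function Get-UsedGB {
    param(
        [double]$SizeOnDisk,        # bytes
        [double]$DedupDistinctSize  # bytes
    )
    [math]::Round(($SizeOnDisk + $DedupDistinctSize) / 1GB, 0)
}

# Figures taken from the question (already rounded by the cmdlet's display).
$folder1 = Get-UsedGB -SizeOnDisk (1.81TB)  -DedupDistinctSize (475.59GB)
$folder2 = Get-UsedGB -SizeOnDisk (332.7GB) -DedupDistinctSize 0

"folder1: $folder1 GB"   # 1853.44 GB + 475.59 GB, rounded to 2329
"folder2: $folder2 GB"   # nothing optimized yet, so just SizeOnDisk: 333
```

For folder2, where no files are optimized, DedupDistinctSize is 0 and the total is simply SizeOnDisk, so the formula still gives a usable figure for folders that have not been deduplicated yet.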

And here's how to script it and create a sorted table in a text file:

$Folders = @()
foreach ($folder in (Get-ChildItem -Path E:\ -Directory))
{
    Write-Host -NoNewline "Calculating $($folder.FullName): "
    $Result = Measure-DedupFileMetadata -ErrorAction Continue -Path $folder.FullName
    $Folders += $Result
    Write-Host $Result.DedupDistinctSize
}

$Folders |
    Select {$_.Path[0]},
        @{label='DedupDistinctSizeGB'; expression={[math]::Round($_.DedupDistinctSize / 1GB, 0)}},
        @{label='SizeOnDiskGB'; expression={[math]::Round($_.SizeOnDisk / 1GB, 0)}},
        @{label="TotalGB"; expression={[math]::Round(($_.SizeOnDisk + $_.DedupDistinctSize) / 1GB, 0)}} |
    Sort TotalGB -Descending |
    Format-Table -AutoSize |
    Out-File -FilePath 'Dedup_Summary.txt' -Append

The output looks like:

$_.Path[0]                                       DedupDistinctSizeGB SizeOnDiskGB TotalGB
----------                                       ------------------- ------------ -------
E:\veeam\xxxxxxx                                                3868         2178    6045
E:\veeam\xxxxx                                                   840         3712    4553
E:\veeam\xxx                                                     801         3244    4044
E:\veeam\xxxxxxxxxx                                              683         1213    1896
E:\veeam\xxxxxxxxxxxxxx                                           41         1636    1678
E:\StorageCraft\xxxxxxx                                         1537           56    1593

I'm not sure why this command is so slow, but it took more than a week to run for me on a 50 TB volume.

Dan Buhler