14

I'm trying to find differences in the content of two folder structures using Windows PowerShell. I have used the following method to ensure that the file names are the same, but it does not tell me whether the contents of the files are the same:

$firstFolder = Get-ChildItem -Recurse folder1
$secondFolder = Get-ChildItem -Recurse folder2
Compare-Object -ReferenceObject $firstFolder -DifferenceObject $secondFolder

The technique described in this ServerFault question works for diffing a single file, but these folders contain hundreds of files at a variety of depths.

The solution does not necessarily need to tell me what specifically in the files is different - just that they are. I am not interested in differences in metadata such as date, which I already know to be different.

David Smith

5 Answers

16

If you want to wrap the compare into a loop I would take the following approach:

$folder1 = "C:\Users\jscott"
$folder2 = "C:\Users\public"

# Get all files under $folder1, filter out directories
$firstFolder = Get-ChildItem -Recurse $folder1 | Where-Object { -not $_.PsIsContainer }

$firstFolder | ForEach-Object {

    # Check if the file, from $folder1, exists with the same path under $folder2
    If ( Test-Path ( $_.FullName.Replace($folder1, $folder2) ) ) {

        # Compare the contents of the two files...
        If ( Compare-Object (Get-Content $_.FullName) (Get-Content $_.FullName.Replace($folder1, $folder2) ) ) {

            # List the paths of the files containing diffs
            $_.FullName
            $_.FullName.Replace($folder1, $folder2)

        }
    }   
}

Note that this will ignore files which do not exist in both $folder1 and $folder2.
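A hash-based variant of the same loop, offered as a minimal sketch rather than a drop-in replacement. It assumes PowerShell 4.0 or later (for Get-FileHash, with -File needing 3.0). The function name Get-ChangedFile and its parameter names are hypothetical, chosen only for illustration. Hashing sidesteps two limits of the line-by-line approach: Compare-Object rejects the null that Get-Content returns for an empty file, and reading large binary files line by line is slow.

```powershell
# Hypothetical helper (name and parameters are illustrative, not from the answer above).
# Emits the relative path of every file that exists under both roots but differs in content.
function Get-ChangedFile {
    param(
        [Parameter(Mandatory)] [string] $ReferencePath,
        [Parameter(Mandatory)] [string] $DifferencePath
    )

    # Resolve both roots to full paths so the prefix length is reliable
    $root1 = (Get-Item -LiteralPath $ReferencePath).FullName
    $root2 = (Get-Item -LiteralPath $DifferencePath).FullName

    Get-ChildItem -LiteralPath $root1 -Recurse -File | ForEach-Object {
        # Cut the root off by length, then drop any leading path separator
        $relative = $_.FullName.Substring($root1.Length).TrimStart([char[]]'\/')
        $other = Join-Path -Path $root2 -ChildPath $relative

        # Like the loop above, skip files that are missing from the second folder
        if (Test-Path -LiteralPath $other -PathType Leaf) {
            $hash1 = (Get-FileHash -LiteralPath $_.FullName).Hash
            $hash2 = (Get-FileHash -LiteralPath $other).Hash
            if ($hash1 -ne $hash2) { $relative }
        }
    }
}
```

Calling Get-ChangedFile -ReferencePath C:\Folder1 -DifferencePath C:\Folder2 (placeholder paths) would list the relative paths of the files whose contents differ. Get-FileHash uses SHA256 unless told otherwise, which is more than enough to detect "different or not".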

jscott
5

I have taken jscott's answer and expanded it to also report files that are present in one folder but not the other, for those who are interested in that functionality. It also shows progress, because with huge folders and very few differences it was hard to tell that the script was still working; it looked hung to me. Here is the PowerShell code:

$folder1 = "C:\Folder1"
$folder2 = "C:\Folder2"

# Get all files under $folder1, filter out directories
$firstFolder = Get-ChildItem -Recurse $folder1 | Where-Object { -not $_.PsIsContainer }

$failedCount = 0
$i = 0
$totalCount = $firstFolder.Count
$firstFolder | ForEach-Object {
    $i = $i + 1
    Write-Progress -Activity "Searching Files" -Status "Searching file $i of $totalCount" -PercentComplete ($i / $firstFolder.Count * 100)
    # Check if the file, from $folder1, exists with the same path under $folder2
    If ( Test-Path ( $_.FullName.Replace($folder1, $folder2) ) ) {
        # Compare the contents of the two files...
        If ( Compare-Object (Get-Content $_.FullName) (Get-Content $_.FullName.Replace($folder1, $folder2) ) ) {
            # List the paths of the files containing diffs
            # Cut the root off by length; TrimStart() treats its argument as a set of characters
            $fileSuffix = $_.FullName.Substring($folder1.Length)
            $failedCount = $failedCount + 1
            Write-Host "$fileSuffix is on each server, but does not match"
        }
    }
    else
    {
        # Cut the root off by length; TrimStart() treats its argument as a set of characters
        $fileSuffix = $_.FullName.Substring($folder1.Length)
        $failedCount = $failedCount + 1
        Write-Host "$fileSuffix is only in folder 1"
    }
}

$secondFolder = Get-ChildItem -Recurse $folder2 | Where-Object { -not $_.PsIsContainer }

$i = 0
$totalCount = $secondFolder.Count
$secondFolder | ForEach-Object {
    $i = $i + 1
    Write-Progress -Activity "Searching for files only on second folder" -status "Searching File  $i of $totalCount" -percentComplete ($i / $secondFolder.Count * 100)
    # Check if the file, from $folder2, exists with the same path under $folder1
    If (!(Test-Path($_.FullName.Replace($folder2, $folder1))))
    {
        # Cut the root off by length; TrimStart() treats its argument as a set of characters
        $fileSuffix = $_.FullName.Substring($folder2.Length)
        $failedCount = $failedCount + 1
        Write-Host "$fileSuffix is only in folder 2"
    }
}
Mark Henderson
helios456
  • Why would you extract the actual contents of a file when calling Compare-Object? That's what Compare-Object does. https://technet.microsoft.com/en-us/library/ee156812.aspx?f=255&MSPPError=-2147217396 – Casper Leon Nielsen Dec 08 '15 at 14:51
  • I know it's an older post, but I ran into this while searching for an answer. This is a great answer, but the folders I'm comparing are huge and one of them is on OneDrive. If I put the Get-ChildItem output in a variable, not only does it take an insane amount of time, it also tries to download all the files from OneDrive. Is there a better way to compare two folders on the fly? I tried piping but had no luck. – Besiktas Sep 10 '20 at 22:12
1

You just wrap a loop around the correct answer from your linked question that already answered this, and walk the directory tree comparing every file with the same name.

/Edit : If that's actually your question, it's more appropriate for SO, where you seem to be a regular contributor. You're asking a programming question. I understand you're doing it for a sysadmin-type of purpose, in which case, I would tell you to use a purpose-built tool like WinDiff.

mfinni
  • Can you please demonstrate how? If I knew how to loop through the files in the linked question, I wouldn't have asked this. – David Smith Aug 19 '13 at 16:43
  • 2
    OK, this site is not appropriate for "Give me the codez" type questions. If you need to get started on learning how to do loops in Powershell, buy a book or find online tutorials; there are many. – mfinni Aug 19 '13 at 16:49
  • 3
    I think you woke up on the wrong side of the bed. I am not a regular Powershell user. I have demonstrated in my question both a technique that I am currently attempting, and a link to a question that has additional helpful information for my problem. I do not know how to combine the two techniques, which is my problem, and the reason I have asked this question. It is also a question which I have been unable to answer using Google searches. If you aren't going to be helpful, please consider deleting your answer. – David Smith Aug 19 '13 at 16:53
  • 2
    @BigDave [Literally *the first result* for `PowerShell loop through files` on Google](http://stackoverflow.com/questions/1523043/how-to-loop-through-files-and-rename-using-powershell). c'mon now - a *little* effort on your part? – voretaq7 Aug 19 '13 at 17:01
  • 3
    @voretaq7 I think you guys misunderstand me as a PowerShell user. I did that google search, attempted that technique, and did not succeed. I tried to explain where I'm at in the question above. The question you link to works on a single set of files. I have two sets that I need to compare, name-to-name. I'm really not trying to be lazy here. I know how to loop, and I know how to compare. How do I loop, and compare two sets? – David Smith Aug 19 '13 at 17:13
  • @mfinni Sorry - I'll close and move to SO. – David Smith Aug 19 '13 at 17:17
  • 1
    I think that doing this with PS is a great little project. However, WinMerge (thought it was WinDiff, silly me) is really a great tool if you're going to be doing this very often. It's literally built for the job. Try it, the download is free. It's got decomposers for most file types, and does a great job of highlighting, including options for how you want to handle whitespace. – mfinni Aug 19 '13 at 17:26
1

Do this:

compare (Get-ChildItem D:\MyFolder\NewFolder) (Get-ChildItem \\RemoteServer\MyFolder\NewFolder)

And even recursively:

compare (Get-ChildItem -r D:\MyFolder\NewFolder) (Get-ChildItem -r \\RemoteServer\MyFolder\NewFolder)

And it is even hard to forget :)
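The one-liners above compare the directory entries themselves, not what is inside the files. One way to bring content into the same Compare-Object call is to compare (relative path, hash) pairs. The following is a sketch under stated assumptions: PowerShell 4.0 or later for Get-FileHash, and two hypothetical helper names, Get-FolderHash and Compare-FolderContent, that are not part of PowerShell.

```powershell
# Hypothetical helper: one object per file, holding its path relative to the root and a content hash.
function Get-FolderHash {
    param([Parameter(Mandatory)] [string] $Path)

    $root = (Get-Item -LiteralPath $Path).FullName
    Get-ChildItem -LiteralPath $root -Recurse -File | ForEach-Object {
        [PSCustomObject]@{
            RelativePath = $_.FullName.Substring($root.Length).TrimStart([char[]]'\/')
            Hash         = (Get-FileHash -LiteralPath $_.FullName).Hash
        }
    }
}

# Hypothetical wrapper: compares two folders on RelativePath and Hash together.
# Assumes each folder holds at least one file, because Compare-Object rejects empty input.
function Compare-FolderContent {
    param(
        [Parameter(Mandatory)] [string] $ReferencePath,
        [Parameter(Mandatory)] [string] $DifferencePath
    )

    $left  = @(Get-FolderHash -Path $ReferencePath)
    $right = @(Get-FolderHash -Path $DifferencePath)
    Compare-Object -ReferenceObject $left -DifferenceObject $right -Property RelativePath, Hash
}
```

With the paths from this answer, the call would be Compare-FolderContent D:\MyFolder\NewFolder \\RemoteServer\MyFolder\NewFolder. A file that exists on only one side appears once, with a SideIndicator of <= or =>. A file whose content differs appears twice, once per side, because its hash differs while its relative path matches.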

0

The following function recursively checks multiple folders (though two at a time) for deletions (files only in the former), additions (files only in the latter), AND changes (where files sharing a name have different content):

'folder1','folder2' | DiffFolders

The function:

Function DiffFolders {
    Begin {
        $last = $NULL
    }
    Process {
        $current = @{}
        $unchanged = 0
        
        $parent = $_
        $parentPath = (Get-Item -Path $parent).FullName
        $parentRegex = "^$([regex]::escape($parentPath))"
        Get-ChildItem -Path $parentPath -Recurse -File `
        | %{
            $name = $_.FullName -replace $parentRegex,''
            $current.Add($name, (Get-FileHash -LiteralPath $_.FullName).Hash)
            
            if (!$last) {
                return
            }
            
            if (!$last.Contains($name)) {
                [PSCustomObject]@{
                    parent = $parent
                    event = 'Added'
                    value = $name
                }
                return
            }
            
            if ($last[$name] -eq $current[$name]) {
                ++$unchanged
            }
            else {
                [PSCustomObject]@{
                    parent = $parent
                    event = 'Changed'
                    value = $name
                }
            }
            $last.Remove($name)
        }
        
        if ($last) {
            [PSCustomObject]@{
                parent = $parent
                event = 'Unchanged'
                value = $unchanged
            }
            $last.Keys `
            | %{
                [PSCustomObject]@{
                    parent = $parent
                    event = 'Deleted'
                    value = $_
                }
            }
        }
        
        $last = $current
    }
}

Here's a neat demo that should work on most Windows 10 machines:

PS C:\Program Files\WindowsApps> gci 'Microsoft.NET.Native.Runtime.*_x64__8wekyb3d8bbwe' | %{ $_.Name }
Microsoft.NET.Native.Runtime.1.7_1.7.25531.0_x64__8wekyb3d8bbwe
Microsoft.NET.Native.Runtime.1.7_1.7.27422.0_x64__8wekyb3d8bbwe
Microsoft.NET.Native.Runtime.2.1_2.1.26424.0_x64__8wekyb3d8bbwe
Microsoft.NET.Native.Runtime.2.2_2.2.27011.0_x64__8wekyb3d8bbwe
Microsoft.NET.Native.Runtime.2.2_2.2.28604.0_x64__8wekyb3d8bbwe
PS C:\Program Files\WindowsApps> gci 'Microsoft.NET.Native.Runtime.*_x64__8wekyb3d8bbwe' | %{ $_.Name } | DiffFolders | Out-GridView

We can see exactly at which versions files were added to and removed from the .NET runtime, and which were changed.
For brevity, unchanged files aren't listed individually, only counted (usually you'll have way more unchanged than changed files, I imagine).


It also works on Linux, for those of you running PowerShell there :)

For those curious, the unchanged files were clrcompression.dll, logo.png, logo.png, logo.png, and logo.png.

Hashbrown