I have several thousand files of different revisions in a folder.

ACZ002-0.p
ACZ002-1.p
ACZ002-2.p
ACZ051-0.p
ACZ051-1.p
...

The revision is always the last digit before the dot. I need to preserve only the files with the latest revision, but I don't know how to proceed with my code.

$path = "E:\Export\"
$filter = [regex] "[A-Z]{3}[0-9]{3}\-[0-9]{1,2}\.(p)"
$files = Get-ChildItem -Path $path -Recurse | Where-Object {($_.Name -match $filter)}
HopelessN00b
Ales Novy

1 Answer

There is probably a better way to do this, but here is the first approach that came to mind. It uses named capture groups in the regex so that PowerShell can group by the base filename and sort by the revision. The final variable, $filesToKeep, will be an array of FileInfo objects that you can exclude from your delete command. Of course, I recommend lots of testing before actually deleting anything.

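# Anchored with ^ and $ so that names such as "ACZ002-0.pdf" are not matched.
$filter = [regex] '^(?<baseName>[A-Za-z]{3}[0-9]{3})-(?<revision>[0-9]+)\.p$'
$results = ls c:\temp -Recurse | where {$_.Name -match $filter} | foreach-object {
    new-object PSObject -Property @{
        BaseName = $matches.baseName
        # Cast to [int]: as strings, "9" would sort after "10".
        Revision = [int]$matches.revision
        File = $_
    }
}
# Highest revision first within each base name, then keep the first file of each group.
$filesToKeep = $results | sort basename, revision -Descending | group basename | ForEach-Object { $_.group | select -first 1 -ExpandProperty File}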
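
For the delete step itself, something along these lines might work. This is only a sketch: the function name Get-StaleRevision and its parameters are made up for illustration, and it assumes $filesToKeep holds FileInfo objects as described above. It only lists the files that would go; the actual removal is left as a commented-out dry run with -WhatIf.

```powershell
# Hypothetical helper (not part of the original answer): returns the files under
# $Path whose names match $Pattern but which are not in $FilesToKeep.
# Nothing is deleted here.
function Get-StaleRevision {
    param(
        [string] $Path,
        [string] $Pattern,
        [System.IO.FileInfo[]] $FilesToKeep
    )
    # Compare by full path so that equal names in different folders stay distinct.
    $keep = @($FilesToKeep | ForEach-Object { $_.FullName })
    Get-ChildItem -Path $Path -Recurse | Where-Object {
        -not $_.PSIsContainer -and
        $_.Name -match $Pattern -and
        $keep -notcontains $_.FullName
    }
}

# Dry run: -WhatIf only reports what Remove-Item would delete.
# Get-StaleRevision -Path 'c:\temp' -Pattern $filter -FilesToKeep $filesToKeep | Remove-Item -WhatIf
```

Once the -WhatIf output looks right, dropping that switch would perform the real delete.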
jbsmith
  • The PowerShell pipeline is a huge performance bottleneck. For large datasets, I'd suggest iterating over the `$results` variable with `foreach` and using an `if` statement for comparison. – Trevor Sullivan Nov 08 '12 at 19:11
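
The foreach/if approach suggested in the comment could be sketched roughly as follows. The function name Select-LatestRevision is hypothetical, and it assumes each entry in $results has the BaseName, Revision and File properties built in the answer. Whether it is actually faster on a given dataset has not been measured here.

```powershell
# Hypothetical helper illustrating the foreach/if idea; not from the original answer.
function Select-LatestRevision {
    param([object[]] $Results)

    # Base name -> entry with the highest revision seen so far.
    # Keys of a PowerShell hashtable literal are case-insensitive.
    $latest = @{}
    foreach ($r in $Results) {
        $current = $latest[$r.BaseName]
        # Compare as integers so that revision 10 beats revision 9.
        if ($null -eq $current -or [int]$r.Revision -gt [int]$current.Revision) {
            $latest[$r.BaseName] = $r
        }
    }

    # Emit only the file of each winning entry.
    foreach ($entry in $latest.Values) {
        $entry.File
    }
}

# Possible usage with the answer's variable:
# $filesToKeep = Select-LatestRevision -Results $results
```

This makes a single pass over the data and needs no sorting, since only the current maximum per base name is tracked.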