Deleting a list of files with non-standard file names

2

I recently had to rescue files from a bad HDD and am in the middle of cleaning up the mess. Currently I'm using a duplicate cleaner to delete any restored files that have duplicates on my backup. I'm comparing contents, not file names, as the file names are mostly generated by the rescue program.

Unfortunately, some of the files I need to delete are given file names that contain all sorts of characters like %, @, ; and other stuff that causes issues. My dupechecker gets stuck when trying to delete the files, showing no progress for hours trying to discover items to delete.

So I exported the list of files to delete and turned to PowerShell to delete them. It deletes some, but soon fails when it encounters disruptive characters in file names.

My PowerShell command:

Get-Content d:\dupelist3.txt | Remove-Item

An excerpt of the file list (added CR to improve readability):

G:\HE12 #2 recovery EaseUS\Recovered data 09-06 09_23_00\1 HE12 2 (F) NTFS\Other lost files---[100%]--[All-files-CRC-OK]--[16-files]-

"G:\HE12 #2 recovery EaseUS\Recovered data 09-06 09_23_00\1 HE12 2 (F) NTFS\Other lost files\FTP_SERVER.LOG;6"

"G:\HE12 #2 recovery EaseUS\Recovered data 09-06 09_23_00\1 HE12 2 (F) NTFS\Other lost files\GOPHER.$5516417292;1"

"G:\HE12 #2 recovery EaseUS\Recovered data 09-06 09_23_00\1 HE12 2 (F) NTFS\Other lost files\GOPHERRC.;1"

G:\HE12 #2 recovery EaseUS\Recovered data 09-06 09_23_00\1 HE12 2 (F) NTFS\Other lost files\listener[1].htm

"G:\HE12 #2 recovery EaseUS\Recovered data 09-06 09_23_00\1 HE12 2 (F) NTFS\Other lost files\TELNET.LOG;1"

The file list was originally a CSV file, which I processed in Excel to extract only the paths and file names from my dupechecker's export. As a result, some file references were wrapped in double quotes when I exported the (tab-delimited) text file from Excel. I notice that the first item in the list above is rendered incorrectly in this post: there should be a backslash after "Other lost files" and before the three dashes that follow. I guess that illustrates the issue somehow :)

So my question is: how can I delete all the files in my list given these complications? Manual editing won't work, as there are 100k+ files in the list and I have multiple lists.

I'm open to using other tools as long as they get the job done...

Best regards,

Steinar

Fossie

Posted 2019-09-11T07:52:16.743

Reputation: 23

Can your duplicate-removal tool be set to use short (8.3) filenames? – Akina – 2019-09-11T07:59:34.577

Well, no. It uses whatever it thinks the filename is, or else it names them FILE32589 etc. It seems to me that most of the problematic file names are actually correct in the sense that those are the original file names. E.g. the "TELNET.LOG;1" is from ancient times at my university which was using VAX. – Fossie – 2019-09-11T08:24:37.487

If so, you could try renaming the files with a CMD/PS script, replacing the problematic symbols with safe ones (an underscore, for example), or removing them altogether along with, perhaps, the tail. – Akina – 2019-09-11T08:30:46.473

You could try Get-Content d:\dupelist3.txt | % {Remove-Item -LiteralPath $_ -WhatIf} (remove the WhatIf to actually execute the removal) – Lieven Keersmaekers – 2019-09-11T08:43:34.710

Thanks for your suggestions. Lieven's comment prompted me to look further into the Remove-Item documentation and the use of LiteralPath. I found an example of just what I was looking for, but with slightly different code: "Remove-Item -LiteralPath $_.Name". Do you know what the difference is between the two versions? I'm not overly familiar with PS, and no detailed explanation was available in the docs. – Fossie – 2019-09-11T08:55:12.730

$_.Name most likely comes from a Get-ChildItem. The object passed through the pipe is then a file object, which has properties like Name. Your input comes from Get-Content, so each item is a plain string with no Name property. – Lieven Keersmaekers – 2019-09-11T09:23:15.313

Great! Thanks again man! It seems your solution worked very well, would you mind posting it as an answer so I can accept it and upvote? :) – Fossie – 2019-09-11T09:48:05.063

Done, and you're welcome ;) – Lieven Keersmaekers – 2019-09-11T09:54:52.960

Answers

1

You could try

Get-Content d:\dupelist3.txt | % {Remove-Item -LiteralPath $_ -WhatIf}

Remove the -WhatIf to actually execute the deletion.

From the Remove-Item documentation:

-LiteralPath: Specifies a path to one or more locations. The value of LiteralPath is used exactly as it is typed. No characters are interpreted as wildcards. If the path includes escape characters, enclose it in single quotation marks. Single quotation marks tell PowerShell not to interpret any characters as escape sequences.
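
One thing that may be worth guarding against: the list in the question has some lines wrapped in double quotes by the Excel export. Because -LiteralPath uses the value exactly as given, a line that still carries its quotes would probably not match any file. Below is a minimal sketch, not part of the original answer, that trims those quotes before deleting. The function name Remove-ListedFile and its -ListPath parameter are made up for illustration. The sketch assumes one full path per line, and that trimming is safe because Windows file names cannot contain a double quote.

```powershell
# Hypothetical helper, not part of the original answer.
# Assumes the list file holds one full path per line, optionally wrapped in double quotes.
function Remove-ListedFile {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        [Parameter(Mandatory)]
        [string]$ListPath
    )

    # Step 1: trim whitespace, then the double quotes Excel added around some fields.
    # Step 2: skip lines that are empty after trimming.
    # Step 3: delete with -LiteralPath so [ ] and similar characters are not read as wildcards.
    # -WhatIf or -Confirm passed to this function is inherited by Remove-Item.
    Get-Content -LiteralPath $ListPath |
        ForEach-Object { $_.Trim().Trim('"') } |
        Where-Object { $_ } |
        ForEach-Object { Remove-Item -LiteralPath $_ }
}

# Dry run first (drop -WhatIf to actually delete):
#   Remove-ListedFile -ListPath 'D:\dupelist3.txt' -WhatIf
```

Lines that do not point at an existing file, such as the first entry of the excerpt in the question, should simply produce an error message while the rest of the list continues to be processed.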

Lieven Keersmaekers

Posted 2019-09-11T07:52:16.743

Reputation: 1 088

0

If you can attack it with wildcards, you might find this useful. Tested with some pretty odd characters that all seem to work. I added Set-Location to avoid running in the wrong folder. Remove -WhatIf when ready for prime time.

Set-Location 'D:\Trashcan'
$junk = '*.@@@', '*;*', 'trash[0-9].log'
$files = Get-ChildItem * -Include $junk -Recurse
$files | Remove-Item -WhatIf
$files.Count

I use a similar script to clean up a set of folders that routinely collects files I don't want.

Allen Jackson

Posted 2019-09-11T07:52:16.743

Reputation: 96