16
1
I need to remove duplicate lines from a text file. It is simple in Linux using
cat file.txt |sort | uniq
when file.txt contains
aaa
bbb
aaa
ccc
It will output
aaa
bbb
ccc
Is there a Windows equivalent? Or how can I do this the Windows way?
31
The Sort-Object cmdlet in PowerShell supports a -Unique switch that does the same thing as uniq:
Get-Content file.txt | Sort-Object -unique
Of course, owing to the presence of aliases in PowerShell, you can also write:
type file.txt | sort -unique
Additionally, there is an undocumented /unique switch in sort.exe on Windows 10, so this should work in Command Prompt:
type file.txt | sort /unique
1 I don't think the Windows command (sort.exe) supports this; it looks like a feature of the PowerShell builtin. – Ben Voigt – 2018-04-23T04:11:03.477
1 type unsorted.txt | sort -unique > sorted.txt really works under Windows 10 and writes the unique values to a new file. – Lixas – 2018-04-23T05:52:13.837
7 @BenVoigt Surprisingly, type file.txt | sort /unique works, using the undocumented /unique switch of the sort.exe utility (at least on Windows 10). On the other hand, you are right that the example provided, Get-Content file.txt | Sort-Object -unique, is in fact PowerShell. – JosefZ – 2018-04-23T05:57:56.087
@Lixas type unsorted.txt | sort -unique returns -uniqueThe system cannot find the file specified with errorlevel 1 if run from an open cmd prompt under Windows 10! – JosefZ – 2018-04-23T06:02:04.167
1 sort /unique errors with Invalid switch. on Windows 7 Enterprise. – Don Cruickshank – 2018-04-23T12:00:31.923
1 @JosefZ, the answer specifies the switch using "/" (forward slash) and not a dash; the forward slash is the Windows standard for commands in CMD, and not all commands allow substituting a dash for a slash on command switches. The quick reference at https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/windows-commands consistently shows slashes. The above was a great answer, sharing a tidbit not commonly known, though I can't imagine why the "/unique" switch is undocumented since it's so useful. – Debra – 2019-01-07T14:23:14.350
@Debra sure. Microsoft said in their archived Windows NT Command Shell article: command switches always begin with a slash / character… Occasionally, switches begin with a + or - character.
Ummm, yes, thank you Microsoft for stating "always", well, except when not so. – Debra – 2019-01-08T15:36:58.880
Well, what if the file is, for example, over 1 GB? – Cornea Valentin – 2020-01-23T23:44:13.123
6
There are ports of uniq that work identically to the GNU coreutils versions. I personally use the one from GOW, but Git for Windows has a significantly newer version. No Cygwin is required, though for the latter you need to look in /usr/bin.
Since these packages also contain cat, sort and uniq, your workflow should be mostly identical, and cat file.txt | sort | uniq should work much as it does on Linux.
2
You can easily write a "uniq" command yourself. Save this as a batch file "uniq.cmd" somewhere your %path% can find it (e.g. in %windir%\system32). This version is NOT case sensitive:
@echo off
setlocal DisableDelayedExpansion
set "prev="
for /f "delims=" %%F in ('sort %*') do (
  rem "set" needs to be done without delayed expansion
  set "line=%%F"
  setlocal EnableDelayedExpansion
  set "line=!line:<=<!"
  rem compare with the previous line, ignoring case
  if /i "!prev!" neq "!line!" echo(!line!
  endlocal
  rem set "prev" after endlocal; an assignment made before it would be discarded
  set "prev=%%F"
)
This works with "uniq mytextfile" as well as "cat mytextfile | uniq", since all input and arguments are simply passed to the sort command.
Starting with Windows 7, you may want a truly case-sensitive version (the differences are the undocumented switch "sort /C" and no "if /i"):
@echo off
setlocal DisableDelayedExpansion
set "prev="
for /f "delims=" %%F in ('sort /C %*') do (
  rem "set" needs to be done without delayed expansion
  set "line=%%F"
  setlocal EnableDelayedExpansion
  set "line=!line:<=<!"
  rem compare with the previous line, respecting case
  if "!prev!" neq "!line!" echo(!line!
  endlocal
  rem set "prev" after endlocal; an assignment made before it would be discarded
  set "prev=%%F"
)
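For comparison, the idea behind both batch files (sort the input, then print a line only when it differs from the one before it) can be sketched in Python. This is only an illustration, not a port of the batch file: the function name uniq_sorted and its case_sensitive parameter are made up for this example, and Python orders strings by code point (after casefold() in the case-insensitive mode) rather than with the collation used by sort.exe, so the output order may differ.

```python
from typing import Iterable, Iterator


def uniq_sorted(lines: Iterable[str], case_sensitive: bool = False) -> Iterator[str]:
    """Sort the lines, then yield each line that differs from its predecessor.

    Hypothetical helper for illustration; case_sensitive=False mirrors the
    "if /i" comparison, case_sensitive=True mirrors the plain "if".
    """
    # Key used both for sorting and for comparing neighbouring lines.
    key = (lambda s: s) if case_sensitive else str.casefold
    prev = None  # key of the previous line; None before the first line
    for line in sorted(lines, key=key):
        current = key(line)
        if current != prev:
            yield line
        prev = current


# The sample input from the question.
print(list(uniq_sorted(["aaa", "bbb", "aaa", "ccc"])))  # ['aaa', 'bbb', 'ccc']
```

To run it against a file, the lines could be read with something like open("file.txt").read().splitlines() and passed to the function; the whole file is held in memory, so this sketch does not address very large inputs.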
Nice, but it has some flaws. It currently fails with content like /?, ON, a single ^ caret or bang!. That can be solved by using the toggling delayed expansion technique and echo(; see Dostips: ECHO. FAILS to give text or blank line
Thanks, the reason for using the toggling delayed expansion technique had been neither obvious nor marked. I edited my examples to be (almost) perfect now. – Tom Stein – 2019-01-17T15:48:02.583
0
In addition to Yu Jiaao's answer: you can invoke the Sort-Object PowerShell cmdlet from a command prompt like this:
type file.txt | powershell -nop "$input | sort -unique"
10 On Unix, you could write it as sort -u file.txt – jfs – 2018-04-23T06:24:02.123
1 There is also WSL, which works pretty well as far as this sort of stuff goes – user2813274 – 2018-04-23T13:07:28.960
Maybe you want to set something as solution, if you don't have any further questions? – davidbaumann – 2018-05-09T14:28:56.170