Is there a Windows equivalent to the Unix uniq?

16

1

I need to remove duplicate lines from a text file. This is simple in Linux using

cat file.txt |sort | uniq

when file.txt contains

aaa
bbb
aaa
ccc

It will output

aaa
bbb
ccc

Is there a Windows equivalent, or how can I do this in a Windows way?

Yu Jiaao

Posted 2018-04-23T00:35:35.233

Reputation: 543

10 On Unix, you could write it as sort -u file.txt – jfs – 2018-04-23T06:24:02.123

1 There is also WSL, which works pretty well for this sort of thing – user2813274 – 2018-04-23T13:07:28.960

Maybe you want to mark one of the answers as the solution, if you don't have any further questions? – davidbaumann – 2018-05-09T14:28:56.170

Answers

31

The Sort-Object cmdlet in PowerShell supports a -Unique switch that does the same thing as uniq:

Get-Content file.txt | Sort-Object -unique

Of course, owing to the presence of aliases in PowerShell, you can also write:

type file.txt | sort -unique

Additionally, sort.exe in Windows 10 has an undocumented /unique switch, so this should work in Command Prompt:

type file.txt | sort /unique

Yu Jiaao

Posted 2018-04-23T00:35:35.233

Reputation: 543

1 I don't think the Windows command (sort.exe) supports this; it looks like a feature of the PowerShell builtin. – Ben Voigt – 2018-04-23T04:11:03.477

1 type unsorted.txt | sort -unique > sorted.txt This really works under Windows 10 and wrote the unique values to a new file – Lixas – 2018-04-23T05:52:13.837

7 @BenVoigt Surprisingly, type file.txt | sort /unique works with the undocumented /unique switch of the sort.exe utility (at least on Windows 10). On the other hand, you are right that the provided example, Get-Content file.txt | Sort-Object -unique, is in fact PowerShell. – JosefZ – 2018-04-23T05:57:56.087

@Lixas type unsorted.txt | sort -unique returns -uniqueThe system cannot find the file specified with errorlevel 1 if run from an open cmd prompt under Windows 10! – JosefZ – 2018-04-23T06:02:04.167

1 sort /unique fails with Invalid switch. on Windows 7 Enterprise. – Don Cruickshank – 2018-04-23T12:00:31.923

1

@JosefZ, the answer specifies the switch using "/" (forward slash), not a dash. The forward slash is the Windows standard for commands in CMD, and not all commands allow substituting a dash for a slash on command switches. The quick reference at https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/windows-commands consistently shows slashes. The above was a great answer, sharing a tidbit not commonly known, though I can't imagine why the "/unique" switch is undocumented since it's so useful.

– Debra – 2019-01-07T14:23:14.350

@Debra sure. Microsoft said in their archived Windows NT Command Shell article: command switches always begin with a slash / character… Occasionally, switches begin with a + or - character.

– JosefZ – 2019-01-07T18:51:40.320

Ummm, yes, thank you Microsoft for stating "always", well, except when not so. – Debra – 2019-01-08T15:36:58.880

Well, what if the file is, for example, over 1 GB? – Cornea Valentin – 2020-01-23T23:44:13.123

6

There are ports of uniq that work identically to the GNU coreutils versions. I personally use the one from GOW, but Git for Windows has a significantly newer version. No Cygwin is required, though for the latter you need to look in /usr/bin.

Since these packages also contain cat, sort and uniq, your workflow should be mostly identical, and cat file.txt | sort | uniq should work the same way.
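As a rough sketch of what those ported tools do, the commands below run the pipeline from the question on the sample data. This assumes a POSIX-style shell such as the Git for Windows bash; the variable names are only illustrative, and the sample lines are generated inline rather than read from file.txt.

```shell
# Sample input from the question, generated inline instead of read from file.txt.
input=$(printf 'aaa\nbbb\naaa\nccc\n')

# uniq alone only collapses adjacent duplicates, so the second "aaa" survives.
unsorted=$(printf '%s\n' "$input" | uniq)

# Sorting first makes duplicates adjacent; this is the pipeline from the question.
sorted=$(printf '%s\n' "$input" | sort | uniq)

# sort -u combines both steps in a single command.
short=$(printf '%s\n' "$input" | sort -u)

printf '%s\n' "$sorted"
```

The sort step matters here: without it, uniq leaves the sample data unchanged because the two aaa lines are not next to each other.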

Journeyman Geek

Posted 2018-04-23T00:35:35.233

Reputation: 119 122

2

You can easily write a "uniq" command yourself. Save this as a batch file named "uniq.cmd" in a directory listed in your %path% (e.g. %windir%\system32) so that it can be found. This version is NOT case sensitive:

@echo off
setlocal DisableDelayedExpansion
set "prev="
for /f "delims=" %%F in ('sort %*') do (
    rem "set" needs to be done without delayed expansion
    set "line=%%F"
    setlocal EnableDelayedExpansion
        set "line=!line:<=<!"
        if /i "!prev!" neq "!line!" echo(!line!
    endlocal
    rem "prev" is set after endlocal, because endlocal discards values set inside
    set "prev=%%F"
)

This works with "uniq mytextfile" as well as "cat mytextfile | uniq", since all input and arguments are simply passed to the sort command.

Starting with Windows 7, you may want a truly case-sensitive version (the differences are the undocumented switch "sort /C" and no "if /i"):

@echo off
setlocal DisableDelayedExpansion
set "prev="
for /f "delims=" %%F in ('sort /C %*') do (
    rem "set" needs to be done without delayed expansion
    set "line=%%F"
    setlocal EnableDelayedExpansion
        set "line=!line:<=<!"
        if "!prev!" neq "!line!" echo(!line!
    endlocal
    rem "prev" is set after endlocal, because endlocal discards values set inside
    set "prev=%%F"
)

Tom Stein

Posted 2018-04-23T00:35:35.233

Reputation: 29

Nice, but it has some flaws. It currently fails with content like /?, ON, one ^ caret or bang!. But that can be solved by using the toggling delayed expansion technique and echo( see: Dostips: ECHO. FAILS to give text or blank line

– jeb – 2019-01-14T10:13:43.303

Thanks, the reason for using the toggling delayed expansion technique had been neither obvious nor marked. I edited my examples to be (almost) perfect now. – Tom Stein – 2019-01-17T15:48:02.583

0

An addition to Yu Jiaao's answer: you can invoke the Sort-Object PowerShell cmdlet from a command prompt like this:

type file.txt | powershell -nop "$input | sort -unique"

snipsnipsnip

Posted 2018-04-23T00:35:35.233

Reputation: 101