Remove duplicate lines after first word in sentence (Linux terminal)

Try this one-liner:

cut -d " " -f 1 file_in.txt | uniq > file_out.txt

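First, cut splits each line on the space delimiter (-d " ") and keeps only the first field (-f 1) of file_in.txt. The result is piped into uniq, which removes repeated lines. Note that uniq only collapses duplicates that are adjacent, so if the same word can appear in non-consecutive lines, sort the list first (for example with sort -u in place of uniq). Finally, the output is redirected to file_out.txt.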
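As a rough illustration of how uniq treats adjacent versus non-adjacent duplicates, here is a small sketch. The sample input and the names file_sorted.txt and file_ordered.txt are made up for the example; only file_in.txt and file_out.txt come from the command above. The awk variant is an alternative technique, not part of the original one-liner.

```shell
# Hypothetical sample input: "apple" appears on non-adjacent lines.
printf 'apple pie\napple tart\nbanana split\napple crumble\n' > file_in.txt

# uniq only drops adjacent repeats, so the last "apple" survives.
cut -d " " -f 1 file_in.txt | uniq > file_out.txt

# Sorting first makes every duplicate adjacent (input order is lost).
cut -d " " -f 1 file_in.txt | sort -u > file_sorted.txt

# awk prints a word only the first time it is seen, preserving input order.
cut -d " " -f 1 file_in.txt | awk '!seen[$0]++' > file_ordered.txt
```

With this input, file_out.txt still contains apple twice, while the other two files contain each word once. The awk version keeps one entry per distinct word in memory, which may matter for inputs with a very large number of distinct first words.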

Adam

Posted 2014-09-19T22:42:18.840

Reputation: 1 510

Will this work on large files, gigabytes in size? Thanks for the reply, too. – mark – 2014-09-19T23:40:46.617

For very large files it might be worth doing this in two steps: first cut -d " " -f 1 file_in.txt > file_tmp.txt, then uniq file_tmp.txt > file_out.txt. That will help narrow down the issue if something fails. I don't know of any file size restrictions for either cut or uniq, so the only real way to find out is to test it. Neither command modifies the input file, so giving it a shot won't hurt. – Adam – 2014-09-21T16:08:14.950
