Windows CLI: sort text file by column

1

I'd like to sort a text file by the 3rd column of data, not the 3rd character, using pure Windows CLI commands. The commands would be put into a batch file.

The data would look something like

#filename   seqno   phasename   a       b       c       scale   Rwp
blah_001.xye    1   corundum    3       3       12      0.001   3
blah_001.xye    1   silicon     5.4     5.4     5.4     0.002   3
blah_002.xye    2   corundum    3.1     3.1     12.1    0.002   3.5
blah_002.xye    2   silicon     5.41    5.41    5.41    0.004   3.5
blah_003.xye    3   corundum    3.2     3.2     12.2    0.001   3.1
blah_003.xye    3   silicon     5.42    5.42    5.42    0.002   3.1
blah_004.xye    4   corundum    3.4     3.4     12.2    0.001   3.2
blah_004.xye    4   silicon     5.43    5.43    5.43    0.002   3.2
blah_005.xye    5   corundum    3.5     3.5     12.3    0.001   3.9
blah_005.xye    5   silicon     5.44    5.44    5.44    0.002   3.9
blah_006.xye    6   corundum    3.6     3.6     12.3    0.001   3.1
blah_006.xye    6   silicon     5.45    5.45    5.45    0.002   3.1

I want to sort by phasename. If would be good if I could keep the first row in place. There will be more than 5 columns after the 3rd, and therer will be more rows than this, but the sorting would always be by the 3rd column.

I've seen this question, but I can't decypher the for commands.

masher

Posted 2018-08-01T13:21:10.730

Reputation: 183

I commend to your attention the extensive information at SS64 on the FOR command, or Microsoft's own documentation on the FOR command, or even the output of HELP FOR.

– Jeff Zeitlin – 2018-08-01T13:34:08.817

Answers

2

Your file uses fixed width format, so you always want to begin sorting at position 21.

The SORT command provides an option to perform each line comparison starting at a particular position in the line. The only shortcoming is it does not provide a mechanism to keep the header record at the top.

So the trick is to use SET /P to read the first line and write it to a new file. Then use MORE +1 to read all but the 1st line, and pipe that to SORT \+21 and append the result to the new file. The last thing to do is overwrite the original file with the new file with MOVE.

@echo off
setlocal enableDelayedExpansion
set "file=test.txt"
set /p "ln=" < "%file%"
>"%file%.new" (
  echo(!ln!
  more +1 "%file%" | sort /+21
)
move /y "%file%.new" "%file%" >nul

There are limitations to this strategy:

  • MORE will hang if the file exceeds 64k lines
  • MORE converts tabs into spaces
  • The header line must not exceed 1021 characters due to SET /P limitations.

Alternatively you can use my JSORT.BAT utility to easily and efficiently perform the sort with one simple command. The MOVE is still required.

@echo off
call test.txt /p 21 /s 1 /o test.txt.new
move /y test.txt.new test.txt >nul

JSORT is pure script (hybrid JScript/batch) that runs natively on any Windows machine from XP onward. No 3rd party exe file is needed.

Note that JSORT defaults to a case sensitive search, unlike the built in SORT command. Not an issue with your sample text, but if you want to ignore case, then simply append the /I option.

JSORT has a size limitation due to the fact that the entire file must fit into a single string variable in memory. I'm not sure about the exact value, but I believe the maximum supported file size is a bit less than 1GB.

Read the built in help to learn about some other nifty features that JSORT supports.

  • Sort embedded numbers as numbers instead of text
  • Sort based on a delimited token instead of a fixed position
  • Preserve only unique lines
  • Limit the number of output lines

dbenham

Posted 2018-08-01T13:21:10.730

Reputation: 9 212