Sort packs of lines alphabetically

Note: the complementary question is here: How to separately sort lines within multiple “chunks” separated with headers?


I did find an answer on how to sort the lines of a text file alphabetically, but that is not exactly what I need. I have a file of profiles, each containing 15 different parameters. The file is loaded into an instrument at work, and the machine reads it to build its list of profiles.

Sadly, the formatting of this file looks like this:

[ProfileB]
param1=z
param2=y
param3=x
[ProfileA]
param1=k
param2=l
param3=

And I want to sort the Profiles alphabetically, but I need them to stay grouped with their parameters. The above example should be sorted like this:

[ProfileA]
param1=k
param2=l
param3=
[ProfileB]
param1=z
param2=y
param3=x

I guess one could work either with the fixed number of lines per profile (name + parameters) or with the character "[" as a marker for the beginning of each group of lines.

But this is beyond my text-manipulation skills. I have Sublime Text, R, and the Linux command line at my disposal.

Paul Giroud

Posted 2018-06-01T10:36:24.010

Reputation: 33

Does the "instrument" require the profiles to be sorted? – glenn jackman – 2018-06-01T12:08:08.663

I hope somebody can help you but in the meantime you should learn an interpreted language, not as heavy as C, and not as fiddly as bash. Something like ruby or python or perl. – barlop – 2018-06-01T14:08:34.773

@glennjackman No, but if the file is not sorted, the profiles are loaded in the same order as they appear in the file. – Paul Giroud – 2018-06-04T08:51:34.403

@barlop I do have the basics of python and perl, and I am willing to use them (which is what I meant by access to the command console) – Paul Giroud – 2018-06-04T08:53:37.573

Answers

3

This works on my Debian system:

sed '1 ! s/^\[/\x00\[/g' | sort -z | tr -d "\0"

To work with file(s) use redirection(s), e.g. { sed … ; } <input.txt >output.txt, where sed … is the whole command.

The procedure is as follows:

  1. sed inserts a null character before every [ at the beginning of a line, except on the first line. This way null characters separate the profiles.
  2. sort -z uses these null characters as record separators, so it sorts whole profiles rather than individual lines.
  3. tr deletes the null characters.
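
For readers who would rather use Python, the same idea (mark where each profile starts, sort whole profiles, then glue them back together) can be sketched as below. This is only an illustration, not part of the pipeline above: the function name sort_profiles is hypothetical, it assumes every profile header begins a line with [, and it relies on re.split accepting zero-width matches (Python 3.7 or later). Python compares strings by code point, so the order may differ from a locale-aware sort for non-ASCII names.

```python
import re


def sort_profiles(text: str) -> str:
    """Sort whole profile blocks; lines inside a block keep their order."""
    # Split just before every "[" that starts a line. The zero-width match
    # plays the role of the null separators inserted by sed.
    blocks = re.split(r"^(?=\[)", text, flags=re.MULTILINE)
    # Anything before the first header (usually nothing) stays at the top.
    preamble, profiles = blocks[0], blocks[1:]
    # Make sure every block ends with a newline so they join cleanly.
    profiles = [b if b.endswith("\n") else b + "\n" for b in profiles]
    return preamble + "".join(sorted(profiles))
```

One way to use it as a filter would be sys.stdout.write(sort_profiles(sys.stdin.read())) after importing sys, with the same redirections as above.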

Kamil Maciorowski

Posted 2018-06-01T10:36:24.010

Reputation: 38 429

0

Here's a small Perl script that does the job:

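use strict;
use warnings;

my %profiles;    # profile name => array of its parameter lines
my $profile;     # name of the profile currently being read

while (<>) {
    chomp;
    # A header line looks like [ProfileName]
    if (/^\[(.+)\]/) {
        $profile = $1;
        $profiles{$profile} ||= [];    # keep profiles that have no parameters
        next;
    }
    next if !defined $profile;         # skip anything before the first header

    push @{ $profiles{$profile} }, $_;
}

# Print the profiles in alphabetical order, each with its lines sorted
foreach my $key (sort keys %profiles) {
    print "[$key]\n";
    foreach my $line (sort @{ $profiles{$key} }) {
        print "$line\n";
    }
}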

Save it into a file, sortProfiles.pl for example, and run:

perl sortProfiles.pl <inputFile.txt >outputFile.txt

How it works

  1. It reads the input file line by line (while (<>)).
  2. For each profile header, [profile], in the input file, it remembers the name in the $profile variable.
  3. It saves each line following the profile header in an array belonging to that profile.
  4. Then it sorts the keys of the %profiles hash and prints the profiles in that order.
  5. It also sorts the lines inside each array before printing them.

In this script, %profiles is a hash. Its keys are profile names, its values are arrays of lines.
Thus @{ $profiles{$profile} } is the array that stores lines for profile name in $profile variable.
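
The same data structure, profile names mapped to lists of lines, can be sketched in Python with a dict of lists. This is a rough illustration under stated assumptions rather than a translation of the script: the names parse_profiles, render_profiles and sort_params are hypothetical, and a header is assumed to be a whole line of the form [name].

```python
def parse_profiles(lines):
    """Map each profile name to the list of lines that follow its header."""
    profiles = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            profiles.setdefault(current, [])  # keep profiles with no lines
            continue
        if current is None:
            continue  # ignore anything before the first header
        profiles[current].append(line)
    return profiles


def render_profiles(profiles, sort_params=False):
    """Emit profiles sorted by name; optionally sort each profile's lines."""
    out = []
    for name in sorted(profiles):
        out.append(f"[{name}]")
        params = profiles[name]
        out.extend(sorted(params) if sort_params else params)
    return ("\n".join(out) + "\n") if out else ""
```

Leaving sort_params at False keeps each profile's lines in their original order, while setting it to True also sorts the lines inside each profile.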

Alexey Ivanov

Posted 2018-06-01T10:36:24.010

Reputation: 3 900

I think this answers both this question and the complementary one.

– Alexey Ivanov – 2018-06-07T22:06:16.493