Extract, sort and save data from multiple files into one

0

I have multiple .sie files,

The SIE format is an open standard for transferring accounting data between different software produced by different software suppliers. of which I want to extract some information and create one big ordered file that is sorted by date. I am trying to add a tab between the columns.

Here is an example of the files content:

#FLAGGA 0
#PROGRAM "ISUPOS Kassa 3" 3.1.129
#FORMAT PC8
#GEN 20180119
#SIETYP 4
#FNAMN "Café"
#VER "" "1" 20180113 "Z-Dagrapport #1, Kassa #1 2018-01-13"
{
   #TRANS 1910 {} 819.00
   #TRANS 1920 {} 1334.00
   #TRANS 1930 {} 438.00
   #TRANS 2620 {} -277.61
   #TRANS 3052 {} -2313.39
}

This is from one file. Some files may or may not have all the #TRANS rows in it.

I would like the "big file" to have one row for each file e.g

2018-01-13    819     1334.00    438.00    -277.61    -2313.39

if one of the #TRANS records is missing, there should be a 0 (zero) instead in its place.

I have tried with sed and awk but I can't accomplish what I want. The next step is to import the big file to excel.

zaonline

Posted 2018-05-09T20:50:28.480

Reputation: 358

Please note that https://superuser.com is not a free script/code writing service. If you tell us what you have tried so far (include the scripts/code you are already using) and where you are stuck then we can try to help with specific problems. You should also read How do I ask a good question?.

– DavidPostill – 2018-05-09T20:51:27.650

What does "if one of the trans records is missing" mean? Is it only those 5 "keys" (1910,1920,1930,2620,3052)? – glenn jackman – 2018-05-09T21:23:07.287

@DavidPostill I am stuck with everything. I've managed to extract those fields I'm interested in from one file. But I can't manage to get the data from one file on one row and next file on the second row sorted and ordered by date. – zaonline – 2018-05-09T22:51:35.903

@glennjackman Some of the files have lesser records. e.g 1920, 1930, 2620, 3052 but not 1910. In those cases it should be a zero instead of the missing record. – zaonline – 2018-05-09T22:51:40.730

@DavidPostill is there a more suitable site for this kind of questions that you know of? – zaonline – 2018-05-09T23:10:36.103

Answers

1

This simple awk program will search in input file particular #TRANS numbers and put them into table. If one ore more #TRANS absent, then in will be zero.

awk '
BEGIN{ for(i=1;i<=6;i++){ o[i]=0 } }
/#VER/ { date=$4; o[1]=substr(date,0,4)"-"substr(date,5,2)"-"substr(date,7,2); }
/#TRANS 1910/{ o[2]=$4 }
/#TRANS 1920/{ o[3]=$4 }
/#TRANS 1930/{ o[4]=$4 } 
/#TRANS 2620/{ o[5]=$4 }
/#TRANS 3052/{ o[6]=$4 }
END{ for(i=1;i<=6;i++){ out=out o[i] " "; } print out }
' file

For many files you may use such code:

echo "DATE TRANS1910 TRANS1920 TRANS1930 TRANS2620 TRANS3052" >result.txt
for file in $(ls -1 *.sie); do  
    awk '
      BEGIN{ for(i=1;i<=6;i++){ o[i]=0 } }
      /#VER/ { date=$4; o[1]=substr(date,0,4)"-"substr(date,5,2)"-"substr(date,7,2); }
      /#TRANS 1910/{ o[2]=$4 }
      /#TRANS 1920/{ o[3]=$4 }
      /#TRANS 1930/{ o[4]=$4 } 
      /#TRANS 2620/{ o[5]=$4 }
      /#TRANS 3052/{ o[6]=$4 }
      END{ for(i=1;i<=6;i++){ out=out o[i] " "; } print out }
    ' ${file} >> result.txt
done

The result will be:

cat result.txt 
DATE TRANS1910 TRANS1920 TRANS1930 TRANS2620 TRANS3052
2018-01-13 819.00 1334.00 438.00 -277.61 -2313.39

Sasha Golikov

Posted 2018-05-09T20:50:28.480

Reputation: 178