Linux: best option for generating a new file with a specific structure from a text file

Question

On my Linux machine I have the file orig-file.txt.

The file currently has 4 fields, but there could be fewer or more (the file is generated by another application).

I need advice: what is the best way to convert orig-file.txt into output-file.txt (a shell script, awk, etc.)?

The goal is to convert orig-file.txt into output-file.txt as in my example below, keeping in mind that the number of fields in orig-file.txt can change.

What is the best way to do that? (I would be happy to get a real example.)

orig-file.txt

CREATE_TIMESTAMP              TELEPHONE_NUMBER             ID TYPE
-------------------           -------------------- ---------- -----------------
24-09-2009 16:17:45           33633333333                  20 other_mmm_phone
24-09-2009 17:45:07           33644444444                  20 other_mmm_phone
07-10-2009 10:45:49           12312312312                  20 legacyphone
07-10-2009 11:46:38           59320000043                  20 other_mmm_phone

output-file.txt

CREATE_TIMESTAMP -> 24-09-2009 16:17:45
TELEPHONE_NUMBER -> 33633333333
ID               -> 20
TYPE             -> other_mmm_phone



---



CREATE_TIMESTAMP -> 24-09-2009 16:17:45
TELEPHONE_NUMBER -> 33633333333
ID               -> 20
TYPE             -> other_mmm_phone

---

Update: I tried this AWK solution, but it does not work:

     awk 'FNR == 1 {

        for (i = 1; i <= NF; i++) {
            header[i] = $i
        }
     FNR > 2 {
        for (i = 1; i<= NF; i++) {
            print header[i], "->", $i
        }
        printf "\n\n\n%s\n\n\n", "--------"
     }'    output.csv

 awk: syntax error near line 5
 awk: illegal statement near line 5

Your example for `output-file.txt` is *not* a CSV file (CSV stands for comma-separated values, although semicolons or tabs are often used as separators). The `orig-file` is much more like a CSV file. – Sven Apr 17 '12 at 14:32
Wouldn't it be easier to do it through a database? Or have Perl parse it into whatever output you need. – alexus Apr 17 '12 at 20:13
The awk script is almost complete, so why not use it? Maybe it's a simple problem. – yael Apr 17 '12 at 21:00

score 1 · Answer 1 · answered Apr 17 '12 at 16:51

What the best option to do that.

The tool that you already know will probably be the best. If you are familiar with awk, then awk is fine. If you are familiar with Perl, Python, Ruby, or whatever, then one of those may be good. For what appears to be a trivial programming task, pick your favorite tool.

answered Apr 17 '12 at 16:51

Zoredache

Yes, you are right about this. – yael Apr 17 '12 at 20:16

score 0 · Answer 2 · answered Apr 17 '12 at 14:53

If I had to do this, I would use a Perl script.

Read the first line of the input file and keep it as the header.
Skip one line.
Continue reading one line at a time. For each line:
1. Split the line on the defined delimiter using split.
2. Print the returned values along with the header fields in the required format.

You need either a fixed delimiter between the fields, such as \t, or fixed-length fields to be able to split the fields reliably.
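
The fixed-length case can also be sketched in awk, the tool the asker already uses, instead of Perl. This is only a minimal sketch, assuming that the dashes on line 2 mark where each column starts and how wide it is, and that header names and values stay inside those bounds. The file names sample.txt and sample-out.txt and the helper trim are made up for illustration; the first few lines only recreate two rows of the question's sample.

```shell
# Recreate the header, the dashes line and two rows of the question's sample.
# Column widths 19, 20, 10 and 17 mirror orig-file.txt; sample.txt is a made-up name.
row='%-19.19s           %-20.20s %10.10s %-17.17s\n'
dashes='------------------------'   # longer than the widest column; printf trims it per column
{
    printf "$row" CREATE_TIMESTAMP TELEPHONE_NUMBER ID TYPE
    printf "$row" "$dashes" "$dashes" "$dashes" "$dashes"
    printf "$row" '24-09-2009 16:17:45' 33633333333 20 other_mmm_phone
    printf "$row" '07-10-2009 10:45:49' 12312312312 20 legacyphone
} > sample.txt

awk '
# Hypothetical helper: strip leading and trailing blanks from a string.
function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

# Line 1: keep the raw header line until the column layout is known.
FNR == 1 { headline = $0; next }

# Line 2: every run of dashes gives one column its start position and width.
FNR == 2 {
    rest = $0; offset = 0; ncols = 0; maxlen = 0
    while (match(rest, /-+/)) {
        ncols++
        start[ncols] = offset + RSTART
        width[ncols] = RLENGTH
        offset += RSTART + RLENGTH - 1
        rest = substr(rest, RSTART + RLENGTH)
    }
    for (i = 1; i <= ncols; i++) {
        name[i] = trim(substr(headline, start[i], width[i]))
        if (length(name[i]) > maxlen) maxlen = length(name[i])
    }
    fmt = "%-" maxlen "s -> %s\n"    # pad names so the arrows line up
    next
}

# Data lines: cut each value by position, so embedded spaces survive.
NF {
    for (i = 1; i <= ncols; i++)
        printf fmt, name[i], trim(substr($0, start[i], width[i]))
    print "\n---\n"
}
' sample.txt > sample-out.txt

cat sample-out.txt
```

Because the values are cut by position, the space inside the timestamp is harmless, and the number of columns follows whatever the dashes line contains. The sketch uses a user-defined function, so on Solaris it would likely need nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk.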

answered Apr 17 '12 at 14:53

Khaled

What about awk? It seems to me a fine language to work with. – yael Apr 17 '12 at 15:04
A simple split on headers/delimiter may be a problem here. If you split on whitespace for example, then you are going to get `24-09-2009` and `16:17:45` as separate values. – Zoredache Apr 17 '12 at 17:01
@Zoredache: yes, you are right. Because of that I said to use `\t` as example and not a space. – Khaled Apr 18 '12 at 06:52

score 0 · Answer 3 · answered Apr 17 '12 at 16:59

This will accommodate any number of fields.

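awk 'FNR == 1 {
        # Line 1 holds the column names; remember them by position.
        for (i = 1; i <= NF; i++) {
            header[i] = $i
        }
     }
     FNR > 2 {
        # Lines 3 onward are data (line 2 is the dashes row).
        for (i = 1; i <= NF; i++) {
            print header[i], "->", $i
        }
        printf "\n\n\n%s\n\n\n", "--------"
     }' inputfile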

It will need some adjustment to handle the fact that the timestamp includes a space. What separates the fields? If it's tabs only, then you can use -F '\t' or perhaps -F '\t+'.
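
If the fields really are separated by single tabs, the only adjustment is the field separator. Here is a minimal sketch under that assumption (the question does not confirm the separator, and tabbed.txt and tabbed-out.txt are made-up names). The first lines build a tiny tab-separated sample.

```shell
# Build a small tab-separated sample: header, dashes line, one data row.
printf 'CREATE_TIMESTAMP\tTELEPHONE_NUMBER\tID\tTYPE\n'           >  tabbed.txt
printf '%s\t%s\t%s\t%s\n' ---- ---- ---- ----                     >> tabbed.txt
printf '24-09-2009 16:17:45\t33633333333\t20\tother_mmm_phone\n'  >> tabbed.txt

# Same logic as the script above, but split on tabs only, so the space
# inside the timestamp does not start a new field.
awk -F '\t' '
FNR == 1 { for (i = 1; i <= NF; i++) header[i] = $i }
FNR > 2  {
    for (i = 1; i <= NF; i++) print header[i], "->", $i
    printf "\n---\n\n"
}' tabbed.txt > tabbed-out.txt

cat tabbed-out.txt
```

With -F '\t' an empty column still counts as a field. -F '\t+' would merge consecutive tabs and shift the values that follow an empty column, so it fits only when no column is ever empty.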

edited Apr 17 '12 at 23:35

answered Apr 17 '12 at 16:59

Dennis Williamson

See the update in my question: awk does not work on my Solaris machine (in the ksh shell). – yael Apr 17 '12 at 19:54
Maybe the `}` is missing? – yael Apr 17 '12 at 20:32
@yael: Yes, there was a missing `}` before `FNR > 2`. I have edited my answer to include the correction. Why is your question tagged `[bash]` when you're using ksh (not that it matters in this case)? It would have been more important to indicate that you're using Solaris since that affects the version of AWK you're likely to have. – Dennis Williamson Apr 17 '12 at 23:38
