sed command to extract the first 10 digits from the last column

0

I have a file with three columns, for example:

12345678910 14567855858855 12345678510750078

I want to extract only the first 10 digits of the third column with sed or awk.

Expected output is:

1234567851

Please help

ltps

Posted 2016-10-02T06:11:15.470

Reputation: 21

Answers

0

You could try:

awk '{ print $3; }' subject.txt | sed -n 's/\([0-9]\{10\}\).*/\1/p'

the_velour_fog

Posted 2016-10-02T06:11:15.470

Reputation: 2 814

or awk '{print substr($3,1,10)}' subject.txt – dave_thompson_085 – 2016-10-02T13:01:39.627
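A minimal sketch of the awk-only approach, assuming the wanted column is always the last one: $NF refers to the last field, so the command does not depend on the file having exactly three columns. The variable names line and first10 are only for illustration, and the sample line is the one from the question.

```shell
# Sample line taken from the question (illustrative only).
line='12345678910 14567855858855 12345678510750078'

# $NF is the last whitespace-separated field; substr(s, 1, 10) keeps its first 10 characters.
first10=$(printf '%s\n' "$line" | awk '{ print substr($NF, 1, 10) }')
printf '%s\n' "$first10"
```

On a real file the same awk program can be given the file name directly, as in the comment above.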

1

This sed command will give you the first 10 digits of the last column.
Your question is a bit unclear about whether you want the first or last digits, and whether 10 or 14 of them :-)
But you can adapt this example accordingly.

$ echo "12345678910 14567855858855 12345678510750078" \
| sed -n 's/.*\s\([0-9]\{10\}\)[0-9]*$/\1/ p'

1234567851

Interpreting the command (so you can modify as required).

 sed -n 's/.*\s\([0-9]\{10\}\)[0-9]*$/\1/ p'
     |   | | | | |          | |       |   ^ print what remains on the matched line
     |   | | | | |          | |       ^^ replace the line with the part of interest
     |   | | | | |          | ^^^^^^^ match for the last column
     |   | | | | |          ^^ mark the end of part we want to print
     |   | | | | ^^^^^^^^^^^ this will match 10 digits at the start of the last column
     |   | | | ^^ start marking the part we want to print
     |   | | ^ start matching the digits after a white-space char
     |   | ^^ pattern begins matching everything up to the part of interest
     |   ^ process only lines that match the given pattern
     ^^ do not print the original input string

You can fine-tune this for your data.
As it stands, because of the [0-9]*$ part of the pattern, the last column must contain only digits, with no whitespace or other characters inside or after it.
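If your data might have trailing whitespace, or your sed is not GNU sed, a variant along these lines may help. This is only a sketch under two assumptions: that \s is unavailable (it is a GNU extension, while [[:space:]] is the POSIX character class), and that spaces after the last column should be tolerated. The variable names are illustrative.

```shell
# Sample line from the question, with trailing spaces appended on purpose.
line='12345678910 14567855858855 12345678510750078   '

# [[:space:]] replaces \s for portability; the extra [[:space:]]* before $
# allows whitespace after the last column.
first10=$(printf '%s\n' "$line" |
  sed -n 's/.*[[:space:]]\([0-9]\{10\}\)[0-9]*[[:space:]]*$/\1/p')
printf '%s\n' "$first10"
```

Like the original, this still requires whitespace before the last column, so a line with only one column would not match.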

Update in response to your comment.
This example echoes your single sample line only to demonstrate the test case.
You can run the same command on your entire file as follows:

cat input-file.txt | <sed-command-above> > output-file.txt

or

<sed-command-above> input-file.txt > output-file.txt

The first form shows that the pipe used with echo works the same way for a whole multi-line file.
You could also do a short test by piping head input-file.txt to the sed command to see how it works on the first 10 lines of your input file.
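A quick way to try that preview without touching your real data is sketched below. The three sample lines and the temporary file are made up for illustration, and [[:space:]] is used instead of \s purely for portability.

```shell
# Hypothetical three-line sample standing in for input-file.txt.
input=$(mktemp)
printf '%s\n' \
  '12345678910 14567855858855 12345678510750078' \
  '22345678910 24567855858855 99999999990000000' \
  '32345678910 34567855858855 123' > "$input"

# head shows at most the first 10 lines; with -n ... p, lines whose last
# column has fewer than 10 digits (the third one here) print nothing.
preview=$(head "$input" | sed -n 's/.*[[:space:]]\([0-9]\{10\}\)[0-9]*$/\1/p')
printf '%s\n' "$preview"
rm -f "$input"
```

Only the first two sample lines produce output, which is a useful reminder to check for short values in the last column.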

nik

Posted 2016-10-02T06:11:15.470

Reputation: 50 788

thanks for your answer.. but my file has some 100 lines. the one line i have given is a sample.. so please give me the sed command to get the first 10 digits of the last column without echo.. – ltps – 2016-10-02T07:43:19.693

thanks for your answer but when i run this command sed -n 's/.([0-9]{10})[0-9]$/\1/ p' inputfile.txt i am not getting any output – ltps – 2016-10-02T07:54:17.243

when i run this sed command sed -r 's/.*(.{10})$/\1/' 1.csv i am getting output, but it is taking the last 10 digits of the third column. i am getting output as 510750078 for my input file 12345678910 14567855858855 12345678510750078 – ltps – 2016-10-02T07:57:50.480

Right, my bad, a space in the rule got lost while editing... I've changed that to a \s which is more appropriate and easier to not-lose. – nik – 2016-10-02T08:05:49.807

0

Perl to the rescue:

perl -lne 'print /(\d{10})\d*$/' < filename
  • -n reads the input line by line
  • -l adds newlines to output
  • $ matches the end of the line; the 10 digits that are followed only by more digits up to the end of the line are captured, and /.../ in the list context imposed by print returns that capture

choroba

Posted 2016-10-02T06:11:15.470

Reputation: 14 741

thanks for that but it is considering the last digits from reverse order.. i am supposed to get 1234567851 but the output gives 510750078 – ltps – 2016-10-02T06:44:34.357

1@ltps: Oh, so you want the first ten digits of the last column! You should always add the expected output to a question to make it clearer. Answer updated. – choroba – 2016-10-02T06:55:20.807

yes i want the first ten digits of the last column. sorry for that.. will be clear from now on – ltps – 2016-10-02T07:38:43.573

0

If you want a sed-only solution, try:

cat /tmp/textfile |  sed -n -e '$!d;s/.*\s\([0-9]\{10\}\)[0-9]*$/\1/ p'

Because $!d deletes every line except the last, the substitution is applied only to the last line of the file.

Mario Goppold

Posted 2016-10-02T06:11:15.470

Reputation: 1