How to display certain lines from a text file in Linux?

Question

I guess everyone knows the useful Linux command-line utilities head and tail. head prints the first X lines of a file; tail does the same for the end of the file. What is a good command to print the middle of a file? Something like middle --start 10000000 --count 20 (print the 10’000’000th through the 10’000’010th lines).

I'm looking for something that will deal with large files efficiently. I tried tail -n 10000000 | head -n 10 and it's horrifically slow.

possible duplicate of http://serverfault.com/questions/101900/read-specified-range-of-lines-from-a-file — Kyle Brandt, Apr 19 '10 at 12:20

score 138 · Accepted Answer · edited Sep 04 '22 at 17:10

sed -n '10000000,10000020p' filename

You might be able to speed that up a little like this:

sed -n '10000000,10000020p; 10000021q' filename

In those commands, the option -n causes sed to "suppress automatic printing of pattern space". The p command "print[s] the current pattern space" and the q command "Immediately quit[s] the sed script without processing any more input..." The quotes are from the sed man page.
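As a minimal sketch of the two sed commands above, the range-and-quit form can be wrapped in a function. The name print_range and the seq-generated sample input are assumptions for illustration, not part of the original answer.

```shell
# print_range START END [FILE...]
# Hypothetical helper: print lines START..END, then quit at line END+1
# so sed does not keep reading the rest of the input.
print_range() {
    start=$1
    end=$2
    shift 2
    sed -n "${start},${end}p; $((end + 1))q" "$@"
}

# Demo on generated input: prints 5, 6, 7 and 8, one per line.
seq 1 1000000 | print_range 5 8
```

With no file arguments the function reads standard input, so it works both in a pipe and with explicit filenames.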

By the way, your command

tail -n 10000000 filename | head -n 10

starts at the ten millionth line from the end of the file, while your "middle" command would seem to start at the ten millionth from the beginning which would be equivalent to:

head -n 10000010 filename | tail -n 10
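To make the difference concrete, here is a small sketch on generated input (the seq sample data is an assumption for illustration). It shows the head/tail pipeline next to the `tail -n +N` form, which starts output at line N counted from the beginning of the input.

```shell
# Sample input: the numbers 1..100, one per line.
# Lines 11..20 via head then tail: keep the first 20, then the last 10 of those.
seq 1 100 | head -n 20 | tail -n 10

# The same lines via tail -n +N, which starts printing at line N from the top.
seq 1 100 | tail -n +11 | head -n 10
```

Both pipelines still read the input from the first line, so neither avoids counting newlines.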

The problem is that for unsorted files with variable length lines any process is going to have to go through the file counting newlines. There's no way to shortcut that.

If, however, the file is sorted (a log file with timestamps, for example) or has fixed-length lines, then you can seek into the file based on a byte position. In the log file example, you could do a binary search for a range of times, as my Python script here* does. In the case of the fixed-record-length file, it's really easy: you just seek linelength * linecount bytes into the file, where linelength includes the newline.

* I keep meaning to post yet another update to that script. Maybe I'll get around to it one of these days.
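The fixed-length case can be sketched with dd, which skips whole blocks instead of counting newlines. This is an illustration under stated assumptions: every record is exactly REC_LEN bytes including its newline, and the helper name fixed_range and the generated sample file are hypothetical.

```shell
# Build a sample file of 1000 records of the form "rec0001\n" (8 bytes each).
sample=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 1000; i++) printf "rec%04d\n", i }' > "$sample"

# fixed_range FILE REC_LEN START COUNT  (hypothetical helper)
# Skip START-1 records of REC_LEN bytes, then copy COUNT records.
fixed_range() {
    dd if="$1" bs="$2" skip="$(( $3 - 1 ))" count="$4" 2>/dev/null
}

fixed_range "$sample" 8 500 3   # prints rec0500, rec0501, rec0502
rm -f "$sample"
```

If any line differs in length, the byte arithmetic lands mid-record, so this only applies when the record length is guaranteed.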

edited Sep 04 '22 at 17:10

Amir

answered Apr 19 '10 at 09:11

Dennis Williamson

Here is a `sed` version of Charles' `middle` function: `middle() { local s=$1 c=$2; shift 2; sed -n "$s,$(($s + $c -1))p; $(($s + $c))q" "$@"; }`. It will handle multiple file arguments, filenames with spaces, etc. Multiple files are processed together as if they had been catted in the same way that `sed` normally does (so middle 1000 100 file1 file2 would span across the end of the first file to the beginning of the second one if the first one has fewer than 1100 lines). – Dennis Williamson Apr 19 '10 at 15:55
The function in my previous comment can be called with a filename parameter: `middle startline count filename` or multiple filenames: `middle startline count file1 file2 file3` or with redirection: `middle startline count < filename` or in a pipe: `some_command | middle startline count` or `cat file* | middle startline count` – Dennis Williamson Apr 20 '10 at 16:47
Shouldn't the ` in your sed command be a '? I can't get it to work with the backtick but it works fine with the single quote. – Ian Hunter Dec 18 '12 at 21:51
@beanland: Yes, it's a typo. I've fixed it. Thanks. – Dennis Williamson Dec 19 '12 at 05:57
@DennisWilliamson: `tail -n 10000` prints the last 10000 lines. Use `tail -n +10000` to start printing at the 10000th line. – cuonglm May 23 '14 at 12:16
FYI, print from 10th line to the end : `sed -n '10,$p'` – plhn Aug 26 '15 at 04:30
Some more explanation would be nice. E.g. what is the "p", what is the "q" for at the end of the line numbers? – kev Oct 08 '17 at 23:58
@kev: I added some explanation to my answer. – Dennis Williamson Oct 10 '17 at 16:19

score 35 · Answer 2 · answered Jun 17 '13 at 18:22

I found out the following use of sed

sed -n '10000000,+20p'  filename

Hope it's useful to someone!
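A short sketch of how the `first,+N` address behaves, using seq-generated sample input (an assumption for illustration). Note that `+N` is a GNU sed extension, and that it selects the first line plus the N lines after it, so `10,+3` prints four lines. The second form computes the end line in the shell for sed implementations without that extension.

```shell
# GNU sed only: line 10 and the 3 lines following it (lines 10..13).
seq 1 100 | sed -n '10,+3p'

# Portable equivalent: compute the last line number in the shell.
start=10
extra=3
seq 1 100 | sed -n "${start},$((start + extra))p"
```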

answered Jun 17 '13 at 18:22

Dox

Good to know that there is an alternative to the last line argument proposed by Dennis: a line count as second `sed -n` argument which makes it quite readable. – Timo Feb 03 '18 at 09:46
An example usage: `extract_lines() { sed -n "$1,+$2p"; }` which reads stdin and writes to stdout. – Timo Feb 03 '18 at 09:50

score 5 · Answer 3 · answered May 23 '14 at 12:11

This is my first time posting here! Anyway, this one is easy. Let's say you want to pull line 8872 from your file called file.txt. Here is how you do it:

cat -n file.txt | grep '^ *8872'

Now the question is to find 20 lines after this. To accomplish this you do

cat -n file.txt | grep -A 20 '^ *8872'

For lines around or before see the -B and -C flags in the grep manual.
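One caveat with the pattern above: `'^ *8872'` also matches line numbers that merely begin with 8872, such as 88720. A hedged sketch of a stricter variant follows, assuming `cat -n` separates the number from the text with whitespace (a tab in GNU coreutils) and using generated sample input:

```shell
# Sample input: 100000 lines whose content is "line N".
# [[:space:]] after the number requires the line number to end there,
# so 8872 matches but 88720 does not.
seq 1 100000 | sed 's/^/line /' | cat -n | grep -A 2 '^ *8872[[:space:]]'
```

grep still scans the whole numbered stream, so this stays slow on very large files compared with a sed or awk command that quits early.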

answered May 23 '14 at 12:11

Dennis

While that is technically correct and an interesting way to do it on a reasonably-sized file, I'm curious about its efficacy when working with files of the size the poster is asking about. – Jenny D May 23 '14 at 12:37
Multiple lines: cat -n file.txt | grep "^\s\+$10\|20\|30$\s\+" – Jeffrey Knight Nov 11 '16 at 15:24
`cat -n file.txt | grep '^ *1'` yield all the lines that have 1 on their right side. How to output line 1 with this technique? I know I can head -n 1....but how to use grep? – EMBEDONIX May 26 '17 at 10:59

score 2 · Answer 4 · answered Apr 17 '15 at 19:48

Use the following command to get the particular range of lines

awk 'NR < 1220974{next}1;NR==1513793{exit}' debug.log | tee -a test.log

Here debug.log is my file, which consists of lakhs of lines, and I used this command to print lines 1220974 through 1513793 and append them to the file test.log. I hope it will be helpful for capturing a range of lines.
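The same skip, print, and exit pattern can be sketched as a reusable function. The name awk_range and the seq-generated sample input are assumptions for illustration.

```shell
# awk_range START END [FILE...]  (hypothetical helper)
# Skip lines before START, print from START on, and exit right after END
# so the rest of the input is never read.
awk_range() {
    start=$1
    end=$2
    shift 2
    awk -v s="$start" -v e="$end" 'NR < s { next } { print } NR == e { exit }' "$@"
}

# Demo on generated input: prints 20, 21, 22 and 23, one per line.
seq 1 1000000 | awk_range 20 23
```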

answered Apr 17 '15 at 19:48

newbie13

The same answer as https://serverfault.com/a/641252/140016. Downvoted. – Deer Hunter Apr 17 '15 at 20:32
It is not the same answer. This should be faster for large files as it actually aborts after printing the last line instead of continuing scanning through the file. – phobic Jul 06 '18 at 08:53

score 1 · Answer 5 · answered Apr 19 '10 at 15:08

Dennis' sed answer is the way to go. But using just head & tail, under bash:

middle () { head -n $[ $1 + $2 ] | tail -n $2; }

This scans the first $1+$2 lines twice, so is much worse than Dennis' answer. But you don't need to remember all those sed letters to use it....
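A hedged variant of the same function using the `$(( ))` arithmetic syntax, shown on seq-generated sample input (an assumption for illustration). As in the original, the first argument is the number of lines to skip, so `middle 10 5` prints lines 11 through 15.

```shell
# middle SKIP COUNT: read stdin, keep the first SKIP+COUNT lines,
# then keep the last COUNT of those.
middle() { head -n "$(( $1 + $2 ))" | tail -n "$2"; }

# Demo: prints 11, 12, 13, 14 and 15, one per line.
seq 1 100 | middle 10 5
```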

answered Apr 19 '10 at 15:08

Charles Stewart

Using `$[...]` is deprecated, at least in Bash. Also, you're missing a file parameter. – Dennis Williamson Apr 19 '10 at 15:46
@Dennis: No missing parameter: you're meant to use this on stdin, as per `middle 10 10 < /var/log/auth.log`. – Charles Stewart Apr 20 '10 at 16:33

score 1 · Answer 6 · answered May 22 '18 at 12:24

Perl is king:

perl -ne 'print if ($. == 10000000 .. $. == 10000020)' filename

answered May 22 '18 at 12:24

Peter V. Mørch

score 0 · Answer 7 · edited May 23 '14 at 13:13

A ruby oneliner version.

ruby -pe 'next unless $. > 10000000 && $. < 10000020' < filename.txt

It may be useful to somebody. The sed solutions provided by Dennis and Dox are very nice, not least because they seem faster.

edited May 23 '14 at 13:13

answered May 23 '14 at 12:58

shardan

score 0 · Answer 8 · answered Oct 31 '14 at 22:02

For instance, this awk command prints the lines strictly between 20 and 40 (that is, lines 21 through 39):

awk '{if ((NR > 20) && (NR < 40)) print $0}' /etc/passwd

answered Oct 31 '14 at 22:02

Hrvoje Špoljar

score 0 · Answer 9 · edited Mar 23 '16 at 15:24

If you know the line numbers, say you want to get lines 1, 3 and 5 from a file, say /etc/passwd:

perl -e 'while(<>){if(++$l~~[1,3,5]){print}}' < /etc/passwd

edited Mar 23 '16 at 15:24

answered Mar 23 '16 at 13:36

Dagelf

score -1 · Answer 10 · answered May 10 '21 at 11:22

To answer @sean87's question:

cat -n file.txt | grep '^ *1' yield all the lines that have 1 on their right side. How to output line 1 with this technique?

Just add \s after the number, like this:

cat -n file.txt | grep '^ *1\s'

answered May 10 '21 at 11:22

jfc

Just comment on the answer you're responding to. This isn't an answer in its own right. – Jacktose Sep 21 '22 at 20:30

score -1 · Answer 11 · edited Oct 31 '14 at 21:53

You can use 'nl'.

nl filename | grep <line_num>

edited Oct 31 '14 at 21:53

sysadmin1138

answered Oct 31 '14 at 19:35

Ajay

This is not good : if you request line 42, you will get all the lines which contain that number. – Samuel Faure Jan 04 '20 at 11:12
