17

I am often dealing with incredibly large log files (>3 GB). I've noticed the performance of less is terrible with these files. Often I want to jump to the middle of the file, but when I tell less to jump forward 15 million lines it takes minutes.

The problem, I imagine, is that less needs to scan the file for '\n' characters, and that takes too long.

Is there a way to make it just seek to an explicit offset? e.g. seek to byte offset 1.5 billion in the file. This operation should be orders of magnitude faster. If less does not provide such an ability, is there another tool that does?

UsAaR33
  • 1,036
  • 3
  • 11
  • 20
  • if you're skimming the file for forbidden characters, is it a fair assumption that you will purge the aforementioned characters after finding them? If so, may I offer `perl -pi -e 's/\n//g;' ` – Mike Pennington Jul 26 '12 at 00:43
  • Sorry, skim was the wrong word. Should have used scan. less by design scans for newline (\n). This scanning takes a very long time on large files. – UsAaR33 Jul 26 '12 at 06:42

3 Answers

24

You can stop less from counting lines with less -n.

To jump to a specific place, say 50% in: less -n +50p /some/log. This was instant for me on a 1.5 GB log file.

Edit: For a specific byte offset: less -n +500000000P ./blah.log
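Since P takes a byte offset, jumping to an arbitrary fraction of the file is just arithmetic. A minimal sketch (the ./blah.log path is the answer's example; `wc -c` gives the size in bytes):

```shell
# Sketch: compute the midpoint byte offset, then build the less
# command that jumps straight there without counting newlines.
log=./blah.log                  # example log file from the answer
size=$(wc -c < "$log")          # total file size in bytes
half=$((size / 2))              # byte offset of the midpoint
printf 'less -n +%sP %s\n' "$half" "$log"   # the command to run interactively
```

The printf just shows the command you'd run; in practice you'd invoke less directly with that offset.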

Sekenre
  • 2,913
  • 1
  • 18
  • 17
  • 1
    Line counting was never the issue; I could just use esc/ctrl-c for that. But this is the actual answer; P jumps to a specific byte offset! – UsAaR33 Jul 26 '12 at 19:51
5

Less, being a pager, is inherently line-oriented. When you start it up on a large file, it'll say "counting line numbers" and you can hit ESC to stop that, but otherwise, it does lines. It's what it does.

If you want to jump straight into the middle of file and skip the beginning, you can always just seek past the beginning; I'd do something like tail -c +15000000 /some/log | less.
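The same idea with the offset derived from the file size instead of hard-coded (a sketch; /some/log is the answer's placeholder path):

```shell
# Sketch: tail -c +N emits the file starting at byte N (1-based),
# so less never has to scan the skipped portion for newlines.
log=/some/log                    # placeholder path from the answer
size=$(wc -c < "$log")           # total file size in bytes
tail -c "+$((size / 2))" "$log" | less
```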

womble
  • 95,029
  • 29
  • 173
  • 228
0

less seems to have a small overhead from the locale settings

If you're using ASCII only characters, you can speed it up a bit by using:

LC_ALL=C less big-log-file.log

In my case, the throughput increased from ~30 MiB/s to ~50 MiB/s (the rate is CPU-bound).
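Note that prefixing the assignment sets LC_ALL for that one command only; a quick sketch to confirm the variable reaches the child process without changing your shell's own locale:

```shell
# Sketch: a per-command assignment is visible only to that command.
LC_ALL=C sh -c 'echo "$LC_ALL"'   # prints: C
echo "${LC_ALL:-unset}"           # your shell's own LC_ALL is unchanged
```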

Romuald Brunet
  • 171
  • 1
  • 4