How to output a file from a specified offset, but without "dd bs=1 skip=N"?

28

8

How can I do something like dd if=somefile bs=1 skip=1337 count=31337000, but efficiently, without using 1-byte reads and writes?

The solution is expected:

  1. To be simple (for anything non-simple I can write a Perl one-liner that will do this)
  2. To support large offsets and lengths (so hacks with block size in dd won't help)

Partial solution (not simple enough, trying the same with length will make it even more complex):

dd if=somefile bs=1000 skip=1 count=31337 | { dd bs=337 count=1 of=/dev/null; rest_of_pipeline; }
# 1337 div 1000 and 1337 mod 1000

Vi.

Posted 2012-01-20T21:20:02.420

Reputation: 13 705

Are you trying to change the blocksize that dd is using? – cmorse – 2012-01-20T21:24:56.923

Changed blocksize => changed units for skip and count – Vi. – 2012-01-20T23:09:40.390

Answers

38

This should do it (on GNU dd):

dd if=somefile bs=4096 skip=1337 count=31337000 iflag=skip_bytes,count_bytes

In case you are using seek= as well, you may also consider oflag=seek_bytes.

From info dd:

`count_bytes'
      Interpret the `count=' operand as a byte count, rather than a
      block count, which allows specifying a length that is not a
      multiple of the I/O block size.  This flag can be used only
      with `iflag'.

`skip_bytes'
      Interpret the `skip=' operand as a byte count, rather than a
      block count, which allows specifying an offset that is not a
      multiple of the I/O block size.  This flag can be used only
      with `iflag'.

`seek_bytes'
      Interpret the `seek=' operand as a byte count, rather than a
      block count, which allows specifying an offset that is not a
      multiple of the I/O block size.  This flag can be used only
      with `oflag'.
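
A quick way to check these flags is to try them on a small throwaway file. The following is only a sketch: it assumes a GNU dd recent enough to know skip_bytes and count_bytes, and the temporary file and its contents are made up for illustration.

```shell
# Hypothetical test file: 26 bytes, one letter per byte offset.
tmpfile=$(mktemp)
printf 'abcdefghijklmnopqrstuvwxyz' > "$tmpfile"

# Skip 3 bytes and copy 5 bytes while still reading in 4096-byte blocks.
# Assumes GNU dd; other dd implementations may reject these iflag values.
extracted=$(dd if="$tmpfile" bs=4096 skip=3 count=5 \
    iflag=skip_bytes,count_bytes 2>/dev/null)

echo "$extracted"   # prints "defgh"
rm -f "$tmpfile"
```

Because skip= and count= are byte counts here, bs= can be chosen purely for I/O efficiency.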

PS: I understand this question is old, and it seems these flags were implemented after it was originally asked, but since it's one of the first Google results for a related dd search I did, I thought it would be nice to update it with the new feature.

Fabiano

Posted 2012-01-20T21:20:02.420

Reputation: 516

2

Use one process to ditch all the initial bytes, then a second to read the actual bytes, e.g.:

echo Hello, World\! | ( dd of=/dev/null bs=7 count=1 ; dd bs=5 count=1 )

The second dd can read the input with whatever block size you find efficient. Note that this requires an extra process to be spawned; depending on your OS that will incur a cost, but it is probably smaller than the cost of reading the file one byte at a time (unless the file is very small, in which case there wouldn't be a problem anyway).
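
The same two-step idea also works when the input is a regular file rather than a pipe. This is a minimal sketch, with the temporary file and its contents being assumptions for illustration; both dd processes inherit the same open file, so the second one continues from the offset where the first one stopped.

```shell
# Hypothetical input file holding the same text as the echo example.
tmpfile=$(mktemp)
printf 'Hello, World!\n' > "$tmpfile"

# The first dd discards the 7-byte prefix, the second reads the 5 bytes wanted.
# The redirection applies to the whole group, so the file offset is shared.
word=$( { dd of=/dev/null bs=7 count=1 2>/dev/null
          dd bs=5 count=1 2>/dev/null; } < "$tmpfile" )

echo "$word"   # prints "World"
rm -f "$tmpfile"
```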

RolKau

Posted 2012-01-20T21:20:02.420

Reputation: 826

Surely it would be easier to use read -n to skip? And then head -c to count? E.g. cat somefile | (read -n 1337; head -c 31337000) Or you could do it without spawning an extra process: exec 3<somefile; read -n 1337 -u 3; head -c 31337000 <&3 – Gannet – 2015-04-11T03:49:37.597

Will it work well (i.e. without hogging too much memory) for large offsets and counts? dd if=/dev/sda bs=10000000001 | dd bs=255 count=1 | hd -> "dd: invalid number `10000000001'" – Vi. – 2012-01-21T06:03:54.690

@Vi. If you want to skip a huge offset, do the initial read as a series of "ideally" sized blocks (depending on your source; 16M here), then drop a series of smaller blocks (512), which will be in memory, to "zoom in" on your data, then drop the odd portion that doesn't fit the block size (bs=1 below), and finally read the block you want.

E.g. you want to read 255 bytes from offset 10000000001:

dd if=/dev/sda bs=16M skip=596 count=1 | dd bs=512 skip=1522 count=1 | (dd bs=1 count=1 of=/dev/null ; dd bs=255 count=1) – RolKau – 2012-02-04T23:30:49.097
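
The skip values in that pipeline come from dividing the offset by each block size in turn. As a sketch of the arithmetic (the variable names are made up for illustration, and a shell with 64-bit arithmetic such as bash is assumed):

```shell
offset=10000000001            # byte offset to reach
big=$((16 * 1024 * 1024))     # 16M, block size of the first dd
small=512                     # block size of the second dd

skip_big=$((offset / big))    # whole 16M blocks to skip
rest=$((offset % big))        # bytes still to skip after those blocks
skip_small=$((rest / small))  # whole 512-byte blocks to skip
leftover=$((rest % small))    # odd bytes left for the bs=1 dd

echo "$skip_big $skip_small $leftover"   # prints "596 1522 1"
```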

1

Instead of bs=1 use bs=4096 or more.

ccpizza

Posted 2012-01-20T21:20:02.420

Reputation: 5 372

It feels like the most reliable way is probably to write a custom executable. Some systems don't have Python, or Ruby, or even Perl. :| – Trejkaz – 2018-10-24T01:27:40.363

Then it will read from offset 1337*4096 instead of 1337 – Vi. – 2012-01-21T06:02:00.467

Aha, I see. Then it will probably be easier to write a simple Python script, e.g. like in this example http://stackoverflow.com/questions/1035340/reading-binary-file-in-python with f.seek(1337) before using read(MY_CHUNK_SIZE) – ccpizza – 2012-01-21T19:03:55.123

1

You can try the hexdump command:

hexdump  -v <File Path> -c -n <No of bytes to read> -s <Start Offset>

If you simply want to see the contents:

# /usr/bin/hexdump -v -C mycorefile -n 100 -s 100
00000064 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 
00000074 00 00 00 00 01 00 00 00 05 00 00 00 00 10 03 00 |................| 
00000084 00 00 00 00 00 00 40 00 00 00 00 00 00 00 00 00 |......@.........| 
00000094 00 00 00 00 00 00 00 00 00 00 00 00 00 a0 03 00 |................| 
000000a4 00 00 00 00 00 10 00 00 00 00 00 00 01 00 00 00 |................| 
000000b4 06 00 00 00 00 10 03 00 00 00 00 00 00 90 63 00 |..............c.| 
000000c4 00 00 00 00 |....| 
000000c8 #

Saravanan Palanisamy

Posted 2012-01-20T21:20:02.420

Reputation: 11

It's not about viewing the file as hex. It's about extracting the content of a file (to copy it somewhere, for example) from the specified offset in bytes. – Vi. – 2014-07-13T23:39:10.780