I have a 30 GB disk image of a borked partition (think dd if=/dev/sda1 of=diskimage) that I need to recover some text files from. Data carving tools like foremost only work on files with well-defined headers, i.e. not plain text files, so I've fallen back on my good friend strings.
strings diskimage > diskstrings.txt
produced a 3 GB text file containing a bunch of strings, mostly useless stuff, mixed in with the text that I actually want.
Most of the cruft tends to be really long, unbroken strings of gibberish. The stuff I'm interested in is guaranteed to be less than 16 KB, so I'm going to filter the file by line length. Here's the Python script I'm using to do so:
infile = open ("infile.txt" ,"r");
outfile = open ("outfile.txt","w");
for line in infile:
if len(line) < 16384:
outfile.write(line)
infile.close()
outfile.close()
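Incidentally, the same filter could be rewritten to read stdin, so it would drop straight into a pipeline right after strings. This is just an untested sketch, and filter.py is a made-up name:
import sys

# Usage sketch: strings diskimage | python filter.py > diskstrings.txt
# Pass through only the lines shorter than 16 KB.
for line in sys.stdin:
    if len(line) < 16384:
        sys.stdout.write(line)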
The script works, but for future reference: are there any magical one-line incantations (think awk, sed) that would filter a file by line length?