Creating and splitting large multipage TIFF images

I need to both create and split multipage TIFF images, ranging from 2 to almost 100 pages (A4, 300 dpi, 2500×3500 px). The job is performed periodically by a script on an x64 Linux server. Currently I'm using ImageMagick. The smaller cases do not pose any problems, but the larger ones do.

I need to radically reduce the amount of memory used during the operation.

For example, this:

convert *.jpg -compress lzw output.tif

(70 JPEG files) consumes about 4.6 GB of RAM, even though each input is less than 2 MB and the resulting file is less than 250 MB.

The reverse operation:

convert input.tif output-%04d.png

has similar issues.

From what I have read, this happens because ImageMagick first loads and decodes all the input images, and only then starts encoding them into the output file.

How can I create and split multipage TIFF images without such huge memory footprint? I don't have to necessarily use ImageMagick, any other free tool will be fine.

Karol S

Posted 2014-08-06T12:35:41.987

Reputation: 228

To put some perspective on it: EACH 2500×3500 pixel image will take up at least 2500×3500×3 bytes as it resides in memory. That is 26,250,000 bytes per image, 1,837,500,000 bytes total for 70 images. Then you create a DUPLICATE of that in the TIF, for a total of 3,675,000,000. Then you request to save it using LZW compression; some buffers are probably required for that. Maybe add buffers for writing... Handling 70-100 page files isn't easy, especially if the pages are nothing but bitmaps. – Hannu – 2014-08-06T12:59:25.070
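The arithmetic in the comment above can be reproduced with plain shell arithmetic (assuming 3 bytes per pixel for uncompressed RGB data, as the comment does):

```shell
# Back-of-the-envelope check of the memory estimate above.
per_image=$((2500 * 3500 * 3))   # bytes for one decoded RGB page
all_images=$((per_image * 70))   # 70 pages held in memory at once
with_copy=$((all_images * 2))    # plus a duplicate for the TIFF being built
echo "$per_image $all_images $with_copy"
```

Roughly 3.7 GB before any encoding buffers, which is in the same ballpark as the 4.6 GB observed.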

@Hannu Not easy for who? The real world says that there's a concept of streaming transforms, and unpacking a huge image stack in memory simultaneously is lame and fugly. – polkovnikov.ph – 2016-03-25T00:13:12.757

The first convert example above creates a single PAGED tiff. Depending on how ImageMagick works internally, you MIGHT indeed have a "huge image stack in memory". – Hannu – 2016-03-28T08:52:20.370

Answers

I had the same problem today while trying to split a 1,700-page, 1 GB tif file. 16 GB of memory wasn't enough; I then tried having it cache on disk, but that was slow and easily exhausted more than 100 GB on the hard drive without accomplishing anything (this was probably a bug).

But apparently ImageMagick can extract a specific page from the original file without loading it completely, so I was able to split the bigger file with a simple bash script:

END=2000
for ((i=1; i<=END; i++)); do
    echo $i
    convert bigassfile.tif[$i] -scene 1 split/smallerfile_$i.tif
done

I have no idea, though, how to create a big file without running out of memory, so maybe this is only half an answer?

tarikki

Posted 2014-08-06T12:35:41.987

Reputation: 141

I find @tarikki's answer one of the best, because it really doesn't hang the server, doesn't eat RAM or disk space, and it's fast.

Some improvements that helped me:
1. Replace END=2000 with END=$(identify -format "%n" bigassfile.tif)
2. The TIF page index is 0-based, so the loop should start at 0 and use < instead of <= : for ((i=0; i<END; i++))

MrMacvos

Posted 2014-08-06T12:35:41.987

Reputation: 31

This returns e.g. 666666 (a 6 for each of the six pages) on my system, so it's not directly usable. Any idea why? I opened a new question.

– Nicolai – 2019-07-26T23:07:51.873

Yes, 666666 is the correct output, as I also found out after posting here. What you see is the total number of pages, listed once for each page. To get only the first entry, I had to modify the code into: zzcount=$(identify -quiet -format "%n\n " "$1" | head -n1), where $1 is the TIF file. – MrMacvos – 2019-07-27T00:32:45.343
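Putting the pieces together, the loop with both improvements and the %n fix applied might look like the following. This is an untested sketch: bigassfile.tif and the split/ directory are placeholder names, and the guard just lets the script exit cleanly when the file isn't present.

```shell
# tarikki's loop with the 0-based index and the page-count fix applied.
[ -f bigassfile.tif ] || { echo "bigassfile.tif not found (placeholder)"; exit 0; }
mkdir -p split
# %n prints the page count once per page, so keep only the first line.
END=$(identify -quiet -format "%n\n" bigassfile.tif | head -n1)
for ((i=0; i<END; i++)); do
    echo "$i"
    convert bigassfile.tif[$i] -scene 1 split/smallerfile_$i.tif
done
```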

tiffcp can be used to create a multipage tiff, like this:

tiffcp *.tif out.tif
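libtiff's command-line tools also cover the reverse direction: tiffsplit writes each page out as a separate file, and tiffcp takes a -c option to choose the compression. Both work one TIFF directory (page) at a time rather than decoding everything up front, which should keep memory use modest. A sketch with placeholder filenames (the guards just let it exit cleanly when the tools or inputs are missing):

```shell
# Merge and split with libtiff's tools; page-*.tif are placeholder inputs.
command -v tiffcp >/dev/null 2>&1 || { echo "libtiff tools not installed"; exit 0; }
ls page-*.tif >/dev/null 2>&1 || { echo "no input pages found"; exit 0; }

tiffcp -c lzw page-*.tif combined.tif   # merge pages with LZW compression
tiffsplit combined.tif page_            # writes page_aaa.tif, page_aab.tif, ...
```

Note that tiffcp copies TIFF data without going through a generic decode/encode pipeline, so the inputs must already be TIFFs (unlike the convert *.jpg example in the question).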

Nicolai

Posted 2014-08-06T12:35:41.987

Reputation: 101