
I've been trying to use ImageMagick with OpenCL to speed up batch image resizing.

For this, I've started a GPU instance (g2.2xlarge) on Amazon EC2, which, according to AWS, features:

High-performance NVIDIA GPUs, each with 1,536 CUDA cores and 4GB of video memory

I've used a specific AMI for GPU instances, namely Amazon Linux AMI with NVIDIA GRID GPU Driver provided by NVIDIA.


With OpenMP

Before compiling ImageMagick from source, as a basis for comparison, I tried the ImageMagick that ships with Amazon Linux, which only supports OpenMP:

$ convert --version
Version: ImageMagick 6.7.8-9 2015-10-08 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2012 ImageMagick Studio LLC
Features: OpenMP

I resized a 50 Mpx JPEG image to 25% of its size, and timed it:

$ time convert -resize 1158x1737 01.jpg 01b.jpg

real    0m1.371s
user    0m5.388s
sys     0m0.204s

I've run it several times to ensure that the timing is consistent (in particular because ImageMagick performs a benchmark of the devices' performance on first use).
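
Repeating a timing run like this can be scripted. The following is a minimal Python sketch under stated assumptions, not a definitive benchmark: the helper names `time_command` and `summarize` are hypothetical, and the stand-in command should be replaced with the actual `convert` invocation. It reports the first run separately from the rest, since the first run may include one-time setup such as the device benchmark.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Keep the first run apart, since it may include one-time setup work."""
    rest = durations[1:] or durations
    return {
        "first": durations[0],
        "median_rest": statistics.median(rest),
        "spread_rest": max(rest) - min(rest),
    }


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; replace it with e.g.
    # ["convert", "-resize", "1158x1737", "01.jpg", "01b.jpg"]
    cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_command(cmd, runs=5)))
```

A small `spread_rest` relative to `median_rest` suggests the timing is stable enough to compare two builds.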


With OpenCL

I then downloaded the ImageMagick sources, and compiled them:

$ export C_INCLUDE_PATH=/opt/nvidia/cuda/include
$ ./configure --enable-opencl
$ make

I headed to the compiled binaries, and checked that OpenCL was now enabled:

$ ./convert --version
Version: ImageMagick 6.9.2-5 Q16 x86_64 2015-11-08 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2015 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Features: Cipher DPC OpenCL OpenMP

Then ran the benchmark:

$ time ./convert -resize 1158x1737 01.jpg 01b.jpg

real    0m2.655s
user    0m1.720s
sys     0m0.928s

Again, I ran it several times to ensure that the timing was consistent.

To my surprise, this is half the speed of the OpenMP-only version.


Trying to make sense of it

As suggested in this StackOverflow answer, I checked the ImageMagick device benchmark file:

$ cat ~/.cache/ImageMagick/ImagemagickOpenCLDeviceProfile
<version>ImageMagick Device Selection v0.9</version>
<device><type></type><name>GRID K520</name><driver>340.32</driver><max cu>8</max cu><max clock>797</max clock><score>0.2780</score></device>
<device><type></type><score>1.4140</score></device>

Note: this file is only created when I run the compiled version of ImageMagick; for some reason, it's not created when I run the version that ships with Amazon Linux.

So as I read it, there are two devices that ImageMagick can use:

  • The GPU, recognized as an NVIDIA GRID K520, with a score of 0.278
  • An unknown device (the CPU?), with a score of 1.414

So as far as I understand it, the CPU outperforms the GPU here.

OK, the CPU is not bad (E5-2670 @ 2.60GHz), but the GPU is quite a beast in its domain.


My questions

  • How can the compiled ImageMagick version be half as fast as the version that ships with Amazon Linux?
  • How can the CPU outperform the GPU in the ImageMagick benchmark?

Any hint would be welcome to regain the expected GPU performance.

BenMorel

1 Answer

  • How can the compiled ImageMagick version be half as fast as the version that ships with Amazon Linux?

With OpenCL, the initialization is not different, it is additional, so it will always take longer. We have the kernels precompiled, of course, but loading the libraries, creating the command queues, and loading the kernels all take time. It's unfortunate, but "OpenCL mode" is not well suited to this kind of one-shot command-line usage. An application or persistent server that initializes the ImageMagick library once and then makes multiple calls into it will do really well.
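
The trade-off between a one-time setup cost and a per-image saving can be sketched as a simple break-even calculation. This is an illustrative model only: the function name `break_even_images` and the numbers passed to it are hypothetical, not measurements from this thread, and the model assumes the setup cost is paid once per process and the per-image times are constant.

```python
import math


def break_even_images(init_overhead_s, cpu_per_image_s, gpu_per_image_s):
    """Smallest image count for which one setup plus GPU work beats CPU-only work.

    Returns None when the GPU path is not faster per image, since the
    one-time overhead can then never be recovered.
    """
    saving = cpu_per_image_s - gpu_per_image_s
    if saving <= 0:
        return None
    # Need init + n * gpu < n * cpu, i.e. n > init / saving
    return math.floor(init_overhead_s / saving) + 1


# Hypothetical values: 2.0 s of one-time OpenCL setup,
# 1.5 s per image on the CPU, 1.0 s per image on the GPU.
print(break_even_images(2.0, 1.5, 1.0))
```

Under this model a single `convert` call pays the full setup cost for one image, while a long-lived process spreads it over every image it handles.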

  • How can the CPU outperform the GPU in the ImageMagick benchmark?

You are reading the information wrong. A lower score means the device is faster. The GPU is about 5x faster (1.4140 / 0.2780 ≈ 5.1). The term "score" can be confusing in this situation, so we might want to rename it in a future release of ImageMagick.
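
To read the profile with "lower is faster" in mind, the scores can be pulled out and compared directly. This is a minimal sketch, assuming the file looks exactly like the one quoted in the question; it uses regular expressions rather than an XML parser because the file has no single root element and contains tags such as `<max cu>`. The helper name `parse_devices` and the `(unnamed)` label are hypothetical.

```python
import re

# Contents of the device profile file as quoted in the question.
PROFILE = """<version>ImageMagick Device Selection v0.9</version>
<device><type></type><name>GRID K520</name><driver>340.32</driver><max cu>8</max cu><max clock>797</max clock><score>0.2780</score></device>
<device><type></type><score>1.4140</score></device>
"""


def parse_devices(text):
    """Return (name, score) pairs; a device without a <name> tag gets '(unnamed)'."""
    devices = []
    for block in re.findall(r"<device>(.*?)</device>", text, flags=re.DOTALL):
        name = re.search(r"<name>(.*?)</name>", block)
        score = re.search(r"<score>([0-9.]+)</score>", block)
        if score is None:
            continue  # skip entries that carry no score
        label = name.group(1) if name else "(unnamed)"
        devices.append((label, float(score.group(1))))
    return devices


devices = parse_devices(PROFILE)
fastest = min(devices, key=lambda d: d[1])  # lower score means faster
slowest = max(devices, key=lambda d: d[1])
print(f"fastest: {fastest[0]} ({fastest[1]})")
print(f"score ratio: {slowest[1] / fastest[1]:.2f}x")
```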

dlemstra
  • Thank you very much for this answer! I indeed read the information the wrong way. So if I understand you correctly, the command-line `convert` tool will never be able to make proper use of OpenCL? I want to use ImageMagick as part of a real-time server-side image processing service. Should I give up the idea of calling the command line from the web service? – BenMorel Nov 10 '15 at 19:10
  • You are right: instead of the command line, you probably want to use one of the APIs (http://www.imagemagick.org/script/api.php) – dlemstra Nov 10 '15 at 20:22
  • Thanks, I just compiled PHP's [Imagick](http://php.net/manual/en/book.imagick.php) against my OpenCL ImageMagick above. I tried to reuse the same `Imagick` object several times, calling loops of `readImage()`, `resizeImage()` and `writeImage()` on the same object, but I can see no speed improvement whatsoever. Am I missing something? – BenMorel Nov 10 '15 at 21:34
  • I'm trying this out now and getting similar results. Even with mogrify resizing 100 images, it's about 60% slower using OpenCL. – David Stone Mar 29 '18 at 23:46