How do I save an image PDF file as an image?

33

10

I have a PDF that contains a scan image of a document. I want to save the contents of this PDF as an image so that I can then run it through an OCR program that only accepts .jpg, .png, and .gif type files.

How do I save/convert this PDF to one of those image formats?

EDIT: One way I've found to do this is to click on each page, copy it to the clipboard, paste it into Paint.net, and then save. However, this is cumbersome, as it appears you can only select one page at a time in Acrobat Reader.

Guy

Posted 2009-09-30T16:54:40.250

Reputation: 3 367

Answers

20

Please pay close attention to pooryorick's answer, in which he points out how sleske's answer is actually a much better answer for this particular problem.


Use GhostScript. This command works for me:

gs -dBATCH -dNOPAUSE -sDEVICE=png16m -dGraphicsAlphaBits=4 -dTextAlphaBits=4 -r150 -sOutputFile=output%d.png input.pdf

There are multiple png pseudo-devices, differentiating on color depth: pngmono, pnggray, png16, png256, png16m, and pngalpha. Choose whichever one suits you the best.

You can also use jpeg, but unless you have a disk space issue, you want as high a quality as you can manage for your OCR, and that's not jpeg.

GhostScript no longer has support for gif, but I can't imagine why you'd need that, what with png256 support.
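For scripting, the command above can be assembled in Python. This is only a sketch: the helper names `ghostscript_args` and `render_pdf` and their defaults are made up for illustration, and it assumes the `gs` executable is on the PATH. The flags are the ones from the command above; whether the anti-aliasing flags suit devices other than png16m is not checked here.

```python
import subprocess


def ghostscript_args(pdf_path, out_pattern="output%d.png", device="png16m", dpi=150):
    """Build the argument list for the gs command shown above.

    Hypothetical helper, not part of Ghostscript itself.
    """
    return [
        "gs",
        "-dBATCH",                  # exit after processing the file
        "-dNOPAUSE",                # do not wait for a keypress between pages
        f"-sDEVICE={device}",       # e.g. png16m, pnggray, pngmono
        "-dGraphicsAlphaBits=4",    # anti-aliasing flags copied from the answer
        "-dTextAlphaBits=4",
        f"-r{dpi}",                 # rendering resolution in dpi
        f"-sOutputFile={out_pattern}",
        pdf_path,
    ]


def render_pdf(pdf_path, **options):
    # Runs Ghostscript; raises CalledProcessError if gs reports a failure.
    subprocess.run(ghostscript_args(pdf_path, **options), check=True)
```

Calling `render_pdf("input.pdf", dpi=300)` would run the same conversion at a higher resolution, which may or may not help a given OCR program.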

wfaulk

Posted 2009-09-30T16:54:40.250

Reputation: 5 692

Will the output be one huge image? – Xonatron – 2015-07-21T20:45:44.930

@Xonatron: No. One image per page. The %d in the output file name is a placeholder that is replaced with the page number. (Almost certainly a sequential count, not the page number shown inside the PDF.) – wfaulk – 2015-07-23T16:06:06.653
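To make the naming concrete: Ghostscript expands the %d printf-style and counts output pages from 1. A small sketch (the function name `expected_outputs` is made up for illustration) that predicts the file names for a given page count:

```python
def expected_outputs(pattern, page_count):
    """List the file names a %d-style output pattern would produce.

    Hypothetical helper; pages are counted sequentially starting at 1.
    """
    return [pattern % page for page in range(1, page_count + 1)]
```

With the pattern from the answer, `expected_outputs("output%d.png", 3)` yields output1.png, output2.png, and output3.png. A zero-padded pattern such as `output%03d.png` keeps the files in page order when sorted by name.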

I love GhostScript, and if you want the convenience of a GUI for setting options, viewing, etc try GSview http://pages.cs.wisc.edu/~ghost/gsview/

– Dennis – 2009-09-30T18:37:28.263

20

Install ImageMagick. Open a cmd window or terminal:

convert myfile.pdf myfile.jpg

The output will be one JPG file for each page in your PDF: myfile-0.jpg, myfile-1.jpg, etc.

DaveParillo

Posted 2009-09-30T16:54:40.250

Reputation: 13 402

Note: on Windows, make sure you install Ghostscript 32-bit first. – User – 2014-08-13T00:07:50.027

2

Be aware of the -density, -depth, and -quality flags, which can help you optimize your output. For example: convert -density 300 -depth 8 -quality 85 a.pdf a.png

– Nick – 2016-05-21T04:47:02.717
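The flags in this comment can be wrapped in a small Python helper. This is a sketch under assumptions: the name `convert_args` and its defaults are invented here, and the executable is taken to be `convert` as in the answer (ImageMagick 7 installs it as `magick`). The settings are placed before the input file, as in the comment, so that -density applies when the PDF is rasterized.

```python
def convert_args(pdf_path, out_path, density=300, depth=8, quality=85):
    """Build an ImageMagick argument list with the flags from the comment.

    Hypothetical helper; settings come first, then input, then output.
    """
    return [
        "convert",
        "-density", str(density),   # rasterization resolution in dpi
        "-depth", str(depth),       # bits per channel
        "-quality", str(quality),   # compression quality/level setting
        pdf_path,
        out_path,
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)`.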

+1 for ImageMagick, but -2 for suggesting it for the wrong job. JPEG is good for photos, but it is the worst format to use when you have sharp edges and high contrasts (as you typically have with black text/characters on white background). Also, ImageMagick does not do the conversion work itself, it uses Ghostscript in the background as its "delegate" slave. So doing it with Ghostscript directly gives you more control over the parameters used. And then choose TIFF (not JPEG) as the output format, for chris's sake! – Kurt Pfeifle – 2011-05-28T14:49:32.853

13

There's also pdfimages from the Xpdf tools (available from the site of XpdfReader). It will not convert a whole PDF page to an image, rather it will extract embedded images from a PDF.

This is useful if the PDF contains text and images, and you want only the images. Also, it will extract the images in their original format, so no loss of quality is involved (unlike programs which render the whole page and then convert it to e.g. JPEG). Depending on your needs this might be useful.


Simple usage:

pdfimages -j -list mydocument.pdf mydocument-images

This will read the input file mydocument.pdf, extract all images and write them to individual files named mydocument-images-0000.jpg, mydocument-images-0001.jpg etc.

Option -j makes it write embedded JPEG-compressed images as JPEG files, not as PBM/PGM/PPM files (which are uncompressed and huge). Note that images may still be written as PBM/PGM/PPM files, if that's how they were stored in the PDF input file.
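Because the output format depends on how each image was stored, it can help to check what was actually written. The sketch below is an illustration only: `pdfimages_args` and `group_by_extension` are made-up names, and the command uses just the -j option described above.

```python
from collections import defaultdict
from pathlib import Path


def pdfimages_args(pdf_path, image_root):
    # -j: write JPEG-compressed images as .jpg rather than PBM/PGM/PPM.
    return ["pdfimages", "-j", pdf_path, image_root]


def group_by_extension(file_names):
    """Group extracted file names by extension (hypothetical helper)."""
    groups = defaultdict(list)
    for name in file_names:
        groups[Path(name).suffix.lower()].append(name)
    return dict(groups)
```

After running the command, something like `group_by_extension(p.name for p in Path(".").glob("mydocument-images-*"))` shows which images came out as JPEG and which as PBM/PGM/PPM.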

sleske

Posted 2009-09-30T16:54:40.250

Reputation: 19 887

For reference, simple usage is pdfimages -j "yourinputfile.pdf" "outputimages" which will make "outputimages-0000.ppm" (or "outputimages-0000.jpg" if they're the right format). .NET examples can be grafted from here or here

– drzaus – 2017-08-18T18:31:43.473

A caveat is that it might not be able to save the file as a JPG, but rather a PPM – drzaus – 2017-08-18T19:26:27.900
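If the OCR program does not accept PPM, one possible workaround is to convert those files to PNG, which is lossless, so the pixel data is preserved. The sketch below only plans the conversions; it assumes ImageMagick's `convert` (mentioned in another answer) is used, and the helper name is invented for illustration.

```python
from pathlib import Path

# Netpbm formats that pdfimages may write when an image is not JPEG-compressed.
NETPBM_SUFFIXES = {".pbm", ".pgm", ".ppm"}


def png_conversion_commands(file_names):
    """Return one `convert in out.png` command per PBM/PGM/PPM file.

    Hypothetical helper; files in other formats are left alone.
    """
    commands = []
    for name in file_names:
        path = Path(name)
        if path.suffix.lower() in NETPBM_SUFFIXES:
            commands.append(["convert", name, str(path.with_suffix(".png"))])
    return commands
```

Each returned list can be run with `subprocess.run(command, check=True)`.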

11

You can do this using Adobe Reader:

  1. Click the image. It will be highlighted.
  2. Copy (Ctrl-C) and paste it into Paint.
  3. Save as any file type you like.

Hemant

Posted 2009-09-30T16:54:40.250

Reputation: 1 390

Interesting to know: Adobe Reader has a setting to override the dpi of images taken with the snapshot tool. When set to 300 dpi, you'll get snapshots that are ready for print (by default the screen resolution is used, which is generally too low for reuse in other work). – Stijn Sanders – 2009-09-30T17:49:19.827

+1 for simplicity. Most PDF readers allow you to do this. – Decio Lira – 2009-09-30T17:49:47.360

What if your PDF has 10000 pages of images? Do you have to do this 10000 times? – Guy – 2009-10-01T04:51:34.390

9

Except for the answer mentioning pdfimages, none of the other answers mention that their solutions actually transcode the embedded images. That is, those solutions do not simply extract the original image but modify it during the process, possibly to its detriment. Only pdfimages extracts the original image. The transcoding caveat applies to Ghostscript, ImageMagick, Adobe Reader, PDFill, PDF Xchange Viewer, OS X Preview, and most other PDF software.

pooryorick

Posted 2009-09-30T16:54:40.250

Reputation: 191

Given the context of the question, this is actually a very good point. – wfaulk – 2015-07-23T16:09:06.997

FWIW, "PDFill PDF Tools" does allow you to set the DPI for the save-as-image, very handy. Thus each page (starting from text, images, whatever objects) gets saved, for example, to a high-res PNG at 4961x6520. – Chris O – 2015-10-03T15:59:11.323

4

PDFill PDF Tools is probably the easiest way to convert your PDFs to images on Windows. It'll let you export all the pages in the PDF to separate images in one shot. It also has a lot of other features available for free, which are only available in other PDF viewers if you purchase the commercial or "Pro" version.

Use the "Convert PDF to Images" button (button #10) in the screenshot below.

[Screenshot: PDFill PDF Tools]

If you need to concatenate the images into one very tall image so you only have to feed one file to your OCR program, you can use IrfanView.
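IrfanView is a GUI tool. If a scripted route is preferred, a different tool can do the same job: ImageMagick (mentioned in another answer) has an -append operator that stacks images vertically. The sketch below is an assumption-laden illustration, not the method from this answer; the helper names are invented, and the pages are sorted by the number in the file name so that page 10 does not land before page 2.

```python
import re


def page_number(name):
    # Extract the trailing number before the extension, e.g. "output10.png" -> 10.
    match = re.search(r"(\d+)(?=\.[^.]+$)", name)
    return int(match.group(1)) if match else 0


def append_args(page_images, out_path):
    """Build an ImageMagick command that stacks pages top to bottom.

    Hypothetical helper; -append joins the listed images vertically.
    """
    if not page_images:
        raise ValueError("no page images given")
    ordered = sorted(page_images, key=page_number)
    return ["convert", *ordered, "-append", out_path]
```

The resulting list can be run with `subprocess.run(..., check=True)`.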

rob

Posted 2009-09-30T16:54:40.250

Reputation: 13 188

Note that this will install two different tools on your system, the main one being PDFill Editor, which is the one you don't need. Go into the Start menu to open this one. I was saved by the screenshot, which made me realize that something was wrong before I uninstalled. – ufotds – 2011-05-07T19:59:04.783

Yes, I guess I failed to mention that it also installs a shareware version of PDFill Editor, as well as a PDF printer. Any files created with PDFill Editor will have a watermark unless you buy the editor for $19.99, but the PDFill PDF Tools Free utility doesn't require any purchase. In the version I have, you can't uninstall PDFill Editor without also uninstalling PDFill PDF Tools Free, but having PDFill Editor installed doesn't harm anything. – rob – 2011-05-09T18:08:52.833

2

Since you didn't include an OS tag, I'll include an OS X answer:

PDFs open in Preview.app by default, and its File -> Save As command can export to any of these formats:

  • GIF
  • ICNS
  • JPEG
  • JPEG-2000
  • BMP
  • OpenEXR
  • Photoshop
  • PNG
  • TGA
  • TIFF

Lake

Posted 2009-09-30T16:54:40.250

Reputation: 391

1

(Non-free) Acrobat professional does this:

Advanced->Document Processing->Export all images...

ufotds

Posted 2009-09-30T16:54:40.250

Reputation: 581

1

Also PDF Xchange Viewer (Free) will do export-to-file. File → Export → Export to image.

Not only that, but I think it's the best free PDF viewer for Windows, and it has some nice markup capabilities. I have a license for Adobe Acrobat and I still prefer this unless I'm doing extensive editing, which is rarely.

wfaulk

Posted 2009-09-30T16:54:40.250

Reputation: 5 692

This looked promising, until I discovered that the option to export to image is disabled for password-secured PDFs. – Mitch – 2016-09-30T07:19:51.183

0

If the file is less than 5MB and you aren't worried about privacy/confidentiality, then there is a handy online service at http://www.go2convert.com/ that can do a lot of graphic conversions (including PDF to JPEG).

sgmoore

Posted 2009-09-30T16:54:40.250

Reputation: 5 961

Just tried and it gave this error message "Sorry! This image could not be converted correctly." – Guy – 2009-10-01T04:54:43.337

-1

If the image exceeds the size of your screen, you may use FastStone Capture (the "Capture Scrolling Window" feature) and save the image as a JPEG.

Molly7244

Posted 2009-09-30T16:54:40.250

Reputation:

That's a very roundabout way of grabbing an image. OP already has a better solution (mark page in Acrobat). – sleske – 2016-02-23T08:08:16.547

-1

You can check out this article.

It lists out 6 different ways to convert the pdf into images.

Convert PDF to JPG (The Web Way)

PDF to JPG Converters for The Desktop

noob

Posted 2009-09-30T16:54:40.250

Reputation: 759

erm.. Why downvoted? – noob – 2013-04-16T06:10:00.233