PDF, PS and DjVu

This article covers software to view, edit and convert PDF, PostScript (PS), DjVu (déjà vu) and XPS files.

Engines

Poppler — PDF rendering library based on Xpdf. For CJK (Chinese, Japanese, Korean) support with Poppler, install poppler-data.

https://poppler.freedesktop.org/ || poppler

MuPDF — Lightweight PDF, XPS, and EPUB viewer, consisting of a software library, command line tools, and viewers.

https://mupdf.com/ || libmupdf

libspectre — Small library for rendering PostScript documents.

https://www.freedesktop.org/wiki/Software/libspectre || libspectre

Ghostscript — Interpreter for PostScript and PDF. Provides the gs(1) command-line interface (see also /usr/share/doc/ghostscript/*/Use.htm), along with many wrapper scripts like ps2pdf and pdf2ps.

https://ghostscript.com/ || ghostscript

DjVuLibre — Suite to create, manipulate and view DjVu documents.

https://djvu.sourceforge.net/ || djvulibre

libgxps — GObject based library for handling and rendering XPS documents.

https://wiki.gnome.org/Projects/libgxps || libgxps

Viewers

Framebuffer

fbgs — Poor man's PostScript/PDF viewer for the Linux framebuffer console.

https://www.kraxel.org/blog/linux/fbida/ || fbida

fbpdf — Small framebuffer PDF and DjVu viewer based on MuPDF, with Vim keybindings, written in C.

https://repo.or.cz/w/fbpdf.git || fbpdf-git^AUR

jfbview — Framebuffer PDF and image viewer. Features include Vim-like controls, zoom-to-fit, a TOC (outline) view and fast multi-threaded rendering.

https://github.com/jichu4n/jfbview || jfbview^AUR

Graphical

Note: Some web browsers can display PDF files, for example with PDF.js.

Atril — Simple multi-page document viewer for MATE. Supports DjVu, DVI, EPS, EPUB, PDF, PostScript, TIFF, XPS and Comicbook.

https://github.com/mate-desktop/atril || atril

DjView — Viewer for DjVu documents.

https://djvu.sourceforge.net/djview4.html || djview

Foxit Reader — Small, fast (compared to Acrobat) proprietary PDF viewer. Releases (outside of security updates) are discontinued for Linux (November 2020).

https://www.foxitsoftware.com/pdf-reader/ || foxitreader^AUR

Sioyek — Lightweight PDF viewer based on MuPDF with features designed for viewing research papers and technical books.

https://sioyek.info/ || sioyek^AUR

Comparison

Name	PDF	PostScript	DjVu	XPS	PDF forms	PDF Annotation	Non-rectangle selection	License
Adobe Reader	Custom				Yes		Yes
apvlv	Poppler		DjVuLibre				(not by default, at least)
Atril	Poppler	libspectre	DjVuLibre	libgxps	Yes
DjView			DjVuLibre
Emacs	Ghostscript*		DjVuLibre*			Yes	Yes
Emacs pdf-tools	Poppler					Yes	Yes
ePDFView	Poppler
Evince	Poppler	libspectre	DjVuLibre	libgxps	Yes	Yes	Yes
Foxit Reader	Custom				Yes	Yes	Yes
gv	Ghostscript
llpp	libmupdf			libmupdf	Yes
MuPDF	Custom			Custom	Yes (mupdf-gl)	Yes (mupdf-gl)	Yes (mupdf-gl)
Okular	Poppler	libspectre	DjVuLibre	Custom	Yes	Yes	Yes
pdfpc	Poppler
qpdfview	Poppler	libspectre*	DjVuLibre*		Yes	Yes
Xpdf	Custom
Xreader	Poppler	libspectre*	DjVuLibre*	libgxps*	Yes	Yes	Yes
Zathura	Poppler* / libmupdf*	libspectre*	DjVuLibre*	libmupdf*				zlib

* Optional dependency needs to be installed

PDF forms

The PDF forms column in the above table refers to AcroForms support. If you do not need your input to be directly extractable from the PDF, you can also use the applications in #Annotation or #Graphical PDF editing to put text on top of a PDF. PDF forms can be created with LibreOffice Writer (View > Toolbars > Form Controls) and the advanced PDF editors.

The proprietary and deprecated XFA format for forms is not fully supported by Poppler; it is only supported by Adobe Reader and Master PDF Editor.

Alternatively, web browsers such as Firefox or Chromium feature a built-in PDF viewer capable of filling out forms.

Annotation

Graphical PDF editing

Scribus can import and export PDF; text is imported as polygons.
LibreOffice Draw can import and export PDF; text is imported as text; embedded fonts are substituted.
Inkscape can import a single page from a PDF and export to PDF; text is imported as cloned glyphs or text; with the latter embedded fonts are substituted.
Graphics editors like GIMP can also import and export PDFs at the cost of rasterization.

Basic editors

PDF Arranger — Helps merge or split PDF documents and rotate, crop and rearrange pages. It is a maintained fork of PDF-Shuffler.

https://github.com/jeromerobert/pdfarranger || pdfarranger

PDF Tricks — Simple, efficient application for small manipulations in PDF files using Ghostscript.

https://github.com/muriloventuroso/pdftricks || pdftricks

Cropping tools

PdfHandoutCrop — Tool to crop PDF handouts with multiple pages per sheet.

https://cges30901.github.io/pdfhandoutcrop/ || pdfhandoutcrop^AUR

Advanced editors

PDF tools

Create a PDF from images

With GraphicsMagick:

$ gm convert 1.jpg 2.jpg 3.jpg out.pdf
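When the page images are numbered without zero padding, a plain glob places 10.jpg before 2.jpg. The following is a minimal sketch of one way around this; images_to_pdf is a hypothetical helper name, and it assumes GNU sort (for -V) and file names without newlines:

```shell
# images_to_pdf DIR OUT.pdf: hypothetical helper, not part of GraphicsMagick.
# Assumes GNU sort (-V) and file names that contain no newlines.
images_to_pdf() {
    dir=$1 out=$2
    set -- "$dir"/*.jpg
    # An unmatched glob stays literal, so check that the first entry exists.
    [ -e "$1" ] || { echo "no JPEG files in $dir" >&2; return 1; }
    # Sort in natural (version) order so that 2.jpg comes before 10.jpg,
    # then rebuild the argument list and hand it to gm convert.
    printf '%s\n' "$@" | sort -V | {
        set --
        while IFS= read -r f; do
            set -- "$@" "$f"
        done
        gm convert "$@" "$out"
    }
}
```

For example, images_to_pdf ./scans out.pdf would combine every JPEG in ./scans into out.pdf in numeric order.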

Concatenate PDFs

With Ghostscript:

$ gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=out.pdf -dBATCH 1.pdf 2.pdf 3.pdf

With PDFtk:

$ pdftk 1.pdf 2.pdf 3.pdf cat output out.pdf

With Poppler:

$ pdfunite 1.pdf 2.pdf 3.pdf out.pdf

With QPDF:

$ qpdf --empty --pages 1.pdf 2.pdf 3.pdf -- out.pdf

Convert a PDF to text

With Poppler and maintaining the layout:

$ pdftotext -layout in.pdf out.txt
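To convert a whole directory, the same command can be wrapped in a loop. This is only a sketch; pdfs_to_text is a hypothetical helper name, and the output files are assumed to be wanted next to their sources:

```shell
# pdfs_to_text DIR: hypothetical helper that converts every PDF in DIR
# to a .txt file next to it, keeping the layout.
pdfs_to_text() {
    dir=$1
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue      # the glob matched nothing
        # Strip the .pdf suffix and append .txt for the output name.
        pdftotext -layout "$pdf" "${pdf%.pdf}.txt" || return 1
    done
}
```

The quoting keeps file names with spaces intact.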

Decrypt a PDF

This section lists commands to decrypt a PDF to an unencrypted file. Note that most PDF viewers also support encrypted PDFs.

With PDFtk:

$ pdftk in.pdf input_pw password output out.pdf

With Poppler to PostScript:

$ pdftops -upw password in.pdf out.ps

With QPDF:

$ qpdf --decrypt --password=password in.pdf out.pdf

Encrypt a PDF

The user password is used for encryption, while the owner password restricts operations once the document is decrypted. For more information, see Wikipedia:PDF#Encryption and signatures.

With PDFtk:

$ pdftk in.pdf output out.pdf user_pw password

With PoDoFo:

$ podofoencrypt -u user_password -o owner_password in.pdf out.pdf

With QPDF:

$ qpdf --encrypt user_password owner_password key_length -- in.pdf out.pdf

where key_length can be 40, 128 or 256.
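The QPDF call can be wrapped so that an invalid key length is rejected before qpdf runs. This is a sketch only; encrypt_pdf is a hypothetical name, and the default of 256 is an assumption made for the example, not a QPDF default. Note that passwords given on the command line are visible to other local users in the process list.

```shell
# encrypt_pdf IN OUT USER_PW OWNER_PW [KEY_LENGTH]: hypothetical wrapper
# around the qpdf --encrypt call. KEY_LENGTH defaults to 256 (an assumption
# of this example).
encrypt_pdf() {
    src=$1 dst=$2 user_pw=$3 owner_pw=$4 bits=${5:-256}
    # Only the three key lengths listed in this section are accepted.
    case $bits in
        40|128|256) ;;
        *) echo "key length must be 40, 128 or 256" >&2; return 2 ;;
    esac
    qpdf --encrypt "$user_pw" "$owner_pw" "$bits" -- "$src" "$dst"
}
```

For example, encrypt_pdf in.pdf out.pdf user_password owner_password uses a 256-bit key.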

Extract images from a PDF

With Poppler to JPEG:

$ pdfimages infile.pdf -j outfileroot

Extract page range from PDF, split multipage PDF document

With Ghostscript as a single file:

$ gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=first -dLastPage=last -sOutputFile=outfile.pdf infile.pdf

With PDFtk as a single file:

$ pdftk infile.pdf cat first-last output outfile.pdf

With Poppler as separate files:

$ pdfseparate -f first -l last infile.pdf outfileroot-%d.pdf

With QPDF as a single file:

$ qpdf --empty --pages infile.pdf first-last -- outfile.pdf

With mutool as a single file:

$ mutool clean -g infile.pdf outfile.pdf first-last
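To split a long document into chunks of a fixed size, the first-last ranges can be computed with shell arithmetic and passed to the QPDF command above. The helper names page_ranges and split_pdf and the part-<range>.pdf output names are hypothetical; the total page count is assumed to be known already:

```shell
# page_ranges TOTAL CHUNK: print "first-last" ranges covering pages
# 1..TOTAL in steps of CHUNK pages. Pure arithmetic, no PDF tool involved.
page_ranges() {
    total=$1 chunk=$2 first=1
    [ "$chunk" -ge 1 ] || { echo "chunk size must be at least 1" >&2; return 2; }
    while [ "$first" -le "$total" ]; do
        last=$((first + chunk - 1))
        # The final chunk may be shorter than the others.
        [ "$last" -gt "$total" ] && last=$total
        echo "$first-$last"
        first=$((last + 1))
    done
}

# split_pdf IN.pdf TOTAL CHUNK: write part-<range>.pdf for each range.
split_pdf() {
    src=$1
    page_ranges "$2" "$3" | while IFS= read -r range; do
        qpdf --empty --pages "$src" "$range" -- "part-$range.pdf" || exit 1
    done
}
```

For example, page_ranges 25 10 prints 1-10, 11-20 and 21-25 on separate lines.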

Imposing a PDF

PDF imposition (e.g. combining multiple pages onto one page) can be done with pdfjam. For example, pdfnup can reduce paper waste, and pdfbook can arrange PDFs into a format suitable for book binding.

Inspecting metadata

With ExifTool:

$ exiftool file.pdf

With Poppler:

$ pdfinfo file.pdf
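A single field can be picked out of the pdfinfo output for use in scripts. The following sketch extracts the page count; pdf_pages is a hypothetical helper name, and it assumes the usual "Pages:" line in the output:

```shell
# pdf_pages FILE: print the page count reported by Poppler's pdfinfo.
# Relies on the "Pages:" line of the pdfinfo output.
pdf_pages() {
    pdfinfo "$1" | awk '/^Pages:/ { print $2 }'
}
```

For example, pdf_pages file.pdf prints only the number of pages.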

Optimize, reduce size of a PDF

With Ghostscript one of:

$ ps2pdf -dPDFSETTINGS=/screen in.pdf out.pdf
$ gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/printer -sOutputFile=out.pdf in.pdf

For different settings see the documentation.

There is also a script wrapping gs.
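To compare presets, the gs call can be wrapped so that it reports the file size before and after. This is a sketch under assumptions: shrink_pdf is a hypothetical name, and the default preset ebook is chosen only for the example.

```shell
# shrink_pdf IN OUT [PRESET]: hypothetical wrapper around the gs call.
# PRESET is one of the Ghostscript PDFSETTINGS names; "ebook" is an
# arbitrary default chosen for this example.
shrink_pdf() {
    src=$1 dst=$2 preset=${3:-ebook}
    case $preset in
        screen|ebook|printer|prepress|default) ;;
        *) echo "unknown preset: $preset" >&2; return 2 ;;
    esac
    gs -dNOPAUSE -dBATCH -dQUIET -sDEVICE=pdfwrite \
       -dPDFSETTINGS="/$preset" -sOutputFile="$dst" "$src" || return 1
    # Report both sizes in bytes so that the saving can be judged.
    echo "$(wc -c < "$src") -> $(wc -c < "$dst") bytes"
}
```

For example, shrink_pdf in.pdf out.pdf screen writes out.pdf and prints the two sizes.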

Rasterize a PDF

With GraphicsMagick to convert a specific page:

$ gm convert -density dpi infile.pdf[page] outfile.jpg

With Poppler to convert all pages:

$ pdftoppm -jpeg -r dpi infile.pdf outfileroot

With Poppler to convert a specific page:

$ pdftoppm -jpeg -r dpi -f page -singlefile infile.pdf outfileroot
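A page range can be rasterized by combining -f with -l. The following sketch writes PNG files instead of JPEG; rasterize_range is a hypothetical name, and both the default of 150 DPI and the output root (the input name without .pdf) are assumptions of the example:

```shell
# rasterize_range IN.pdf FIRST LAST [DPI]: hypothetical wrapper that
# renders pages FIRST..LAST to PNG files with Poppler's pdftoppm.
# DPI defaults to 150 (an assumption of this example).
rasterize_range() {
    src=$1 first=$2 last=$3 dpi=${4:-150}
    [ "$first" -le "$last" ] || { echo "first page must not exceed last" >&2; return 2; }
    # The output root is the input name with the .pdf suffix removed.
    pdftoppm -png -r "$dpi" -f "$first" -l "$last" "$src" "${src%.pdf}"
}
```

For example, rasterize_range infile.pdf 2 4 300 renders pages 2 to 4 at 300 DPI.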

Splitting PDF pages

With mupdf-tools to split every page vertically into two pages:

$ mutool poster -y 2 in.pdf out.pdf

This can be used to undo simple imposition.

Add signature.png or image to one of the pages in the PDF

Adding an image to any location in a PDF can be done:

with ImageMagick (convert), xv^AUR and . (Wrapper script)
with
with LibreOffice

Details on these and other solutions can be found on StackExchange.

Add digital signature to PDF

can digitally sign PDF files with X.509 certificates in GUI and CLI.

Readers such as Okular and MuPDF can sign PDFs with digital signatures. This requires a PFX certificate, which can be created with the following OpenSSL commands:

$ openssl req -x509 -days 365 -newkey rsa:2048 -keyout cert.pem -out cert.pem
$ openssl pkcs12 -export -in cert.pem -out cert.pfx
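The commands above prompt for the certificate fields and passphrases. A non-interactive variant might look like the sketch below. The helper name make_signing_cert and the separate key.pem file are assumptions of this example, and -nodes leaves the private key unencrypted on disk, which may not be acceptable for a real signing key.

```shell
# make_signing_cert NAME PASSWORD: hypothetical helper that creates a
# self-signed certificate and bundles it as cert.pfx without prompting.
make_signing_cert() {
    name=$1 password=$2
    # -subj supplies the subject on the command line instead of prompting;
    # -nodes stores the private key unencrypted in key.pem.
    openssl req -x509 -days 365 -newkey rsa:2048 -nodes \
        -subj "/CN=$name" -keyout key.pem -out cert.pem || return 1
    # Bundle key and certificate into a password-protected PKCS#12 file.
    openssl pkcs12 -export -inkey key.pem -in cert.pem \
        -out cert.pfx -passout "pass:$password"
}
```

For example, make_signing_cert "Jane Doe" secret writes key.pem, cert.pem and cert.pfx to the current directory.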

MuPDF users can then sign PDFs using the graphical interface or its mutool-sign tool.

Okular users must import the certificate into a certificate store such as the one in the default Firefox profile. With Firefox this is done through Settings > Privacy & Security > View Certificates > Your Certificates > Import and selecting cert.pfx. Afterwards Okular will offer this certificate to be used when signing PDFs.

LibreOffice can also sign PDFs.

Removing annotations from a PDF

With perl-cam-pdf^AUR:

$ rewritepdf.pl -C in.pdf out.pdf

See https://superuser.com/a/1051543 for more information.

DjVu tools

DjVuLibre provides many command-line tools.

Convert DjVu to images

Break a DjVu document into separate pages:

$ djvmcvt -i input.djvu /path/to/out/dir output-index.djvu

Convert DjVu pages into images:

$ ddjvu --format=tiff page.djvu page.tiff

Convert DjVu pages into PDF:

$ ddjvu --format=pdf inputfile.djvu outputfile.pdf

You can also use --page to export specific pages:

$ ddjvu --format=tiff --page=1-10 input.djvu output.tiff

This converts pages 1 to 10 into a single TIFF file.
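To get one image file per page instead, ddjvu can be called once for each page. The sketch below takes the page count from djvused (also part of DjVuLibre) using its n command; the helper name djvu_pages_to_tiff and the PREFIX-N.tiff naming are hypothetical:

```shell
# djvu_pages_to_tiff IN.djvu PREFIX: hypothetical helper that writes
# PREFIX-1.tiff, PREFIX-2.tiff, ... with one page per file.
djvu_pages_to_tiff() {
    src=$1 prefix=$2
    # "djvused -e n" prints the number of pages in the document.
    count=$(djvused -e n "$src") || return 1
    page=1
    while [ "$page" -le "$count" ]; do
        ddjvu --format=tiff --page="$page" "$src" "$prefix-$page.tiff" || return 1
        page=$((page + 1))
    done
}
```

For example, djvu_pages_to_tiff input.djvu page writes page-1.tiff, page-2.tiff and so on.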

Processing images

You can use to:

fix orientation
split pages
deskew
crop
adjust margins

Make DjVu from images

There is a useful script, img2djvu:

$ img2djvu -c1 -d600 -v1 ./out

This creates a 600 DPI out.djvu from all files in the directory.

Alternatively, you can try , which seems to create smaller files especially on images with well defined background.

PostScript tools

Ghostscript

ps2pdf

ps2pdf is a wrapper around Ghostscript to convert PostScript to PDF:

$ ps2pdf -sPAPERSIZE=a4 -dOptimize=true -dEmbedAllFonts=true YourPSFile.ps

Explanation:

-sPAPERSIZE defines the paper size. For valid PAPERSIZE values, see the Ghostscript documentation.
-dOptimize=true lets the created PDF be optimised for loading.
-dEmbedAllFonts=true embeds all fonts, so that they always render properly.

Note: You cannot choose the paper orientation in ps2pdf. If your input PS file is well-formed, it already contains the orientation information. Encapsulated PS files usually do not contain paper orientation information, so an EPS file that does not fit the -sPAPERSIZE you specified will cause problems. A workaround is to create a new paper size in the Ghostscript settings (call it e.g. "slide") and use it as -sPAPERSIZE=slide.
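Several PostScript files can be converted in one go with a loop. This is a minimal sketch; ps_dir_to_pdf is a hypothetical helper name, and the default paper size a4 is an assumption taken from the example above:

```shell
# ps_dir_to_pdf DIR [PAPERSIZE]: hypothetical helper that converts every
# .ps file in DIR to a .pdf next to it. PAPERSIZE defaults to a4.
ps_dir_to_pdf() {
    dir=$1 paper=${2:-a4}
    for ps in "$dir"/*.ps; do
        [ -e "$ps" ] || continue       # the glob matched nothing
        # Strip the .ps suffix and append .pdf for the output name.
        ps2pdf -sPAPERSIZE="$paper" "$ps" "${ps%.ps}.pdf" || return 1
    done
}
```

For example, ps_dir_to_pdf ./docs letter converts every PostScript file in ./docs using the letter paper size.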

Libraries

Python

PyPDF3 — A pure-Python library built as a PDF toolkit.

https://github.com/sfneal/PyPDF3 || python-pypdf3^AUR
