PDF to text converter

9

5

I'm looking for a "one-click" way of taking ANY PDF and converting it to plain text. Ideally on OSX or Linux.

Ideally, the solution would include OCR functionality, but that is not a requirement.

The top priority is having something that can take ANY file WITHOUT configuration.

themirror

Posted 2011-05-22T06:55:29.363

Reputation: 1 426

Question was closed 2015-01-19T15:19:25.880

Answers

23

There's xpdf, which includes the pdftotext binary.

Pdftotext converts Portable Document Format (PDF) files to plain text.

On Linux there's an installer available. pdftotext also seems to come in the poppler-utils package. On OS X you could install it using Homebrew (install that first) and then run:

brew install homebrew/x11/xpdf

which will download the source files and compile them for OS X. After that, just use it like this:

pdftotext your_pdf_file.pdf

which will generate a plain text file (your_pdf_file.txt) next to the PDF. There are a couple of options as well; check out man pdftotext for more details.
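For more than one file, the single command above can be wrapped in a loop. The following is only a rough sketch, assuming pdftotext is already installed and on your PATH. The function name convert_all and the PDFTOTEXT variable are made up for illustration and are not part of xpdf or poppler; -layout is one of the options described in man pdftotext.

```shell
#!/bin/sh
# Sketch: convert every *.pdf in a directory to a .txt file next to it.
# PDFTOTEXT is a hypothetical override hook (an assumption, not a pdftotext
# feature); it defaults to the pdftotext binary found on PATH.
PDFTOTEXT="${PDFTOTEXT:-pdftotext}"

convert_all() {
    dir="${1:-.}"                      # default to the current directory
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue      # glob matched nothing
        txt="${pdf%.pdf}.txt"          # report.pdf -> report.txt
        # -layout asks pdftotext to keep the original physical layout
        if "$PDFTOTEXT" -layout "$pdf" "$txt"; then
            echo "converted: $pdf"
        else
            echo "failed: $pdf" >&2
        fi
    done
}
```

After sourcing the file, something like convert_all ~/Documents/papers would process that folder. Scanned PDFs that contain only images will still produce empty text files, since pdftotext does no OCR.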

An alternative is poppler. On OS X:

brew install poppler

On Debian and friends:

apt-get install poppler-utils

slhck

Posted 2011-05-22T06:55:29.363

Reputation: 182 472

as of today the command is brew install homebrew/x11/xpdf – Diego Vieira – 2016-06-21T22:10:49.827

1 @DiegoVieira Thanks. Next time feel free to suggest an edit! – slhck – 2016-06-22T11:04:40.190

Is there any advantage to using poppler instead of xpdf/pdftotext? – Gonzalo Bahamondez – 2016-06-23T02:40:21.410

brew install Caskroom/cask/pdftotext – Hugo – 2016-12-12T19:53:44.893

0

A nice tool for Windows is A-PDF Text Extractor.

Michael S.

Posted 2011-05-22T06:55:29.363

Reputation: 3 128