How to search in PDFs using regular expressions?

15

6

Usually I use Notepad++ to search in files using regular expressions. Today I am wondering whether there is a PDF program that does the same for PDFs. Of course, I could convert the PDF to text and use Notepad++, but is there an easier way that does not require converting?

Michael S.

Posted 2012-03-15T05:35:02.513

Reputation: 3 128

1What OS are you using? – Scott McClenning – 2012-03-15T05:40:46.930

Windows Developer Preview and Windows 7 – Michael S. – 2012-03-15T05:41:57.203

Answers

9

several options:

akira

Posted 2012-03-15T05:35:02.513

Reputation: 52 754

1@akira What about in Linux? – Nikhil – 2018-08-07T01:25:56.493

4

  1. Agent Ransack is free (it is the lite edition of FileLocator) and supports PDF, as its release notes confirm.
  2. PowerGREP is a commercial product.

As you said, the obvious alternative is to convert the PDFs to text. A programmer could set that up for bulk processing with the Python package PDFMiner. Agent Ransack itself uses "pdftotext" from the Xpdf project, and you can call it directly too.
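For the convert-then-search route, here is a minimal Python sketch, not a tested tool. It assumes the "pdftotext" executable (from Xpdf or Poppler) is on your PATH; the function names pdf_to_text and grep_text and the script name are made up for illustration. It extracts the text of each PDF and prints every line that matches a regular expression.

```python
import re
import subprocess
import sys


def pdf_to_text(pdf_path):
    """Extract text by running pdftotext (assumed to be on PATH).

    The trailing "-" asks pdftotext to write to stdout instead of a file.
    """
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", "-layout", pdf_path, "-"],
        capture_output=True,
        check=True,  # raise CalledProcessError if pdftotext fails
    )
    return result.stdout.decode("utf-8", errors="replace")


def grep_text(text, pattern, flags=0):
    """Return (line_number, line) pairs for every line matching the regex."""
    regex = re.compile(pattern, flags)
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if regex.search(line)
    ]


if __name__ == "__main__":
    # Usage (hypothetical script name):
    #   python pdfgrep_sketch.py PATTERN FILE.pdf [FILE.pdf ...]
    pattern, pdf_paths = sys.argv[1], sys.argv[2:]
    for pdf_path in pdf_paths:
        for number, line in grep_text(pdf_to_text(pdf_path), pattern):
            print(f"{pdf_path}:{number}: {line.strip()}")
```

Line numbers refer to the extracted text, not to PDF pages. Because matching is done line by line, a pattern that spans a line break in the extracted text will not be found; searching the whole string with re.finditer would be one way around that.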

minopret

Posted 2012-03-15T05:35:02.513

Reputation: 535

sidenote: Agent Ransack is the lite version of FileLocator – akira – 2012-03-15T06:28:41.483

Thanks! I looked more closely. The vendor's release notes confirm that File Locator Lite aka Agent Ransack does support PDF. Editing my answer. – minopret – 2012-03-15T06:41:30.817

Agent Ransack does the job. You might also want to try dnGrep. – Michael S. – 2012-03-15T08:08:43.417