Can text be extracted from a PDF with an “Invalid XRef entry” error?

I have a PDF that I’m trying to read, but it won’t open in Adobe Reader. When I ran pdftotext on it, it reported “Invalid XRef entry.” Neither PDFtk nor Ghostscript has been able to parse the file. I tried to repair it manually, but quickly realized that it was way over my head.

Is there any way to recover text from the file? I can see a lot of the image resources, but I can’t find any of the text in readable form. Does anyone know if it can be recovered?

KnightOfNi

Posted 2015-10-01T01:44:14.977

Can we see the PDF file? – Edi – 2015-10-01T07:55:47.910

One of the most lenient readers when it comes to broken PDFs is, IMO, the Chrome browser's default PDF reader (based on PDFium). You could give that a try and see if it renders your file. – Edi – 2015-10-01T07:57:45.527

@Edi It just says "failed to load pdf document." That was a good thought, though. – KnightOfNi – 2015-10-01T21:39:28.147

Answers
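
Since the error is about the cross-reference table, one thing that may be worth trying is to ignore that table entirely and scan the raw bytes for stream objects. The following is only a rough sketch under several assumptions: that the page content streams use the common FlateDecode (zlib) filter, that the file is not encrypted, and that the text is stored as literal strings for the Tj/TJ operators. The function names (inflate_streams, show_text_strings, recover_text) are made up for illustration and do not come from any library.

```python
import re
import zlib

# A stream body sits between "stream<EOL>" and "endstream". The lookbehind
# keeps the keyword "endstream" from being mistaken for a stream start.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)

# Simplified patterns for the text-showing operators:
#   (string) Tj      and      [(str) -120 (str)] TJ
# Nested parentheses and "]" inside strings are not handled.
TEXT_OP_RE = re.compile(
    rb"\[(?:\\.|[^\]\\])*\]\s*TJ|\((?:\\.|[^\\()])*\)\s*Tj", re.DOTALL
)
STRING_RE = re.compile(rb"\((?:\\.|[^\\()])*\)", re.DOTALL)


def inflate_streams(data: bytes):
    """Yield every stream body that zlib can decompress (xref is ignored)."""
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates trailing bytes such as the final EOL.
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not zlib data (for example a JPEG image), skip it


def show_text_strings(content: bytes):
    """Return the literal strings passed to Tj / TJ in one content stream."""
    strings = []
    for op in TEXT_OP_RE.finditer(content):
        for raw in STRING_RE.findall(op.group(0)):
            # Strip the outer parentheses and undo \( \) \\ escapes only.
            text = re.sub(rb"\\([()\\])", rb"\1", raw[1:-1])
            strings.append(text.decode("latin-1"))
    return strings


def recover_text(data: bytes):
    """Collect text strings from all decompressible streams in the file."""
    recovered = []
    for content in inflate_streams(data):
        recovered.extend(show_text_strings(content))
    return recovered


# Hypothetical usage ("broken.pdf" is a placeholder file name):
#   with open("broken.pdf", "rb") as fh:
#       for line in recover_text(fh.read()):
#           print(line)
```

If this prints nothing or only gibberish, that does not necessarily mean the text is gone. Fonts with custom encodings or CID fonts store glyph codes (often as hex strings) rather than readable characters, and streams may use other filters or be packed inside object streams, none of which this sketch handles.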