How can I open a massive PDF?

4

I created a massive (7GB) PDF in R, and now every computer/program combination I try to open it with craps out. I'm not even sure why, as the entire file fits easily in RAM on more than one of the machines I used.

I've tried Adobe Reader on Windows and OS X, and also QuickLook and Preview on OS X; this was on machines with up to 16GB RAM apiece, but every time the OS or application just crashes.

I can pretty easily get ahold of a Windows 8, Ubuntu (any version), or OS X 10.8-10.9 machine as need be.

I'm fairly confident that the file is fine: I've created other files the same way with smaller datasets, and those files opened fine.

Unfortunately I don't think I can split the file as it's just one big plot, so there's only one page in the PDF and I don't know of any way to split it without opening it. And generating many PDFs from smaller chunks of the input data isn't an option for me as I was supposed to be finding a program that would plot arbitrarily large datasets. Well, R succeeded in plotting the data (the plot is viewable while the R session is still running even), but the saved version of the output is pretty much unusable.

EDIT: So the SVG wound up being only 260MB. I'm guessing that R makes really inefficient PDFs, which just isn't a problem with smaller datasets. The SVG is a little slow to open, but it does open and that's all that I needed. Thanks, everyone.

If the first person to suggest an SVG wants to submit that as an answer, I'll accept it.

Camden Narzt

Posted 2014-03-02T22:59:05.543

Reputation: 639

Since you created the file, I'll assume you have the original content. Two techniques to make usable file sizes: If the size is due to embedded pictures, use higher JPEG compression to reduce their size or reduce their resolution. If it is simply a massive amount of text, split it into separate files. – fixer1234 – 2014-10-16T14:42:03.690

It may not be the size; it could be that you're just writing a junk file. Try a smaller size to see whether it gets better when smaller or is still unopenable, to be sure you're barking up the correct tree... – Ecnerwal – 2014-03-02T23:26:14.930

I think Acrobat is still 32-bit internally. There is a limit of 8,388,607 objects due to this. Can you split the file? – Paul – 2014-03-02T23:27:36.407

addressed comments. – Camden Narzt – 2014-03-02T23:38:46.067

What are you plotting? 5 billion data points? Drop some. You should be able to get a plenty good graph with only a few thousand samples. – psusi – 2014-03-02T23:50:19.057
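A minimal sketch of the "drop some points" idea from the comment above, written in Python rather than R because the asker's plotting code isn't shown. Everything here is an assumption for illustration: the `downsample` helper is hypothetical, the data is synthetic, and 5,000 is just one reading of "a few thousand samples". Uniform random sampling can also hide rare outliers, so whether it gives a "plenty good graph" depends on the data.

```python
import random


def downsample(values, k, seed=0):
    """Pick at most k (index, value) pairs uniformly at random, kept in index order."""
    n = len(values)
    if k >= n:
        # Nothing to drop: return every point.
        return list(enumerate(values))
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    keep = sorted(rng.sample(range(n), k))
    return [(i, values[i]) for i in keep]


# Synthetic stand-in for the 1.3 million data points mentioned in the thread.
values = [(i * 37) % 1000 for i in range(1_300_000)]
subset = downsample(values, 5000)
print(len(subset), "of", len(values), "points kept")
```

The reduced `subset` could then be handed to whatever plotting tool is in use, which should produce a far smaller output file than plotting every point.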

1.3 million actually, which is within the bounds of the above mentioned limit of 8,388,607 objects. – Camden Narzt – 2014-03-02T23:58:48.377

Does it need to be a PDF? Would an SVG or something like that suffice? – Paul – 2014-03-03T01:17:03.420

It would. I'll try making an SVG tomorrow when I get in and see if that turns out any better. – Camden Narzt – 2014-03-03T01:20:25.780

To avoid the crashing problem, you could try to open it in a 64-bit build of Ubuntu in the Evince or Okular viewers, or in pdf.js in a recent version of 64-bit Firefox. It should be possible to do the Firefox one on 64-bit Windows with a non-Mozilla build of 64-bit Firefox, too, and the KDE project has 64-bit Windows builds of Okular, but 64-bit Evince is probably a Linux-only thing for now. – allquixotic – 2014-03-03T03:01:02.410

Considering the sheer... unusualness of what you're using, is there any way you could put up the file? I have SOME ideas, but this is one of those things that is pretty hard to replicate. Offhand, Sumatra has unofficial 64-bit PDF reader builds you might want to try. – Journeyman Geek – 2014-03-03T03:04:25.133

If you're ending up with 7GB PDFs, there's a good chance that you're simply using the wrong file format for the job. You're likely better off finding an alternative format. SVG was suggested, but I get the feeling you're going to want something else if you have that many data points. – Bob – 2014-03-03T03:32:23.087

@Bob Yeah, the SVG will likely be at least 7GB as well, depending on the nature of the plot, but I figure there might be a lightweight way to render something like that. If full vector output isn't necessary presumably a PNG or something would solve the problem. – Paul – 2014-03-03T04:22:59.500

There are tools that optimize PDF file sizes. Have you tried any of these? – karancan – 2014-03-04T22:13:35.723

The PDF optimizing utility would still have to open the file, which doesn't seem very likely to work. Anyway, I'm happy with the SVG. – Camden Narzt – 2014-03-05T17:36:11.367

Answers

1

Since no one made an answer out of the "Just use SVG" comment, I'll do it. It solved my problem even if it didn't solve the problem as posed.
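For readers wondering what an SVG version of a big scatter plot looks like on disk, here is a rough sketch in Python. It is not the R code used for the plot in question (that code isn't shown), and it is not how R's SVG output is structured. The `scatter_to_svg` helper, the canvas size, and the synthetic data are all assumptions for illustration. It only shows that a plain SVG scatter plot is one short text element per point, so the file size grows roughly linearly with the number of points.

```python
import os
import tempfile


def scatter_to_svg(points, width=800, height=600, radius=1.5):
    """Return an SVG document string with one <circle> element per (x, y) point."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min, y_min = min(xs), min(ys)
    x_span = (max(xs) - x_min) or 1  # avoid division by zero for constant data
    y_span = (max(ys) - y_min) or 1
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for x, y in points:
        cx = (x - x_min) / x_span * width
        cy = height - (y - y_min) / y_span * height  # SVG's y axis points down
        parts.append(f'<circle cx="{cx:.1f}" cy="{cy:.1f}" r="{radius}"/>')
    parts.append("</svg>")
    return "\n".join(parts)


# Synthetic stand-in data; a real dataset would be loaded here instead.
points = [(i, (i * i) % 97) for i in range(1000)]
svg_text = scatter_to_svg(points)

out_path = os.path.join(tempfile.gettempdir(), "scatter_sketch.svg")
with open(out_path, "w", encoding="utf-8", newline="\n") as fh:
    fh.write(svg_text)

print(len(svg_text.encode("utf-8")), "bytes for", len(points), "points")
```

Scaling the byte count per point up to the real number of points gives a rough idea of how large such a file might get before trying to open it in a viewer.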

Camden Narzt

Posted 2014-03-02T22:59:05.543

Reputation: 639

0

As @gbc921 suggested, you should try to convert the file to another format that might be a little easier on your system. PDFs are not the way to go with large files. TXT format may be good if you are working with text only, but I don't know an alternative for images. Another thing to do is to keep the file on a flash drive rather than on your computer's hard drive, since flash storage may load it faster. Lastly, I would consider closing all the programs that you aren't using while viewing/editing the PDF. You can use Task Manager on Windows; I don't know the equivalent in OS X. Make sure you have screen brightness and other performance saving options on so your computer doesn't crash.

Hope this helps.

javathunderman

Posted 2014-03-02T22:59:05.543

Reputation: 188