Batch convert HTML to PDF with custom paper size

1

I have around 200 HTML files that I need to convert to PDFs. The tricky bit is that I need to convert them to a custom paper size (in this case, 5.4" x 7.2"). Acrobat 9 won't let me batch convert to a custom size. Is there any way I can do this without going in and manually printing each individual file to PDF?

I primarily use Windows 7, but I'm not averse to a (Debian) Linux solution.

nc4pk

Posted 2012-03-24T02:13:21.383

Reputation: 8 261

Answers

2

On Linux, HTMLDOC or html2ps will do the work for you. HTMLDOC might give prettier results and can output directly to PDF. It should go something like: for htmlfile in *.html; do htmldoc -f "${htmlfile%.html}.pdf" "$htmlfile"; done

Noam Kremen

Posted 2012-03-24T02:13:21.383

Reputation: 691

Another htmldoc user here. A couple of things: it has the --size pagesize option, so you probably want --size 5.4x7.2in. To avoid dedicated title pages and TOCs, use --webpage. --fontsize size lets you pick the font size. --bodyfont serif will use the serif font; this is needed because, somehow, the default font htmldoc uses causes issues when rendering links (the underline is simply placed in the wrong place). Also, note that htmldoc expects iso8859-1, so if the documents have incompatible encodings (i.e. something other than iso8859-1 or ASCII), convert them first with, e.g., iconv. – njsg – 2012-03-24T10:23:21.963

Also, with htmldoc, the emphasis is on generating a printed copy of the document that is meant for reading (I usually even set wider margins and remove some stuff from the code, like tables), so the result is not focused on faithfulness. As wkhtmltopdf actually uses a rendering engine, that should be something to look at if your goal is a "1:1" copy of what's on the screen. – njsg – 2012-03-24T10:27:48.697

Thanks! My command line looked something like the following: for html in *.html; do htmldoc --size 5.4x7.2in --top 0.1in --bottom 0.15in --left 0.15in --right 0.15in --bodyfont serif --footer ... --webpage -f "$html".pdf "$html"; done – nc4pk – 2012-03-24T14:04:21.487
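For the encoding point raised in the comments, here is a rough sketch of how the iconv step could be combined with the htmldoc loop. It assumes the source files are UTF-8 (override with SRC_ENC) and that your iconv supports the //TRANSLIT suffix, as GNU iconv does. The function name html_to_pdf_latin1 and the .latin1.html temporary suffix are made up for illustration; the htmldoc options are the ones mentioned in this thread.

```shell
# Re-encode each HTML file to ISO-8859-1, then hand the copy to htmldoc.
# Assumption: inputs are UTF-8 unless SRC_ENC says otherwise.
html_to_pdf_latin1() {
  local src_enc="${SRC_ENC:-UTF-8}"
  local html latin1
  for html in "$@"; do
    latin1="${html%.html}.latin1.html"
    # //TRANSLIT (GNU iconv) approximates characters that ISO-8859-1 lacks.
    # If the conversion fails, drop the partial copy and skip this file.
    iconv -f "$src_enc" -t ISO-8859-1//TRANSLIT "$html" > "$latin1" || {
      rm -f "$latin1"
      continue
    }
    htmldoc --webpage --size 5.4x7.2in --bodyfont serif \
      -f "${html%.html}.pdf" "$latin1"
    rm -f "$latin1"
  done
}
```

Called as html_to_pdf_latin1 *.html, this should leave one foo.pdf next to each foo.html. The originals are never modified; only the temporary re-encoded copy is passed to htmldoc.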

2

I use wkhtmltopdf, which lets you set the paper size and makes very accurate PDFs.

You can use a bash script to do many in one go, as Noam explained.
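A loop along those lines might look like the sketch below. As far as I know, wkhtmltopdf takes the custom size through --page-width and --page-height, and both accept a unit suffix such as in. The function name convert_dir and its argument order are invented for this example, and the 5.4in x 7.2in defaults just match the question.

```shell
# Convert every .html file in a directory to PDF at a custom page size.
# Usage: convert_dir [dir] [width] [height]
convert_dir() {
  local dir="${1:-.}"
  local width="${2:-5.4in}"
  local height="${3:-7.2in}"
  local html
  for html in "$dir"/*.html; do
    # If nothing matches, the glob stays literal; skip it.
    [ -e "$html" ] || continue
    wkhtmltopdf --page-width "$width" --page-height "$height" \
      "$html" "${html%.html}.pdf"
  done
}
```

Running convert_dir . in the folder with the 200 files should write foo.pdf next to each foo.html. Quoting "$html" keeps file names with spaces intact.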

Alasdair

Posted 2012-03-24T02:13:21.383

Reputation: 545

Thanks a lot. I think I may have finally found a way to generate something better than the Firefox printing feature! It lets me do the same things htmldoc does, perhaps with much better rendering. Also, --enable-javascript sounds like something useful when printing weirder sites. – njsg – 2012-03-24T10:40:22.613

0

Windows utility for batch converting HTML files to PDF documents...

http://sourceforge.net/projects/html-to-pdf/

The 'Paper Size' control on the user interface accepts numeric input, allowing you to specify a custom paper size in PostScript points (1 point = 1/72 inch). So, for 5.4 x 7.2 inches you'll need to enter "0 0 389 518" for portrait.
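The numbers come from multiplying inches by 72 and rounding: 5.4 x 72 = 388.8 and 7.2 x 72 = 518.4. If you need other sizes, a small helper like this does the arithmetic. The function names in_to_pt and page_box are made up here and are not part of the utility.

```shell
# Convert inches to PostScript points (1 in = 72 pt), rounded to nearest.
in_to_pt() {
  awk -v inches="$1" 'BEGIN { printf "%d\n", inches * 72 + 0.5 }'
}

# Print a "0 0 WIDTH HEIGHT" box for a page size given in inches.
page_box() {
  printf '0 0 %s %s\n' "$(in_to_pt "$1")" "$(in_to_pt "$2")"
}

page_box 5.4 7.2   # prints: 0 0 389 518
```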

The utility also enables you to control JavaScript, font embedding, 'live links' and other stuff.

AffineMesh

Posted 2012-03-24T02:13:21.383

Reputation: 632