Software Requirements
The following software packages are available for both Windows and Linux systems, and are required for a complete, working solution:
- gvim - Used to export syntax highlighted source code to HTML.
- moria - Colour scheme for syntax highlighting.
- wkhtmltoimage - Used to convert HTML documents to PNG files.
- gawk and sed - Text processing tools.
- ImageMagick - Used to trim the PNG and add a border.
General Steps
Here is how the solution works:
- Load the source code into an editor that can add splashes of colour.
- Export the source code as an HTML document (with embedded FONT tags).
- Strip the background attribute from the HTML document (to allow transparency).
- Convert the HTML document to a PNG file.
- Trim the PNG border.
- Add a small, 25 pixel border around the image.
- Delete temporary files.
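The steps above can be sketched as a list of command lines. This is a rough Python outline, not the batch file itself: the function name `build_commands` and its parameters are made up for illustration, while the tool flags are copied from the batch file in this answer. It only builds the argument lists and runs nothing, and it assumes `convert` is given as a path that resolves to ImageMagick.

```python
def build_commands(src, line_numbers=False, pad_to_80_columns=True,
                   convert="convert"):
    """Return (commands, temp_files) for one source file.

    Hypothetical helper: the flags mirror the batch file; nothing is executed.
    """
    html = src + ".html"
    png = src + ".png"
    orig = src + ".orig.png"

    # Load the source into gvim, apply the colour scheme, export as HTML.
    gvim = ["gvim", "-e", src, "-c", "set nobackup"]
    if line_numbers:
        gvim += ["-c", "set number"]
    gvim += ["-c", ":colorscheme moria", "-c", ":TOhtml", "-c", "wq", "-c", ":q"]

    # Strip background colours from the HTML to allow transparency.
    sed = ["sed", "-i", r"s/background-color: #......; \(.*\)}$/\1 }/g", html]

    # Render the HTML as a PNG. (The batch file renders to .png and then
    # renames it to .orig.png; this sketch writes .orig.png directly.)
    wk = ["wkhtmltoimage", "--format", "png", "--transparent",
          "--minimum-font-size", "80", "--quality", "100",
          "--width", "3600", html, orig]

    # Trim the image, optionally pad the width, then add the 25 pixel border.
    im = [convert, "-format", "png", orig, "-density", "150x150",
          "-background", "none", "-antialias", "-trim", "+repage"]
    if pad_to_80_columns:
        im += ["-extent", "3950x"]
    im += ["-bordercolor", "none", "-border", "25", png]

    # The last step deletes these temporary files.
    return [gvim, sed, wk, im], [orig, html]


if __name__ == "__main__":
    commands, temp_files = build_commands("Example.java")
    for command in commands:
        print(" ".join(command))
```

Keeping the commands as data makes it easy to print them for inspection before handing them to something like `subprocess.run`.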
The script generates images of the same width for every source file whose lines are all 80 characters or shorter. If any line is longer than 80 characters, the image is left as wide as necessary to show the entire line.
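The width rule can be illustrated with a small Python sketch. It mirrors what the gawk one-liner and the `IF %LENGTH% GTR 80` test in the batch file do, but the function names and the `columns` and `width_px` parameters are only illustrative; the 80 and 3950 defaults come from the batch file.

```python
def max_line_length(lines):
    """Length of the longest line, not counting the line ending.

    A tab counts as a single character, as it does for awk's length().
    """
    return max((len(line.rstrip("\r\n")) for line in lines), default=0)


def extent_args(lines, columns=80, width_px=3950):
    """Return ImageMagick -extent arguments, or [] if any line exceeds `columns`."""
    if max_line_length(lines) > columns:
        return []  # leave the image as wide as the longest line needs
    return ["-extent", f"{width_px}x"]


if __name__ == "__main__":
    with open(__file__, encoding="utf-8") as source:
        print(extent_args(source))
```

An empty result corresponds to the batch file clearing `EXTENT`, so the trimmed image keeps its natural width.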
Installation
Install the components into the following locations:
- gvim - C:\Program Files\Vim
- moria - C:\Program Files\Vim\vim73\colors
- wkhtmltoimage - C:\Program Files\wkhtml
- ImageMagick - C:\Program Files\ImageMagick
- gawk and sed - C:\Program Files\GnuWin32
Note: ImageMagick has a program called convert.exe, which cannot supersede the Windows convert command. Because of this, the full path to convert.exe must be hard-coded in the batch file (as opposed to adding ImageMagick to the PATH).
Environment Variables
Add the following directories to the PATH environment variable:
"C:\Program Files\Vim\vim73";"C:\Program Files\wkhtml";"C:\Program Files\GnuWin32\bin"
Batch File
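Create a batch file called src2png.bat containing the script below. To run it, pass the source file as the first argument; any second argument turns on line numbers. For example, to convert the batch file itself:
src2png.bat src2png.bat
The contents of src2png.bat: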
@ECHO OFF
SET NUMBERS=-c "set number"
IF "%2" == "" SET NUMBERS=
ECHO Converting %1 to %1.html...
gvim -e %1 -c "set nobackup" %NUMBERS% -c ":colorscheme moria" ^
-c :TOhtml -c wq -c :q
REM Remove all background-color occurrences (without being self-referential)
sed -i "s/background-color: #......; \(.*\)}$/\1 }/g" %1.html
ECHO Converting %1.html to %1.png...
wkhtmltoimage --format png --transparent --minimum-font-size 80 ^
--quality 100 --width 3600 ^
%1.html %1.png
move %1.png %1.orig.png
REM If the text file has lines that exceed 80 characters, don't crop the
REM resulting image. (The book automatically shrinks large images to fit.)
REM The 3950 is the 80 point font at 80 characters with padding for line
REM numbers.
SET LENGTH=0
FOR /F %%l IN ('gawk ^
"BEGIN {x=0} {if( length($0)>x ) x=length()} END {print x;}" %1') ^
DO (
SET LENGTH=%%l
)
SET EXTENT=-extent 3950x
IF %LENGTH% GTR 80 SET EXTENT=
REM Trim the image height, then extend the width for 80 columns, if needed.
REM The result is that all images will be resized the same amount, thus
REM making the font size the same maximum for all source listings. Source
REM files beyond the 80 character limit will be scaled as necessary.
ECHO Trimming %1.png...
"C:\Program Files\ImageMagick\convert.exe" -format png %1.orig.png ^
-density 150x150 ^
-background none -antialias -trim +repage ^
%EXTENT% ^
-bordercolor none -border 25 ^
%1.png
ECHO Removing old files...
IF EXIST %1.orig.png DEL /q %1.orig.png
IF EXIST %1.html DEL /q %1.html
IF EXIST sed*. DEL /q sed*.
Improvements and optimizations welcome.
Note: The latest version of wkhtmltoimage properly handles overriding the background colour. Thus the line to remove the CSS for background colours is no longer necessary, in theory.
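For anyone who still needs that step, here is roughly what the sed substitution does, rewritten as a Python sketch. The pattern is the one from the batch file, translated from sed's basic regex syntax; the function name and the sample CSS line are made up for illustration and are not taken from real `:TOhtml` output.

```python
import re

# Same pattern as the sed expression, with \( \) written as ( ).
BACKGROUND = re.compile(r"background-color: #......; (.*)}$")


def strip_background(css_line):
    """Drop a 'background-color: #rrggbb; ' declaration from a one-line CSS rule.

    Whatever follows the declaration is kept and the closing brace is restored.
    """
    return BACKGROUND.sub(r"\1 }", css_line)


if __name__ == "__main__":
    # Hypothetical input line.
    print(strip_background("body { background-color: #202020; color: #d0d0d0; }"))
```

Lines without a background-color declaration pass through unchanged, which is why the substitution is safe to apply to the whole HTML file.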
@Dave Jarvis: why is wkhtmltoimage and setting the width of the page not enough? the height can not be specified since it is determined by the content of the html stuff. imho width is all you actually need, you can calculate the needed width based upon how many pixels per inch you want. – akira – 2010-11-21T12:01:17.683
@Dave Jarvis: well, just tell me how many inches you want to cover and i tell you how many pixels you will need. 'trimming' the result with convert afterwards is a nice idea but destroys the idea of 'dpi' somewhat. you always start with "i need to fill this x inch of space and i want it filled with z dots per inch" .. and based upon that formula you request pixels. – akira – 2010-11-21T18:54:14.893
@akira: The width is dependent on the number of columns the source code uses. Sometimes the width will be 75 characters. Sometimes it will be 40 characters. So 75 characters should take up about 5.5 inches and 40 characters should be slightly more than half that. The 5.5 value depends on the margins of the book, which are subject to change (once or twice). This is a calculation that needs to be done automatically, by the way, otherwise the solution cannot be automated, which defeats the entire purpose. – Dave Jarvis – 2010-11-21T19:40:30.850
@Dave Jarvis: yep, i understand your problem. you are lucky with convert that the output of webkit in your case is really scalable and thus you could 'resize' the pdf afterwards. for an integrated solution i suspect one would need some kind of zoom-level AND the width of the 'browser' – akira – 2010-11-21T20:52:01.173
btw, what is the document format you are using to create the ebook or the printed book (latex, xsl-fo .. etc?) – akira – 2010-11-21T20:59:55.377
@akira: OpenOffice, but possibly LyX (LaTeX) or Scribus later. – Dave Jarvis – 2010-11-21T22:39:16.960
@Dave Jarvis: well, we both agree on that having vim yielding something close to the end would be better. maybe one should take the ToHtml code as a base, it does not look too complicated imho. btw, wkhtmltoimage yields .svg as well, maybe you can integrate that more easily into openoffice? – akira – 2010-11-22T08:39:21.030
agreed, but maybe .svg is worth a try. – akira – 2010-11-22T14:43:09.440