Convert color photos of documents to good black-and-white (bitonal) images?

4

2

Since I don't have a copier or scanner, I'm using an 8 megapixel camera to copy documents. This works pretty well except they need a lot of processing afterward. I'd like to get from a photo to a bitmap, but using

djpeg -grayscale -pnm photo.jpg |
pgmtopbm -threshold -value XXX

does not work so well, for two reasons:

  1. It's hard to guess what XXX should be, and XXX is different for different photos.

  2. Illumination varies, and sometimes a single threshold isn't what's right for the image.

How can I do better? The ideal solution would be a fully automatic command-line program that I can run on Linux. (I have already written a program to remove dark pixels from the edges of images.)

NOTE: I really want a bitmap, that is, just black and white pixels. No grayscale, no dithering.

Norman Ramsey

Posted 2009-11-25T20:31:18.410

Reputation: 2 613

Answers

4

-monochrome

This option uses some smart dithering and produces clearly legible output:

convert -monochrome in.png out.png

Documentation: http://www.imagemagick.org/Usage/quantize/#monochrome

Compare that to a simpler -threshold 50 transform:

convert -threshold 50 in.png out.png

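which loses most of the image. (A bare 50 is read as an absolute value between 0 and QuantumRange, not as a percentage; a mid-gray cutoff would be written -threshold 50%.)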

Concrete example from: https://www.nasa.gov/mission_pages/galex/pia15416.html

wget -O orig.jpg http://www.nasa.gov/images/content/650137main_pia15416b-43_full.jpg
# Downsize to 400 height to have a reasonable file size for upload here.
convert orig.jpg -resize x400 in.jpg
convert -monochrome in.jpg out.jpg
convert -threshold 50 in.jpg threshold-50.jpg

in.jpg: [image not preserved]

out.jpg: [image not preserved]

threshold-50.jpg: [image not preserved]

Tested in Ubuntu 19.10, ImageMagick 6.9.10.

Ciro Santilli 新疆改造中心法轮功六四事件

Posted 2009-11-25T20:31:18.410

Reputation: 5 621

1

The best thing I've found in three years is the mkbitmap program that ships with potrace.
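For reference, here is a minimal sketch of how mkbitmap might slot into the djpeg pipeline from the question. The function name photo_to_pbm is made up for illustration, and the option values are simply mkbitmap's documented defaults written out explicitly, not settings anyone in this thread reports having tuned.

```shell
#!/bin/sh
# photo_to_pbm INPUT.jpg OUTPUT.pbm   (hypothetical helper name)
# Decode the JPEG to a grayscale PGM, then let mkbitmap filter, upscale,
# and threshold it into a bitonal PBM.
photo_to_pbm() {
    in=$1
    out=$2
    # -f 4    high-pass filter radius; evens out slow changes in illumination
    # -s 2    scale up 2x before thresholding (cubic interpolation by default)
    # -t 0.45 threshold applied to the filtered image (0 = black, 1 = white)
    # These are mkbitmap's defaults; treat them as a starting point only.
    djpeg -grayscale -pnm "$in" | mkbitmap -f 4 -s 2 -t 0.45 > "$out"
}

# Example call (not run here):
#   photo_to_pbm photo.jpg photo.pbm
```

Because of -s 2 the output has twice the pixel dimensions of the input. The high-pass filter is the part that addresses uneven lighting, so -f is probably the first value to experiment with.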

Norman Ramsey

Posted 2009-11-25T20:31:18.410

Reputation: 2 613

0

Apparently, GIMP supports some command-line batch processing. You might be able to give that a shot, since desaturating will probably behave like you'd expect with varying brightness in your images.

keithjgrant

Posted 2009-11-25T20:31:18.410

Reputation: 101

I'd be happy to try it; can you suggest which among the many hundreds of GIMP transformations might be relevant? – Norman Ramsey – 2009-11-27T22:42:34.633

0

Check out your camera. Many modern digital cameras have the ability to take B&W photos directly.

Gcoupe

Posted 2009-11-25T20:31:18.410

Reputation: 436

I'm looking for bitonal, not greyscale. – Norman Ramsey – 2009-11-27T22:41:48.690

0

Converting to grayscale / desaturating will preserve most of the noise too. GIMP has a Threshold filter (under the Colors menu) that eliminates the noise, and works very well for line art and plain black scanned text.

I'm not too clued up on the batch scripting myself, but it sounds like a good idea to use the Threshold with it.

Edit: Since you have Linux as a tag, have a look at Phatch, a batch photo processor. It has filters to adjust the contrast and brightness too. It's in the Ubuntu repos (if you use that distro).

invert

Posted 2009-11-25T20:31:18.410

Reputation: 4 918

OK, I checked out Threshold, and it does exactly what pgmtopbm does. If I wanted to adjust each page by hand, it would be great, but I really don't. And it completely fails to solve the problem that I really need different thresholds in different parts of the image. Still, yours was the answer that most closely identified what GIMP can and can't do, so +1. P.S. It took me several minutes to find the thing among the goddamned menus. – Norman Ramsey – 2009-11-27T22:48:39.067
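One way to get "different thresholds in different parts of the image" from the command line is ImageMagick's -lat (local adaptive threshold) operator, which compares each pixel with the mean of its neighbourhood. The sketch below is an illustration only: the function name local_threshold, the 25x25 window, and the 10% offset are all assumptions that would need tuning per camera and document.

```shell
#!/bin/sh
# local_threshold INPUT OUTPUT [WINDOW] [OFFSET]   (hypothetical helper name)
# -lat turns a pixel white when it is lighter than its local mean plus the
# offset. Negating first makes the text the light part, so ink has to stand
# out from its surroundings to survive; the second -negate restores
# black text on a white page.
local_threshold() {
    in=$1
    out=$2
    window=${3:-25x25}   # neighbourhood size in pixels (assumed default)
    offset=${4:-10%}     # margin above the local mean (assumed default)
    convert "$in" -colorspace Gray -negate \
        -lat "${window}+${offset}" -negate "$out"
}

# Example call (not run here); a .pbm output name gives a 1-bit file:
#   local_threshold page.jpg page.pbm
```

A window noticeably larger than the stroke width of the text tends to work best; solid dark regions wider than the window can come out hollow.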

Apart from eyeballing the image, I can't say how to calculate the threshold values per image. Wow, I'm stumped. Perhaps auto-adjusting the light levels first will put all images on the 'same level', and a common threshold value will then work? – invert – 2009-11-30T11:11:33.867
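On calculating a threshold per image: a standard technique is Otsu's method, which picks the gray level that best separates the histogram into a dark class and a light class. Nobody in this thread used it; the sketch below is only meant to show that the XXX in the question can be computed rather than guessed. The function name otsu_threshold is hypothetical, and the input is assumed to be a plain (ASCII, P2) PGM, which Netpbm's pnmtoplainpnm (pnmnoraw in older releases) should be able to produce.

```shell
#!/bin/sh
# otsu_threshold   (hypothetical helper name)
# Reads a plain (P2) PGM on stdin and prints a threshold in the range 0..1,
# suitable for "pgmtopbm -threshold -value".
otsu_threshold() {
    awk '
    {
        sub(/#.*/, "")                          # drop PGM header comments
        for (i = 1; i <= NF; i++) {
            if (seen < 4) { hdr[seen++] = $i }  # magic, width, height, maxval
            else { hist[$i + 0]++; sum += $i; total++ }
        }
    }
    END {
        maxval = hdr[3] + 0
        if (hdr[0] != "P2" || maxval <= 0 || total == 0) {
            print "otsu_threshold: expected a plain (P2) PGM" | "cat >&2"
            exit 1
        }
        best = 0; bestvar = -1; wb = 0; sumb = 0
        for (t = 0; t <= maxval; t++) {
            wb += hist[t]; sumb += t * hist[t]  # dark class: levels 0..t
            wf = total - wb                     # light class: the rest
            if (wb == 0 || wf == 0) continue
            d = sumb / wb - (sum - sumb) / wf   # difference of class means
            v = wb * wf * d * d                 # between-class variance, unnormalised
            if (v > bestvar) { bestvar = v; best = t }
        }
        printf "%.4f\n", (best + 0.5) / maxval  # cut halfway between t and t+1
    }'
}

# Example use (not run here):
#   t=$(djpeg -grayscale -pnm photo.jpg | pnmtoplainpnm | otsu_threshold)
#   djpeg -grayscale -pnm photo.jpg | pgmtopbm -threshold -value "$t" > photo.pbm
```

This only answers the first complaint in the question (guessing XXX). It still applies one threshold to the whole page, so it does nothing for uneven illumination.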