Install Tesseract OCR 3 on OS X

0

I am trying to install Tesseract OCR on OS X 10.6.

I have got as far as installing Leptonica (by downloading the source and running ./configure; make; sudo make install), seemingly without any problems, but I don't know how to check.

I also installed Tesseract OCR 3 (from Google Code, with ./runautoconf; ./configure; make; sudo make install), also seemingly without issue, but again I don't know how to check.

When I run tesseract input.jpg . I get these errors:

 bash-3.2$ tesseract ~/Desktop/DCIM/101_FUJI/DSCF1043.JPG . 
 Tesseract Open Source OCR Engine with Leptonica
 Error in pixReadStreamJpeg: function not present
 Error in pixReadStream: jpeg: no pix returned
 Error in pixRead: pix not read 
 Error in fopenReadStream: file not found 
 Error in pixRead: image file not found
 Image file ######
 Exif cannot be read! 

I get a similar error if I use a TIFF file as input.

I think I need some libraries; the instructions for Ubuntu say to install libjpeg62-dev etc.

Does anyone have details of how to install tesseract on OS X?

Billy Moon

Posted 2011-09-20T17:51:34.847

Reputation: 226

Answers

2

Install MacPorts: see http://www.macports.org/ for downloads and installation instructions.

Update the ports tree: sudo port selfupdate

Install tesseract: sudo port install tesseract

The tesseract port doesn't appear to have a variant that supports JPEG, so you need to install ImageMagick, a package for converting graphics files and adjusting images (brightness, contrast and sharpness): sudo port install imagemagick

Convert your JPEG to TIFF format, perform OCR on it with tesseract, then delete the temporary TIFF: convert input.jpg input.tiff ; tesseract input.tiff ocr-text-output -l eng ; rm input.tiff

The resulting text should be found in the file ocr-text-output.txt.
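The convert-then-OCR steps above can be wrapped in a small shell function. This is only a sketch: ocr_image and the DRY_RUN switch are hypothetical names made up for illustration, not part of tesseract or ImageMagick, and it assumes the input file is not already a .tiff (otherwise the temporary file would be the input itself).

```shell
#!/bin/sh
# Sketch: convert an image to TIFF, OCR it, then remove the temporary TIFF.
# ocr_image and DRY_RUN are hypothetical names used only for this example.
ocr_image() {
    input=$1                    # source image, e.g. input.jpg
    outbase=$2                  # tesseract appends .txt to this base name
    lang=${3:-eng}              # language code passed to tesseract -l
    tiff="${input%.*}.tiff"     # temporary TIFF next to the input file

    # With DRY_RUN=1 each command is printed instead of executed.
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi

    # Stop early if the conversion fails; there is nothing to OCR.
    $run convert "$input" "$tiff" || return 1

    # Run OCR, remember its exit status, and clean up either way.
    $run tesseract "$tiff" "$outbase" -l "$lang"
    status=$?
    $run rm -f "$tiff"
    return $status
}
```

Calling DRY_RUN=1 ocr_image input.jpg ocr-text first prints the three commands without running anything, which is a cheap way to check the file names before doing the real conversion.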

P.S. You can adjust the image a bit for potentially better OCR results with convert options like these: convert -sharpen 1 -brightness-contrast 3x30 input.jpg input.tiff

tajh

Posted 2011-09-20T17:51:34.847

Reputation: 36

I had to install 'tesseract-eng' to get around segmentation fault 11. – Ian – 2013-06-04T15:23:24.490

2

I'm using Homebrew on OS X 10.7, and it was as simple as running these two commands:

brew install leptonica
brew install tesseract

This installed Leptonica 1.68 and tesseract 3.01, along with their dependencies.
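Since the question asks how to check that the install worked, here is a rough sketch of one way to confirm the commands ended up on your PATH. check_installed is a hypothetical helper name, not something Homebrew or tesseract provides; it only tests whether each command can be found, not whether it was built with JPEG or TIFF support.

```shell
#!/bin/sh
# Sketch: report which of the given commands can be found on PATH.
# check_installed is a hypothetical helper name used only for this example.
check_installed() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    # Exit status is 0 only when every command was found.
    return $missing
}
```

For example, check_installed tesseract brew prints one line per command. If tesseract is found, running it on a small test image is the real check of whether image reading works; assuming your build supports it, tesseract -v may also print the installed version.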

shig

Posted 2011-09-20T17:51:34.847

Reputation: 121

0

A one-command solution that worked for me:

sudo brew install tesseract

This installs tesseract and all its dependencies. sudo was necessary for some steps, such as linking the jpeg package.

user3291575

Posted 2011-09-20T17:51:34.847

Reputation: 1