I am trying to train Tesseract for some funny-looking fonts, Palace for example. I tried a simple approach: I produced the traineddata with http://trainyourtesseract.com/ and then made calls like
api->Init(".\\tessdata", "eng+Palace", OEM_TESSERACT_ONLY);
api->SetPageSegMode(PSM_SINGLE_LINE);
api->SetImage(image);
// Get OCR result
outText = api->GetUTF8Text();
The result for a line like
M P S T a o e h i l n p r s t u w y
is below, no glyph is correctly recognized:
.MDXXXo,XkX.n.mX.XnoX
Does trainyourtesseract produce bad traineddata, or am I making the wrong calls? How does one handle such cases?
Actually, I have tried the same with less funny fonts, and recognition barely improves there either.
I am attaching the tiff file and my trained data for Palace.
Thank you, everyone, in advance for your help, Yuliana
did you solve it? – V.Wu – 2020-01-07T06:05:11.753