E-aksharayan
e-Aksharayan is an optical character recognition (OCR) engine for Indian languages. Some of the research behind e-Aksharayan has been published in conferences and journals.[1][2][3][4]
Written in | C++ |
---|---|
Operating system | Linux (32-bit and 64-bit), Windows (32-bit) |
Available in | Interface: English; Recognition: Assamese, Bengali, Bodo, Devanagari, Kannada, Gujarati, Gurumukhi, Oriya, Malayalam, Meitei, Marathi, Tamil, Telugu, Tibetan and Urdu |
Type | Optical character recognition |
Website | ocr |
![](../I/m/Bangla_typos.png)
Bangla typos
Screenshots
- OCR output for Devanagari
- OCR output for Devanagari, sync between image and output
- OCR output for Devanagari, spell checker
References
- Greedy Search for Active Learning of OCR
- Text graphic separation in Indian newspapers
- An OCR System for the Meetei Mayek Script
- Experiences of Integration and Performance Testing of Multilingual OCR for Printed Indian Scripts
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.