e-Aksharayan

e-Aksharayan is an optical character recognition (OCR) engine for Indian languages. Some of the research work from e-Aksharayan has been published in various conferences and journals.[1][2][3][4]

e-Aksharayan
Written in: C++
Operating system: Linux (32 & 64-bit), Windows (32-bit)
Available in:
  Interface: English
  Recognition: Assamese, Bengali, Bodo, Devanagari, Kannada, Gujarati, Gurumukhi, Oriya, Malayalam, Meitei, Marathi, Tamil, Telugu, Tibetan and Urdu
Type: Optical character recognition
Website: ocr.tdil-dc.gov.in

References

  1. Greedy Search for Active Learning of OCR
  2. Text graphic separation in Indian newspapers
  3. An OCR System for the Meetei Mayek Script
  4. Experiences of Integration and Performance Testing of Multilingual OCR for Printed Indian Scripts


This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.