I have a scanned book in PDF format, but the quality is rather poor.
(The language is Romanian and it's a medical physiology book, in case you were wondering)
I want to extract text from the book (1500 pages) but keep the images the way they are. I really don't think I have any chance to find a solution, so I'll surely buy the book.
On the off chance, is there any powerful software that can do what I'm looking for? It also has to recognize Romanian.
Buy it, it's legal. :) – None – 2009-11-01T23:02:13.273
What if this is a really old book he can't buy anymore? :) – Botond Balázs – 2009-11-03T07:55:26.807
@Botond, that is in fact a huge issue with Google Book Search. An estimated 70% of its books are in-copyright but out-of-print. A class action settlement (negotiated between Google and a few lawyers working for the Authors Guild and AAP) states that for out-of-print books Google does not need permission, unless the rights owners specifically opt out of the agreement. And, the way US law works, this is binding on every work of literature ever produced. As long as other companies do not get a similar deal, Google has a monopoly on old literature :-( See Boing Boing at http://tinyurl.com/yl5rlts
– Arjan – 2009-11-03T09:47:45.130
The problem of the OP is to extract text from a book. This is still a problem even if he has bought the book. Legal issues, though worth considering, are out of scope here. – mouviciel – 2009-11-10T08:42:39.177