I'm trying to copy and paste text from a PDF file.
However, whenever I paste the original text it is a huge mess of garbled characters. The text looks like the following (this is just one small extract):
4$/)5=$13! ,4&1*%-! )5'$! 1$2$)&,$40! 65))! .*5)1! -#$! )/'8*/8$03!
(4/+$6&4;0!/'1!-&&)0!*0$1!.9!/,,)5%/-5&'!1$2$)&,$403!5'!+*%#!-#$!
0/+$!6/9! -#/-! &,$4/-5'8! 090-$+! 1$2$)&,$40! .*5)1!1$25%$! 1452$40!
/'1! &-#$4! 090-$+! 0&(-6/4$! %&+,&'$'-0! *0$1! .9! /,,)5%/-5&'!
1$2$)&,$40!-&1/97!"#$!+5M!&(!,4&1*%-!)5'$!/'1!,4&1*%-!1$2$)&,$40!
65))! .$!+*%#!+&4$! $2$')9! ./)/'%$13! #&6$2$43! -#/'! -#$!+5M! &(!
&,$4/-5'8!090-$+!/'1!/,,)5%/-5&'!1$2$)&,$40!-&1/97!
)*+*+, C<88,?>8513AG<5A14,
I've tried it in both Adobe and Foxit PDF readers. I did a 'Save as text' in Adobe Reader and the resultant text file is the same garbled text.
Any ideas how I can get this text out non-garbled? (Other than manual typing... there's a lot of text to extract.)
Similar question: http://superuser.com/questions/119393/search-pdfs-with-non-standard-character-encodings – Hugh Allen – 2011-03-01T05:46:09.567
I can also confirm this problem on OS X, at least as of 10.8.2. I've spent a bit of time going through the PDF file structure, but unfortunately I can't see any way to repair the damage. Acrobat Pro's "PreFlight" does report issues with the file when checking it against the PDF/A standard, and the Inventory report shows the glyphs being mapped to plainly wrong Unicode characters. I've raised a bug report with Apple (ID 12655651). I'll report back here if/when I get any updates. – KenD – 2012-11-08T09:48:22.813
Might be helpful: http://superuser.com/a/481510/153937 – Ankit – 2012-12-10T10:27:13.937
Try some screen reader utilities (which work with JPEG images; take a screenshot and there you go) or here is a different way. (Just a guess, don't bite me for it. I used the first way back then. Hope there are more convenient ways.) – Apache – 2010-05-05T13:56:42.170
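If the glyphs really are mapped to the wrong Unicode characters, as the Inventory report mentioned above suggests, the garbling may be a plain one-to-one substitution, and a lookup table could undo it. Below is a minimal sketch of that idea, not a general fix. The table `GLYPH_MAP` and the helper `decode` are hypothetical names, and the mapping was inferred by hand from the extract in the question (for example `1$2$)&,$40` reading as "developers"), so it only covers characters that appear there and would have to be rebuilt for any other font or PDF.

```python
# Substitution table inferred by hand from the extract in the question.
# Assumption: every garbled character stands for exactly one real character.
# Characters that never appear in the extract are missing from the table.
GLYPH_MAP = {
    "!": " ", '"': "T", "#": "h", "$": "e", "%": "c", "&": "o", "'": "n",
    "(": "f", ")": "l", "*": "u", "+": "m", ",": "p", "-": "t", ".": "b",
    "/": "a", "0": "s", "1": "d", "2": "v", "3": ",", "4": "r", "5": "i",
    "6": "w", "7": ".", "8": "g", "9": "y", ";": "k", "=": "z", "M": "x",
}


def decode(garbled: str) -> str:
    """Map each garbled character through GLYPH_MAP; unknown ones pass through."""
    decoded = "".join(GLYPH_MAP.get(ch, ch) for ch in garbled)
    # "!" already encodes the word gap, so the literal spaces in the pasted
    # text are extras; collapse every run of whitespace to a single space.
    return " ".join(decoded.split())


sample = "65))! .*5)1! -#$! )/'8*/8$03!"
print(decode(sample))  # will build the languages,
```

The last line of the extract (`)*+*+, C<88,?>8513AG<5A14,`) does not decode sensibly with this table, which suggests it uses a different font with its own mapping. For a long document, building such tables by hand gets tedious quickly, so OCR on a rendered image of the page is likely the more practical route.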