5
2
The fonts of old papers (<2000) tend to look disheveled on my Linux box. Why is that?
Here's the paper: http://acl.ldc.upenn.edu/H/H94/H94-1048.pdf
7
This is almost certainly due to the scanning process (whether OCR was used or not). Journals started using electronic publishing relatively late. Most older papers have been scanned into PDFs from the original, printed paper version. That's why the fonts look weird to you.
What you are looking at are images taken of the fonts and then (maybe) passed through OCR software to turn them into text. Newer papers look better because they have been created as PDFs directly.
1Although your primary point is spot on, note that OCR will improve the appearance of fonts, not the opposite. OCR will turn the image into text (ASCII, Unicode, etc.), which, when rendered, will use the exact same fonts that would be used for a document created today. – RockPaperLizard – 2015-06-02T12:09:23.513
So no way to fix it? – Reactormonk – 2013-03-24T02:02:18.443
Not as far as I know, no. – terdon – 2013-03-24T03:20:06.590
1Do they appear right on Windows boxes? Is it actual text, ie you can select copy and paste it, or is it an image? – Paul – 2013-03-23T08:24:14.310
4Check the document metadata. My guess is either an old version of some software (LaTeX?) that didn't properly support vector type, or they scanned paper documents and lost the original letter forms. – Daniel Beck – 2013-03-23T08:24:37.170
3@Paul OCR software could take care of making text selectable without changing how it looks. – Daniel Beck – 2013-03-23T08:25:09.560
1@DanielBeck Really? I haven't seen OCR software that retains the original appearance, that is a cool trick. – Paul – 2013-03-23T08:27:06.887
1@Paul I took the image, converted it to PDF, sent it through OCR (via DEVONthink Pro Office on OS X, they call it "Convert to searchable PDF"), selected the text, and copied it into a text document. Screenshot of searchable PDF with selection, and text document. The two documents (original and searchable PDF) look identical. – Daniel Beck – 2013-03-23T08:32:22.043
@DanielBeck Yeah, very cool – Paul – 2013-03-23T11:20:55.453
I suppose you are talking about PDFs. Please include the output of pdffonts for one such file. – Reinstate Monica - M. Schröder – 2013-03-23T12:27:29.730
@Paul I don't have any other OSs lying around. – Reactormonk – 2013-03-23T19:52:40.343