I would like to scan some old text documents. My purpose is twofold: disaster recovery (e.g. fire), and to save space on bulky documents I rarely refer to (e.g. old phone bills).
After scanning, I intend to destroy those originals that are bulky and that I rarely refer to. The rest I will keep and continue referring to. I do not intend to OCR the documents.
I estimate there are a few thousand sides of A4 to scan, and I am aiming for only a few failures (missed or illegible sides) per 1000 sides scanned. By illegible I mean text that a human cannot read reliably.
I would like to do this myself rather than using a commercial service.
I believe the documents are fairly typical of what home users will have collected in their filing cabinets over the past, say, 10 or 20 years:
- Mostly (perhaps 80%) standard paper size or close to standard size (A4, would be US letter elsewhere presumably)
- Some bills that are longer than A4 (less than 10%)
- A small number of "very miscellaneous" pages (less than 10%)
- Mostly relatively flat good quality paper
- The documents are printed on various papers since they include bills, receipts, letters, etc.
- Many but not all documents are printed on both sides
- A mixture of colour and black-and-white documents. Most of the documents do not use colour in an important way
- A minority of pages with some graphics and pictures, etc. (perhaps 5 or 10%)
- A minority of yellowed pages (less than 5%)
I would like to scan in colour because I do not want to verify that all of the colour information is unimportant. I will exclude large format documents (e.g. A3), but I would ideally like to scan bills that are longer than A4.
I don't mind scanning the "awkward cases" sheet-by-sheet but would like to save time using a sheet feeder where possible. However I anticipate that a high-end professional scanner isn't really called for. Also, as long as documents are still human-legible, damage to the paper is not very important.
Aside from dpi, what features in a scanner and sheet feeder are important for a job like this? By "features" I mean specific technical features (or performance characteristics) of the design, rather than broad categories like "reliability".
I am not looking for product recommendations. I would like to know what features are relevant for this scale of application.
Do you mean a scanner device? – TechLife – 2015-03-29T17:57:21.133
@fixer1234 I'm not looking for product recommendations (apart from being off-topic, this would be impractical since there are too many models and availability varies too much). How is it an odd question? I'm completely unfamiliar with scanners and sheet feeders, I know that mechanical designs etc. vary, and would like to know what features are relevant for this scale of application. I don't consider price to be a feature exactly, but of course that constrains the relevant set of devices. – Croad Langshan – 2015-03-29T18:06:50.420
@TechLife: yes, a scanner to me is a kind of device (software would be "scanning software"). – Croad Langshan – 2015-03-29T18:08:39.773
There are relatively inexpensive, consumer-grade sheet-fed scanners and commercial-grade scanners for high-volume work. Huge difference in cost and size. Will you need these requirements after the job is done? How much is your time worth and how much do you have? The main difference you would see would be speed and better paper feeding. The output quality would be comparable (you wouldn't know after-the-fact which machine they were scanned on). It's really a question of comparing features of what is available, weighing what's important to you, and investigating owner satisfaction. – fixer1234 – 2015-03-29T18:30:05.477
The scope is limited to what's in the question: I won't be doing other big jobs after this one. To give an idea, I'm very unlikely to spend more than 500 UK pounds on hardware (and likely significantly less than this). I did expect speed and paper feeding to be the areas that separate different devices, and I am definitely interested in those, since they will likely determine whether the project is practical. – Croad Langshan – 2015-03-29T18:43:01.420
1000 sides is two reams of paper. There are commercial scanners, probably over your price range, that would handle that in a few batches and scanning would be completed in under 20 minutes. They would also be better at handling non-pristine pages. Inexpensive consumer scanners might require on the order of 100 batches, plus more re-feeding if the originals are not in good shape. Even so, scanning might take only a few hours in total, although you might need to stretch it out to not exceed the scanner's duty cycle, and do some feed roller cleaning during the job. – fixer1234 – 2015-03-29T19:05:52.227
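The batch and time estimates in the comment above can be reproduced with a rough back-of-the-envelope sketch. The function name `estimate_job` and every input figure (feeder capacity, scan speed) are hypothetical values chosen for illustration, not the specifications of any particular scanner, and the estimate ignores reloading, re-feeds, and roller cleaning.

```python
import math

def estimate_job(sides, sheets_per_batch, sides_per_minute, sides_per_sheet=1):
    """Return (batches, scanning_minutes) for a scanning job.

    All inputs are assumptions supplied by the caller; time spent reloading
    the feeder, re-feeding jammed pages, or cleaning rollers is not counted.
    """
    sheets = math.ceil(sides / sides_per_sheet)      # physical sheets to feed
    batches = math.ceil(sheets / sheets_per_batch)   # feeder loads needed
    minutes = sides / sides_per_minute               # pure scanning time
    return batches, minutes

# Hypothetical consumer device: 10-sheet feeder, 10 sides per minute.
print(estimate_job(1000, 10, 10))    # (100, 100.0)

# Hypothetical commercial device: 250-sheet feeder, 50 sides per minute.
print(estimate_job(1000, 250, 50))   # (4, 20.0)
```

With these assumed figures, the consumer device needs about 100 batches and the commercial one about 4, which matches the orders of magnitude given in the comment. Passing `sides_per_sheet=2` models a duplex job where each sheet carries two sides.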
For once, I'll give in to the urge to recommend a specific product. I know this is (for very good reasons) off topic, but I'd still like to mention that I used Fujitsu's ScanSnap iX500 to scan thousands of pages for similar goals. Price, quality, speed, and size are well balanced. I'd happily recommend it. PS: apart from owning one Fujitsu product, I have no interest in, or gain from, this recommendation. Just wanted to share my positive experience. – agtoever – 2015-03-29T21:06:35.520
Your question is very broad. There are many aspects to consider. To get more specific answers, you will need to be much more specific about your documents, e.g. are they on standard printer paper or on very thin paper? Is the print double-sided or single-sided? Are they in color or black and white only? Do they contain graphics or pictures, etc.? Is the paper yellowed? Are there some smaller formats in between? Do some pages have other paper glued on, e.g. as accountants do with sales receipts? – user291737 – 2015-03-30T12:20:34.923
What do you mean by "illegible sides"? Do you want to read them with your eyes or with optical character recognition? That's quite a difference. Our eyes can read low-quality scans on which OCR fails completely. – user291737 – 2015-03-30T12:48:47.680
If the quality of your documents varies a lot and you want a "failure rate below perhaps 0.25%", you need a professional scanner (hardware + driver + software). To get all this for 500 GBP, you need to consider buying a used professional scanner. – user291737 – 2015-03-30T13:24:07.317
It is false that answers to this question will tend to be "almost entirely based on opinions": see user291737's answer. I have responded to that user's comments by editing the question. – Croad Langshan – 2015-04-02T00:15:46.447
Don't forget that you lose information when you switch to an electronic format! Personal documents have characteristics that help you find them later, e.g. storage type (folders, boxes, file racks, drawers in all kinds of materials/colors), location (shelves, cabinets in different rooms or even outside your house/flat), format size, and so on. Usually you know more or less where to look. In electronic form all these visual cues are lost! All folders have the same color and shape, and all documents share the same few icons. You should not underestimate this. – user291737 – 2015-04-03T14:18:52.617
@user291737 Thanks, I do appreciate that. My intent is 1. to get rid of old boring documents like old phone bills and 2. to help with disaster recovery (fire etc.). I don't intend to get rid of all of the old documents, mostly for the reasons you cite. – Croad Langshan – 2015-04-03T14:24:57.537
What I wanted to indicate with my remark above: you might need to consider beforehand how to replace the missing cues with other ones (e.g. OCR) so that you can later find documents among your thousands of scans. And the decision about OCR has an impact on your scanning equipment. Aiming for human-readable scans only will cost you a lot of time when you need to find documents later. – user291737 – 2015-04-03T14:36:14.427
@user291737: Edited question to emphasize disaster recovery and getting rid of boring bulky documents (I should have done that to start with...) – Croad Langshan – 2015-04-03T15:39:42.747
I know that you are not looking for product recommendations but in the end information that shortens your decision making process might be helpful. PC Magazine reviews scanners on a regular basis (www.pcmag.com/reviews/scanners). It offers a relatively wide overview of scanners and their pros and cons in comparison. (I am not linked to PC Mag) – user291737 – 2015-04-03T16:29:51.957
There's a system occasionally advertised on TV designed for this kind of application (http://www.tryneat.com/site/tryneat/home.html; probably available from places like Amazon). It's a sheetfed scanner with feeders for different sized documents. It's optimized for these kinds of documents (and does double-sided scanning). However, it also does OCR as part of the process and the software does automated filing of the results. If you simply scan a couple of thousand sheets, you will never find a specific one if you actually need it. I've never used it, but it looks great on TV. – fixer1234 – 2015-04-03T18:12:30.223
One other thought: you might be scanning way more than you need to. At least in the US, most things like bills and receipts have no purpose after varying times, but generally 5 yrs is the upper limit on those. Various legal documents should be retained longer, you might want to keep things like medical records, etc. Research document retention standards where you live. If your documents are 10-20 years old, it could be that a shredder would be more useful than a scanner. – fixer1234 – 2015-04-03T18:19:23.590
+1 to fixer1234. Your shredder and paper bin are your best friends for saving a lot of time! – user291737 – 2015-04-03T22:12:07.417
@fixer1234: thought takes time too :-) – Croad Langshan – 2015-04-05T19:11:03.507