3

I have a rather large project which will ultimately benefit society, and I'm looking for all the help I can muster. I have about 130,000 pages that need to be digitized. Many of them are in packages held together with staples, or are on paper that is 40-plus years old (and quite thin compared to today's paper). Some of it is oddly sized (full-size legal, maps, and small postcard sizes). However, we have only ~10 days to process this work once we arrive on site. We could work through the night.

I have a team of 6 and a relatively small budget to accomplish this task. We've considered modern sheet-fed scanners (such as a Fujitsu ScanSnap), which can process pages at ~25 ppm (pages per minute), but we are concerned about pages being torn or caught, and we are trying not to jeopardize the originals. There is also the question of the staples (which could be removed). We could use a flatbed, but whoa, that's a huge job to do manually! We could always do this for the very large pieces.

I'm hoping you folks have some very clever ideas on how to accomplish this. Thank you so much for your time and help.


EDIT: It seems that a combination approach (sheet-fed scanner for ordinary paper + vertical copy stand for everything else) would work best to ensure the required pages per minute. One offline suggestion: a photocopier? What would happen if we simply photocopied the whole collection first, then either had the copier send a digital file onwards, or ran the photocopies through a scanner? It seems like double work to me, but I'm not familiar enough with the guts of the tech to know better.

Gryph

3 Answers

6

If you simply need facsimiles of these and do not care so much about perfect presentation, consider a camera attached to a vertical copy stand.

Guaranteed not to jam, easily adjusted for different media, reasonably straight for OCR, and far faster than a consumer flatbed.

A homemade one can be quite cheap and you can then simply drop the stack under the camera, adjust the camera so that the frame is maximally filled, and then start flipping the pages, taking a shot of each.

Auto-focus should handle any depth change, and you would never need to remove the staples/binders/etc.

Might be cheap enough you can get all 6 people working cameras.

Two things to bear in mind:

As a worst case, an 8.5 x 11 in page at 150 ppi in RGB, filled with random noise, is going to be about 1 MB as a compressed JPEG. 130,000 such pages come to roughly 130 GB, so you are going to need at least 200 GB of free storage.

130,000 pages / 6 people / 10 days / 8 hours per day / 60 minutes per hour ≈ 4.5, so call it 5 scans per minute per person. I think this is doable for a camera, but not for a consumer-grade flatbed scanner.
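The two estimates above can be checked with a quick calculation. This is only a sketch of the arithmetic: the function names are made up for illustration, and the 1 MB per page figure is the worst-case assumption from this answer, not a measurement.

```python
def scans_per_minute(pages, people, days, hours_per_day):
    """Per-person rate needed to finish all pages in the available time."""
    total_minutes = people * days * hours_per_day * 60
    return pages / total_minutes


def storage_gb(pages, mb_per_page=1.0):
    """Total storage in GB, assuming a fixed average file size per page."""
    return pages * mb_per_page / 1000


rate = scans_per_minute(pages=130_000, people=6, days=10, hours_per_day=8)
print(f"required rate: {rate:.2f} scans/min per person")
print(f"time budget:   {60 / rate:.1f} s per page")
print(f"storage:       {storage_gb(130_000):.0f} GB before any headroom")
```

Changing `hours_per_day` shows how much slack working longer shifts would buy.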

Yorik
5

I can't answer what scanner to get. I can, however, speak from experience as an ex-worker who prepared, scanned and archived documents of all shapes and sizes: paper is rarely fragile, and any tears are hard to spot in the digital copy.

Staples are a pain to deal with, depending on how important the corners are. If it is important that the corners not be damaged, it can take 4-15 seconds to remove one staple, depending on how stubborn it is. Some also like to explode, so please cover the staple with your hand to avoid eye damage.
There are two different kinds of tools for removing staples: one with metal teeth, and one that is just a kind of stick that you slide under the staple and keep sliding until the staple is out.
The toothed one is way slower but rarely tears the paper; the sliding one is fast but is more likely to tear the corner.

An experienced team would handle 130K papers in 150-225 man-hours; an inexperienced team might take double that, depending on how the paper load needs to be handled. But the important part is to always keep the scanner running.
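To put those man-hour figures next to the asker's constraints (6 people, 10 days), here is a rough sketch. The function names are hypothetical, and it assumes the man-hours are spread evenly across the team, which a real workflow with a shared scanner may not allow.

```python
def implied_ppm(pages, man_hours):
    """Average pages per minute of labour implied by a man-hour estimate."""
    return pages / (man_hours * 60)


def hours_per_person_per_day(man_hours, people, days):
    """Daily workload per person if the man-hours are split evenly."""
    return man_hours / (people * days)


PAGES, PEOPLE, DAYS = 130_000, 6, 10

# 150-225 man-hours for an experienced team, doubled for an inexperienced one.
for label, man_hours in [
    ("experienced, low", 150),
    ("experienced, high", 225),
    ("inexperienced, low", 300),
    ("inexperienced, high", 450),
]:
    ppm = implied_ppm(PAGES, man_hours)
    daily = hours_per_person_per_day(man_hours, PEOPLE, DAYS)
    print(f"{label:20s} {ppm:5.1f} pages/min, {daily:4.2f} h per person per day")
```

Under these assumptions even the doubled estimate fits into ordinary working days for a team of 6.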

The advice I would give about the scanner and scanning is that it's very important to provide the workload to the person who is scanning in an efficient way. Collect the papers and run them together with some separators between the different documents. Split the documents in post-processing if the scanner can't do it live.
You're really going to need a "paper jogger" to keep misaligned papers from going into the machine skewed. It is way faster and gives better results than a human simply shaking the papers. But I only have experience with one machine, so I don't know how to tell a good one from a bad one without using it (if there are bad ones).
It's more important to have a scanner which is easy to load than it is to have a high ppm rate (everything is relative). If you can't feed a 25 ppm scanner 25 pages every minute, then you're not really getting 25 ppm worth of work. You really want to be able to load hundreds of papers at once to keep the machine rolling.
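The point about loading can be made concrete with a small model of effective throughput. This is only a sketch: the function name, the 30-second reload time, and the batch sizes are illustrative assumptions, not figures from any particular scanner.

```python
def effective_ppm(rated_ppm, batch_size, reload_seconds):
    """Pages per minute once the idle time between batches is included."""
    scan_minutes = batch_size / rated_ppm
    reload_minutes = reload_seconds / 60
    return batch_size / (scan_minutes + reload_minutes)


# Same 25 ppm scanner, same assumed 30 s to reload, different feeder capacity.
for batch in (50, 500):
    ppm = effective_ppm(rated_ppm=25, batch_size=batch, reload_seconds=30)
    print(f"{batch:3d}-page batches: {ppm:.1f} effective ppm")
```

With these assumed numbers the small feeder loses about a fifth of the rated speed to reloading, while the large one loses very little.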

If there are any more things you're wondering about, I'll try to answer those too.

3

A few thoughts on removing staples

For standard document scanners you need to remove staples.

If the paper edge next to the staple does not contain any information, you could consider just cutting the edge off together with the staple. The simplest and fastest way is to use a paper cutter with a lever. Rotary paper cutters are less ergonomic and slower for that purpose. With your number of stapled documents you will soon get sore fingers if you use scissors, especially if you have thicker stapled documents.

If you want to retain the edges, you can choose among quite a number of different shapes of staple removers. To remove hundreds of staples, a plier-shaped staple remover probably offers the best ergonomics and is the safest for the paper originals. The advantage is that it has a lever, so you need less force. Jaw-shaped removers do not have a lever. As a consequence, you need much more force and will soon get a cramp in your hand and sore muscles in your arm; the same goes for tongue-shaped staple removers. The risk of damaging the paper is very high with jaw-shaped ones, and a bit lower with tongue-shaped ones. With jaw-shaped ones you often need to "bite" under the staple from both sides of the paper pile, especially if the pile is thick and the staple long. In that case, it will take you a long time to get the staple out.

With a good plier-shaped staple remover, one "bite" from the top side of the paper pile is often enough to remove the staple in one go. With the remover I use (Skrebba skre-klick), the risk of paper damage is minimal, as is the force needed. But there might be others out there that are as good. With such a staple remover you are easily twice as fast as with the other two types mentioned, and you will rarely damage the paper.

Examples of staple removers mentioned above:

"Plier-shaped" [image]

"Jaw-shaped" [image]

"Tongue-shaped" [image]