OCR Image based PDF

Question

Possible Duplicate:
Extracting text from a .PDF scanned book
How to do OCR on a PDF document?

I've got a >200-page PDF manual that was produced by scanning a hard copy. I'd like to convert it to a searchable text format, but I'm not having any success finding a tool to do so. Google's search results are highly polluted with crippleware trial software that can only process the first few pages of a file. The only truly free application I found, FreeOCR's PDF renderer, fails to handle anything beyond the first few pages of the file.

Google's PDF viewer does OCR, but it doesn't appear to provide any export option other than copy/paste. Besides being very tedious, what it puts on the clipboard is only plain text, which means I'd lose all of the line art and the significant formatting that depends on horizontal placement.

score 2 · Answer 1 · answered May 20 '12 at 16:19

Upload your PDF to Google Drive (Docs) with your upload conversion settings set to convert images to text and to convert the document to a Google Doc (this can all be done at upload). You should then be able to open the doc, click File > Download as, and select the format you want.

I just did this with a magazine page and it worked okay, though not all of the fonts were recognised.
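
If you'd rather script the upload-and-download steps than click through them, something like the following might work. It is only a rough sketch and I haven't run it against a 200-page scan. It assumes the Google Drive API v3 and the google-api-python-client package, and that you already have OAuth credentials (`creds`) from somewhere. The function names, the format table, and the `language` default are my own, not anything Drive prescribes. The idea is the same as above: upload the PDF while asking for conversion to a Google Doc, then export that Doc in the format you want.

```python
"""Sketch: upload a scanned PDF to Drive as a Google Doc, then export it."""

# Target MIME type that asks Drive to convert the upload into a Google Doc.
GOOGLE_DOC_MIME = "application/vnd.google-apps.document"

# A few formats a Google Doc can be exported as (illustrative subset).
EXPORT_MIME_TYPES = {
    "docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "odt": "application/vnd.oasis.opendocument.text",
    "pdf": "application/pdf",
    "txt": "text/plain",
}


def conversion_metadata(title):
    """File metadata that requests conversion to a Google Doc on upload."""
    return {"name": title, "mimeType": GOOGLE_DOC_MIME}


def export_mime_type(fmt):
    """Map a short format name such as 'docx' to its export MIME type."""
    try:
        return EXPORT_MIME_TYPES[fmt.lower()]
    except KeyError:
        raise ValueError(f"unsupported export format: {fmt!r}") from None


def ocr_pdf_via_drive(creds, pdf_path, out_path, fmt="docx", language="en"):
    """Upload pdf_path with conversion, export the result to out_path."""
    # Imported here so the helpers above work without the client library.
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaFileUpload, MediaIoBaseDownload

    service = build("drive", "v3", credentials=creds)

    # Upload the PDF and ask Drive to convert it to a Google Doc.
    media = MediaFileUpload(pdf_path, mimetype="application/pdf")
    uploaded = service.files().create(
        body=conversion_metadata(pdf_path),
        media_body=media,
        ocrLanguage=language,  # language hint for the OCR step
        fields="id",
    ).execute()

    # Export the converted Doc in the requested format, chunk by chunk.
    request = service.files().export_media(
        fileId=uploaded["id"], mimeType=export_mime_type(fmt)
    )
    with open(out_path, "wb") as fh:
        downloader = MediaIoBaseDownload(fh, request)
        done = False
        while not done:
            _status, done = downloader.next_chunk()
    return uploaded["id"]
```

You would call it as `ocr_pdf_via_drive(creds, "manual.pdf", "manual.docx")`, where both file names are placeholders. Expect the same caveats as the manual route: recognition quality depends on the scan, and layout may not survive the conversion.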
