Permanently Change Bad OCR for Text Searchable PDF

RandoCalrisian · January 2018

Grooper obviously has several methods of getting around poor OCR at extraction, but the bad OCR result is never actually changed in the source, just compensated for.
If I want to deliver a Text Searchable PDF that has said bad OCR, is there a way to correct the source, not just to extract a corrected result?

GrooperGuru · January 2018

http://xchange.grooper.com/discussion/102/ocr-correction

I moved my writeup to Deep Dives.

GrooperGuru · January 2018

Yes sir. The OCR correction activity has two modes of operation. One is designed to improve the text results in a specific data field after extraction and/or data review is completed. The other targets the actual source OCR text. In that mode, you will generally want to run the activity immediately after OCR so that all classification and extraction are performed against the improved text. I'll post some screenshots and additional info in the morning when I return to my computer.
