
Is there a way to edit text that was recognized incorrectly?

Grooper added a space to the recognized text. Is there a way to delete that space?

The number on the document was 111.12, but it was recognized as 111.1 2.

Best Answer

  • RandoCalrisian Posts: 195 admin
    Answer ✓
    Short answer is, not easily, no.
    What is the significance of that number? What are you doing with it?
    Can I assume you're extracting it somehow? If so, fuzzy logic would easily remove that space.
    Randall Kinard

Answers

  • Aston Posts: 17
    It's part of an ordered array that looks for a number with two decimal places, so it wasn't picking up the line. I changed the "Native Text Extraction" setting in the Recognize step, and it was then able to read the number correctly.

    Thanks
  • RandoCalrisian Posts: 195 admin
    Can I assume this was an electronic document that was being OCRed, and that Native Text Extraction is now giving you the proper text instead of the erroneous text OCR produced?
    If you'd like a solution involving Fuzzy Logic please let me know.
    Randall Kinard

  • Aston Posts: 17
    Yes. Some of the electronic documents we have been getting recently were created with a lot of the text stored as images (or something similar), so we're just OCRing everything right now.

    Would the fuzzy logic be set up in the extractor by using fuzzyregex?
  • RandoCalrisian Posts: 195 admin
    Yes.
    Your expression might look something like (keep in mind, you can't use infinite quantifiers with Fuzzy on):
    [0-9]{3}[.][0-9]{2}

    From there (assuming you're on Grooper 2021), the Fuzzy Matching property would be Enabled. Then in your Weightings property you could use the following:
    Delete( )=0.25

    This would make deleting a space inexpensive so you could keep your Minimum Similarity high.
    Randall Kinard
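
To see why a cheap space deletion lets you keep Minimum Similarity high, here is a rough Python sketch of the idea. It is not Grooper's implementation: the function names (weighted_distance, similarity) are made up for illustration, and the similarity formula (1 minus cost divided by the longer string's length) is an assumption, so the real scores in Grooper may differ.

```python
import re

# The exact (non-fuzzy) expression from the post above.
PATTERN = re.compile(r"[0-9]{3}[.][0-9]{2}")


def weighted_distance(source, target, delete_costs=None):
    """Levenshtein distance from source to target.

    Deleting a character listed in delete_costs uses that cost;
    every other insert, delete, or substitute costs 1.0.
    """
    delete_costs = delete_costs or {}
    # Row 0: building target[:j] from an empty source takes j inserts.
    prev = [float(j) for j in range(len(target) + 1)]
    for s_ch in source:
        del_cost = delete_costs.get(s_ch, 1.0)
        curr = [prev[0] + del_cost]
        for j, t_ch in enumerate(target, start=1):
            sub_cost = 0.0 if s_ch == t_ch else 1.0
            curr.append(min(
                prev[j] + del_cost,      # delete s_ch from the source
                curr[j - 1] + 1.0,       # insert t_ch
                prev[j - 1] + sub_cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


def similarity(source, target, delete_costs=None):
    """Assumed formula: 1 - cost / length of the longer string."""
    longest = max(len(source), len(target)) or 1
    return 1.0 - weighted_distance(source, target, delete_costs) / longest


ocr_text = "111.1 2"   # what OCR produced
intended = "111.12"    # what was on the document

print(PATTERN.fullmatch(ocr_text))                            # None
print(round(similarity(ocr_text, intended), 3))               # 0.857
print(round(similarity(ocr_text, intended, {" ": 0.25}), 3))  # 0.964
```

Under these assumptions, the stray space costs a full edit by default and drops the score to about 86%, while weighting the space deletion at 0.25 keeps it above 96%. That is the reason a threshold such as 90% can still accept "111.1 2" without also accepting values that have a wrong digit.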
