
Ignoring certain table cell values (return blank)

We have a table on one of our forms. It looks like this:



As you can see, the second and third columns expect a date in the form MM/YYYY. Unfortunately, many people who fill in the form write something like 'N/A', 'NA' or '-' in the field. This causes validation issues because our Grooper configuration expects a date in that field, which is also what our downstream systems want. We export the data directly to those systems using Grooper's XML export.

The table is pretty standard and is defined as follows. It is an 'Infer Grid' table that does per-cell OCR using the lines in the table.


The From column is defined as follows. I tried to add a value extractor, but Grooper seems to ignore it when it is doing the cell-level OCR.


I could add a new set of 'shadow columns' and hide the original columns. I appended an X to the names of the shadow columns. The shadow columns are just calculated values like If(From="N/A","",From)
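The calculated expression above only covers 'N/A'. As a rough illustration of the same idea extended to all three filler values mentioned earlier, here is a minimal sketch in plain Python. It is not Grooper expression syntax, and the names `FILLER_VALUES` and `blank_if_filler` are hypothetical, chosen only for this example.

```python
# Hypothetical set of filler values people write instead of a date.
# Taken from the examples in this post: 'N/A', 'NA' and '-'.
FILLER_VALUES = {"N/A", "NA", "-"}


def blank_if_filler(cell_text: str) -> str:
    """Return an empty string for filler values, otherwise the trimmed text."""
    cleaned = cell_text.strip()
    # Compare case-insensitively so 'n/a' and 'Na' are also treated as filler.
    if cleaned.upper() in FILLER_VALUES:
        return ""
    return cleaned
```

In Grooper itself the equivalent logic would still have to live in a calculated column, so this sketch does not solve the Data Review navigation problem described below.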


I can hide the original columns so that only the calculated columns show, but I don't like this solution: when you are in Data Review, Grooper does not jump to the extract location in the document because the value is calculated.

Any ideas on how I can replace the 'filler values' with blank values for the OCR read? It would be really easy if I was somehow going through an extractor after the cell level OCR.

Answers

  • GrooperGuru Posts: 481 admin
    I would just run a value extractor on the original columns that only finds valid date formats. Essentially, it would return nothing if it can't find a "valid" date. Then set the default value for those columns to N/A.
    Matt Harrison
    Product Manager
    mharrison@bisok.com
  • hjanum Posts: 110 ✭✭
    I would love to do that, but with the "Rubberband OCR Profile" feature used on this column, Grooper seems to ignore any Extractor I configure for the column at the top of the column definition. Any idea how to get an extractor involved when doing an "Infer Grid" type table?

    As I mentioned, any 'phantom columns' I configure that are calculated or derived using extractors are not user friendly, because when you are in Data Review, Grooper does not jump to the extract location in the document.
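For reference, the "only finds valid date formats" idea from the first answer comes down to a pattern that accepts MM/YYYY and returns nothing for everything else. The sketch below shows that pattern in plain Python. It assumes a two-digit month (01 to 12) and a four-digit year, and the names `MONTH_YEAR` and `extract_month_year` are hypothetical. Whether Grooper applies such a pattern to cells read through the Rubberband OCR Profile is exactly the open question in this thread, so treat this only as an illustration of the matching logic.

```python
import re

# Assumed date shape: two-digit month 01-12, a slash, then a four-digit year.
MONTH_YEAR = re.compile(r"(0[1-9]|1[0-2])/\d{4}")


def extract_month_year(cell_text: str) -> str:
    """Return the cell text if it is a valid MM/YYYY date, else an empty string."""
    cleaned = cell_text.strip()
    # fullmatch requires the whole cell to be a date, so 'N/A', 'NA' and '-'
    # all fall through to the empty-string result.
    if MONTH_YEAR.fullmatch(cleaned):
        return cleaned
    return ""
```

With this approach, filler values never need to be listed explicitly: anything that is not a well-formed date simply produces a blank.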