Ignoring certain table cell values (return blank)
hjanum
Posts: 113 ✭✭
We have a table on one of our forms. It looks like this:
As you can see, the second and third columns expect a date in the form MM/YYYY. Unfortunately, many people who fill in the form write something like 'N/A', 'NA' or '-' in the field instead. This causes validation issues, because our Grooper configuration expects a date in that field, as do our downstream systems. We export the data directly to those systems using Grooper's XML export.
The table is pretty standard. It is defined as follows. This is an 'Infer Grid' table that does per-cell OCR using the lines in the table.
The From column is defined as follows. I tried to put a value extractor in, but Grooper seems to ignore it when you are doing cell-level OCR.
I could add a new set of 'shadow columns' and hide the original columns. I appended an X to the names of the shadow columns. The shadow columns are just calculated values like If(From="N/A","",From)
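To make the intended behavior concrete, here is a rough sketch of the clean-up rule in Python. It is not Grooper expression syntax, and the names clean_cell and is_valid and the list of filler strings are my own assumptions for illustration: treat a cell as blank when it holds only a filler value, otherwise keep the text and check it against MM/YYYY.

```python
import re

# Filler strings seen on the forms ('N/A', 'NA', '-').
# The pattern is an assumption; extend it if other fillers turn up.
FILLER = re.compile(r"^\s*(n\s*/?\s*a|-+)\s*$", re.IGNORECASE)

# The MM/YYYY format the date columns expect.
DATE = re.compile(r"^(0[1-9]|1[0-2])/\d{4}$")


def clean_cell(raw: str) -> str:
    """Return '' for filler values, otherwise the trimmed cell text."""
    if FILLER.match(raw):
        return ""
    return raw.strip()


def is_valid(value: str) -> bool:
    """A cleaned cell passes if it is blank or a well-formed MM/YYYY date."""
    return value == "" or bool(DATE.match(value))


if __name__ == "__main__":
    for cell in ["N/A", "NA", "-", " 03/2021 ", "13/2021"]:
        cleaned = clean_cell(cell)
        print(repr(cell), repr(cleaned), is_valid(cleaned))
```

This is the same logic as the If(...) expression above, just covering all three filler values at once.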
I can hide the original columns so that only the calculated columns show, but I don't like this solution: in Data Review, Grooper does not jump to the extraction location in the document, because the value is calculated.
Any ideas on how I can replace the 'filler values' with blank values for the OCR read? It would be really easy if I could somehow run the value through an extractor after the cell-level OCR.
Answers
Product Manager
mharrison@bisok.com
As I mentioned, any 'phantom columns' I configure that are calculated or derived using extractors are not user friendly, because in Data Review Grooper does not jump to the extraction location in the document.