Document separation by value!
DungVu
Currently we have a step in the process that separates documents on a change in value of the invoice number. In the above screenshot, page 2 is actually an invoice, but its invoice number was missed, so the page was appended to the previous folder. We tried setting up a multi separator with an additional event-based separator for a page count of 1, but that doesn't work either: it puts the loose page in a folder for classification, but the multi-page invoice gets split into single-page folders. Please give us some direction or a method for setting this up so that the loose page ends up in its own folder and each multi-page invoice stays in its own folder.
Thank you.
Answers
So it should be working something like this:
--- New Document ---
Page 1 - invoice number:01
Page 2 - invoice number:01
--- New Document ---
Page 3 - invoice number:02
--- New Document ---
Page 4 - invoice number:03
Page 5 - invoice number:03
Page 6 - invoice number:03
--- New Document ---
Page 7 - invoice number:04
But it's doing something like this?
--- New Document ---
Page 1 - invoice number:01
Page 2 - invoice number:01
Page 3 - invoice number:02
--- New Document ---
Page 4 - invoice number:03
Page 5 - invoice number:03
Page 6 - invoice number:03
Page 7 - invoice number:04
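To picture why the single-page invoices end up glued to the previous folder, here is a rough Python sketch of change-in-value separation. It is only a toy model, not Grooper's actual implementation: the function name `separate_by_value_change` is made up for illustration, and `None` stands for a page where the extractor returned nothing. The assumption is that a page with no detected value cannot trigger a split, so it is appended to whatever folder is currently open.

```python
from typing import List, Optional


def separate_by_value_change(values: List[Optional[str]]) -> List[List[int]]:
    """Group 1-based page numbers into folders.

    A new folder starts whenever a detected value differs from the last
    detected value. Pages with no detected value (None) never start a
    folder (assumption made for this illustration).
    """
    folders: List[List[int]] = []
    last_value: Optional[str] = None
    for page_number, value in enumerate(values, start=1):
        value_changed = value is not None and value != last_value
        if value_changed or not folders:
            folders.append([])  # open a new folder
        if value is not None:
            last_value = value  # remember the last value actually seen
        folders[-1].append(page_number)
    return folders


# Pages 3 and 7 carry invoice numbers 02 and 04, but the extractor misses them.
detected = ["01", "01", None, "03", "03", "03", None]
print(separate_by_value_change(detected))  # [[1, 2, 3], [4, 5, 6, 7]]
```

With the two missed values, pages 3 and 7 ride along with the invoices before them, which matches the second layout above.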
If your separation extractor isn't detecting the value on some pages, you'll want to add another Data Type that finds something unique to the pages you want to separate on. You'd just need to look at more examples of that document to see how consistent they are. The Facsimile example page doesn't have a label to anchor the Invoice Number to, so the pattern would need to be pretty specific to avoid grabbing other numbers.
The key is to identify anything on the page that tells you that you're looking at page 1 of a document. You already have the "Change in Value" method for Invoice number, creating a new folder whenever a new invoice number appears. You could add a pattern-based separation extractor that finds and splits on any instance of the phrase "PAGE 1 of". You can then add new patterns to this extractor whenever new formats come in that fail to separate.
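As a rough illustration of combining the two triggers, the Python sketch below splits on either a change in invoice number or a first-page phrase. This is not Grooper configuration or its internal logic. The page texts, the `invoice number:` label, the two regular expressions, and the function name `separate_pages` are all assumptions made up for the example.

```python
import re
from typing import List, Optional

# Hypothetical patterns; real documents will need their own.
FIRST_PAGE = re.compile(r"\bPAGE\s+1\s+of\b", re.IGNORECASE)
INVOICE_NO = re.compile(r"invoice number:\s*(\d+)", re.IGNORECASE)


def separate_pages(pages: List[str]) -> List[List[int]]:
    """Start a new folder when the invoice number changes OR when the
    page text contains a first-page marker such as "PAGE 1 of"."""
    folders: List[List[int]] = []
    last_value: Optional[str] = None
    for page_number, text in enumerate(pages, start=1):
        match = INVOICE_NO.search(text)
        value = match.group(1) if match else None
        value_changed = value is not None and value != last_value
        is_first_page = FIRST_PAGE.search(text) is not None
        if value_changed or is_first_page or not folders:
            folders.append([])  # either trigger opens a new folder
        if value is not None:
            last_value = value
        folders[-1].append(page_number)
    return folders


pages = [
    "invoice number:01 PAGE 1 of 2",
    "invoice number:01 PAGE 2 of 2",
    "FACSIMILE PAGE 1 of 1",  # invoice number not detected
    "invoice number:03 PAGE 1 of 3",
    "invoice number:03 PAGE 2 of 3",
    "invoice number:03 PAGE 3 of 3",
    "Page 1 of 1",  # invoice number not detected
]
print(separate_pages(pages))  # [[1, 2], [3], [4, 5, 6], [7]]
```

The second trigger catches the single-page invoices whose numbers were missed, while the multi-page invoices stay together because "PAGE 2 of" and "PAGE 3 of" do not match the first-page pattern. New formats that fail to separate would be handled by adding more first-page patterns.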