classify for document type Amortization Schedule.
DungVu
Posts: 88 ✭
I'm having trouble setting up classification for the document type "Amortization Schedule". Is there any special setting for training this type of document?
Best Answers
GrooperGuru
Posts: 481 admin
@DungVu I'm guessing from the screenshot you posted that the document is not classifying correctly because the confidence score is too low. There is no single right way to correct this, but here are a couple of quick options.
- The Amortization Schedule doc type is already the top choice in the candidate list, but at only 55%. You could lower the "Minimum Similarity" property on the Content Model itself to something less than 55%. However, this could lead to other documents being falsely classified as something they are not.
- You could add a "Positive Extractor" on the Amortization Schedule Doc Type that finds the title of the document. I would use a pattern like:
([\n\f]|^)Amortization Schedule\r
This may or may not be the right choice. If that title is on every page of that document, and you are doing ESP style separation, you would then also need to set this doc type to Combine Contiguous Documents so that all pages go together as one document. However, if there truly are two distinct amortization schedules back to back in a batch, then they would get lumped together as one document.
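If you want to sanity-check the title pattern before putting it on the Doc Type, a minimal sketch like the one below may help. It uses Python's `re` module purely for illustration; Grooper has its own pattern engine, which may treat anchors and line endings differently. The sample page strings and the `has_title_line` helper are made up for this example.

```python
import re

# Pattern copied from the post above. Python's `re` is used only for
# illustration; Grooper's own engine may behave differently.
TITLE_PATTERN = re.compile(r"([\n\f]|^)Amortization Schedule\r")


def has_title_line(page_text: str) -> bool:
    """Return True if the title starts a line (or page) and is followed by a carriage return."""
    return TITLE_PATTERN.search(page_text) is not None


# Hypothetical OCR text samples (assumed to use \r\n line endings).
page_with_title = "Amortization Schedule\r\nLoan Amount: 10,000\r\n"
page_after_form_feed = "Page 1 of 2\r\n\fAmortization Schedule\r\n"
page_mid_line = "See the attached Amortization Schedule\r\n"
page_lf_only = "Amortization Schedule\nLoan Amount: 10,000\n"

for name, text in [
    ("title at start of page", page_with_title),
    ("title after form feed", page_after_form_feed),
    ("title in the middle of a line", page_mid_line),
    ("title with LF-only line ending", page_lf_only),
]:
    print(f"{name}: {has_title_line(text)}")
```

Two things stand out in this sketch: the pattern only hits when the title begins a line or page, so a passing mention inside a sentence is ignored, and it expects a carriage return right after the title, so text with LF-only line endings would not match.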
Matt Harrison
Product Manager
mharrison@bisok.com
Answers