Grid Extractor - How To Identify Row Headers

tcoates · February 2018

Does someone have an example of how to configure a Grid extractor for a table? I am trying to understand how to define the X and Y extractors.

I am trying to improve a table extraction where values in some columns are missing (meaning there is nothing printed on the document). I have made some of the data groups variable, but since I do not know which columns might be blank, the row extractor does not seem to be the best option.

For example:
Column1  Column2   Column3  Column4
Date     Item      Cost     Comments
1/1/18   WidgetA   $23.00   Outdoor Rated
1/2/18             $45.00   Not In Stock
1/3/18   Widget B           Not for Sale
1/4/18   Widget C  -$10.00  Refund Amount
1/5/18   WidgetA   $23.00
1/6/18   WidgetA   $23.00   Enjoy

Thanks

GrooperGuru · February 2018

I'm glad you've brought up this discussion. This is a common problem in table extraction that our row-based logic is not designed to handle elegantly. There are a few options that MAY work in your scenario, though the configuration is pretty difficult to explain. The shortest answer is... this is a MAJOR enhancement in the upcoming version of Grooper (2.7). We will have new methods for capturing table data that are based on horizontal/vertical cross sectioning. These will use header extractors to find the X position of each column, then row extractors to find the Y location of the data. For now, I'm going to PM you a short PowerPoint that explains these new features. We will soon have a full article in xChange, but it is a very big topic to cover.

For now, I may have some workarounds. Do you have time this morning for a 15 minute call?
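
To make the cross-sectioning idea a bit more concrete, here is a rough Python sketch of the general technique. It is not Grooper configuration and not Grooper's API: `Word`, `column_bounds`, and `extract_grid` are hypothetical names, the coordinates are made up, and it assumes each value is left-aligned under its column header. Header words give the X range of each column, row anchors (the dates here) give the Y position of each row, and a value lands in whichever cell its position falls into, so a blank cell simply stays empty.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A piece of recognized text with its position (hypothetical type)."""
    text: str
    x: float  # left edge of the word
    y: float  # vertical position of the line the word sits on


def column_bounds(headers):
    """Turn header words into (name, x_start, x_end) spans.

    Assumption: each column runs from its header's left edge to the
    left edge of the next header; the last column is open-ended.
    """
    ordered = sorted(headers, key=lambda w: w.x)
    bounds = []
    for i, header in enumerate(ordered):
        x_end = ordered[i + 1].x if i + 1 < len(ordered) else float("inf")
        bounds.append((header.text, header.x, x_end))
    return bounds


def extract_grid(headers, row_anchors, words, y_tol=2.0):
    """Cross-section rows (Y) with columns (X) to fill cells."""
    cols = column_bounds(headers)
    rows = []
    for anchor in sorted(row_anchors, key=lambda w: w.y):
        row = {name: "" for name, _, _ in cols}
        # Words on the same line as the anchor, read left to right.
        line = sorted(
            (w for w in words if abs(w.y - anchor.y) <= y_tol),
            key=lambda w: w.x,
        )
        for w in line:
            for name, x_start, x_end in cols:
                if x_start <= w.x < x_end:
                    row[name] = (row[name] + " " + w.text).strip()
                    break
        rows.append(row)
    return rows


# Made-up positions for two rows of the sample table above.
headers = [Word("Date", 0, 0), Word("Item", 60, 0),
           Word("Cost", 140, 0), Word("Comments", 200, 0)]
words = [
    Word("1/2/18", 0, 20), Word("$45.00", 140, 20),
    Word("Not", 200, 20), Word("In", 225, 20), Word("Stock", 240, 20),
    Word("1/3/18", 0, 30), Word("Widget", 60, 30), Word("B", 100, 30),
    Word("Not", 200, 30), Word("for", 225, 30), Word("Sale", 250, 30),
]
row_anchors = [w for w in words if w.x == 0]  # the date starts each row
rows = extract_grid(headers, row_anchors, words)
```

With this sample data, the 1/2/18 row ends up with an empty Item and the 1/3/18 row with an empty Cost, which is the case a purely left-to-right row pattern has trouble with. Real documents would need tolerance for right-aligned or centered values.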
