Context Scope for Feature Extractor

Is there a way to limit how far the engine reads when using the 'Flow' Context Scope? I'm using the 'After' Direction Filter, but that seems to encompass the entire document. In this case, I know the feature I'm after should be within just a few words of the Value that's been found.

So, if there were some way to set a maximum number of words and/or maximum line distance to consider when using the 'Flow' Context Scope, that would be quite nice. If that is already possible and I am just missing it, please let me know.

Thanks.

Answers

  • rmccutcheon Posts: 756 ✭✭✭
    On a Field Class, when you set the Context Scope to Flow, a Max Instance setting will appear. This setting is the maximum number of nearby features to include.
    Here is a screenshot. Let me know whether this answers your question.


  • So, that's not quite what I'm after. That setting caps the number of instances that matched the expression I created for the Feature Extractor.

    What I need is to limit the search for the feature to a certain distance after the value that was found, with that distance measured in words or 'line distance' from the value. So, if I define the Value Extractor to be a Date, then when a Date is found, continue past it looking for what I defined in my Feature Extractor, but stop when you reach the maximum allowed distance. It's similar to the 'Nearest' scope, but instead of a geometric distance that can extend in any direction around the value, I need a linear distance measured in some way that makes sense: words, for instance, or inches, pixels, or points if you measure linearly and wrap from one line to the next.
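
To make the requested behavior concrete, here is a minimal sketch in plain Python, outside Grooper. It is only an illustration of a word-count limit in reading order, not how Grooper works internally. The function name, the simplified date pattern, and the sample text are all hypothetical.

```python
import re

# Simplified, hypothetical date pattern (e.g. "June 29, 1977" or "May 22 1992").
DATE = r"[A-Z][a-z]+ \d{1,2},? \d{4}"

def feature_within_words(text, value_pattern, feature_pattern, max_words):
    """Return (value, feature) pairs where the feature lies entirely within
    the first max_words words after the value, in reading order."""
    results = []
    for value in re.finditer(value_pattern, text):
        rest = text[value.end():]
        # Words after the value; \S+ wraps across line breaks naturally.
        words = list(re.finditer(r"\S+", rest))[:max_words]
        window = rest[:words[-1].end()] if words else ""
        feature = re.search(feature_pattern, window)
        if feature:
            results.append((value.group(), feature.group()))
    return results

# Hypothetical text: the label follows the second date, not the first.
sample = "Born June 29, 1977 in Tulsa.\nLicense renewed May 22 1992\nDate Issued"
```

With this sample, a limit of 3 words pairs 'Date Issued' only with 'May 22 1992', while a limit of 10 words also pairs it with 'June 29, 1977', which is the false match described in this thread.
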
  • [Deleted User] Posts: 0
    edited September 2018
    Here's an example of why the current 'Max Instances' setting won't work. I need 'May 22 1992', but since 'Date Issued' appears within 'max instances' after 'June 29, 1977', that's a 100% match and completely wrong. If I could limit the Flow Context Scope based on a linear distance, a number of lines, or something of that nature, I could correct for that.

    You might be thinking, well, then use Nearest. While that would work for this particular example, I have other values and features where it won't.
  • rmccutcheon Posts: 756 ✭✭✭
    @GrooperGuru can you help answer Steve's question?
  • GrooperGuru Posts: 481 admin
    @srosenhamer I've discussed this scenario with our dev team in the past. I know exactly what you're wanting to do, and I had at least verbally requested that the development team look into it. However, it may never have made its way into the formal feature suggestion queue. I'll get that added. In the meantime, there is a workaround, though it isn't particularly elegant. Here's how it would work in your scenario.

    Essentially, you need to start by building a Data Type. Its goal is to find the results of your value extractor plus the next X characters, words, or lines.
    Once that is created, you need to create another Data Type. This second Data Type will use the first Data Type as its Input Filter. Now, from the second Data Type, you need to reference the Field Class you've already created. This will work exactly as you need it to, but it requires several objects and references in order to function. And depending on the complexity of the actual value you're looking for (like date, SSN, currency, etc.), the first Data Type may be a bit clunky as well.

    Let me know if this does or does not make sense based on my explanation. I can do a web session with you if needed. Thanks. -Matt
    Matt Harrison
    Product Manager
    mharrison@bisok.com
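
The two-stage idea in the workaround above can be approximated with ordinary regular expressions. The following is a rough sketch in plain Python, not Grooper configuration: the first pattern stands in for the first Data Type (the value plus the next few words), and the second pattern is searched only inside that window, loosely mirroring the Input Filter step. The pattern names, the simplified date pattern, the three-word limit, and the sample text are assumptions made for illustration.

```python
import re

# Simplified, hypothetical date pattern (e.g. "June 29, 1977" or "May 22 1992").
DATE = r"[A-Z][a-z]+ \d{1,2},? \d{4}"
MAX_WORDS = 3  # assumed limit on how many words after the value to consider

# Stage 1: the value plus up to MAX_WORDS following words. The tail sits in a
# lookahead so one value's window cannot swallow the next value.
WINDOW = re.compile(
    rf"(?P<value>{DATE})(?=(?P<tail>(?:\s+\S+){{0,{MAX_WORDS}}}))"
)

# Stage 2: the feature, searched only inside the stage 1 window.
FEATURE = re.compile(r"Date\s+Issued")

def dates_with_nearby_feature(text):
    """Return each date whose following window contains the feature."""
    return [
        match.group("value")
        for match in WINDOW.finditer(text)
        if FEATURE.search(match.group("tail"))
    ]

# Hypothetical text: the label follows the second date, not the first.
sample = "Born June 29, 1977 in Tulsa.\nLicense renewed May 22 1992\nDate Issued"
```

For this sample, `dates_with_nearby_feature(sample)` returns only 'May 22 1992', because 'Date Issued' falls outside the three-word window that follows 'June 29, 1977'.
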
  • [Deleted User] Posts: 0
    edited September 2018
    @GrooperGuru Let me ask you a follow-up question: can the scenario you described be used in combination with already defined data types? For instance, the system 'Name' data type is great at finding all the names on a document, but I need to limit its search scope with the methods described. Is that possible, or do I have to reconstruct those name types within the constructs you laid out?
    ** I somewhat answered my own question here. I was able to reference the name extractors in a data type using my other data type as an input filter. Now, if I can figure out how to do that for the question in the next paragraph...

    Follow-up to the follow-up: I need to capture however many names there are together, usually separated by some version of the word 'and', while still limiting my search in a linear fashion as previously described. Is it possible to build upon the system types, or do you have to start from scratch?
