Extract Step Performance

I added a few lookup translations to my field extractor, which now causes a performance issue.  Will increasing the number of threads for the Grooper activity processing services reduce the extraction time?  Is the thread-to-core ratio 1 to 1?

Best Answer


  kylesouza
    edited May 27
    The thread-to-core ratio is usually 2 to 1 (two threads per 64-bit core).
    As for your first question, I would think so, but I am not sure.
    Kyle Souza
    Data Wizard
    P&P Oil & Gas Solutions
  GrooperGuru
    Increasing threads will not make any one document extract quicker, but it will allow you to run more documents simultaneously. The extraction task for an entire document is performed by a single thread. If I'm understanding correctly, all you changed was enabling translation within the Lookup Options of one or more Data Types. If that is the case, it really shouldn't have caused any measurable performance change. Is this the change you made, or is it something else that actually references an external database?
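    To make the threads point concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch, not anything taken from Grooper itself: the function names are made up for illustration, and it assumes every document takes the same time to extract and that there are enough cores to keep every thread busy.

```python
import math


def batch_time(num_docs: int, seconds_per_doc: float, threads: int) -> float:
    """Estimated wall-clock time for a batch when each document is
    extracted start to finish by exactly one thread (hypothetical model)."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # Documents are worked through in "waves" of up to `threads` at a time.
    waves = math.ceil(num_docs / threads)
    return waves * seconds_per_doc


def single_doc_time(seconds_per_doc: float, threads: int) -> float:
    """Extraction time for one document. The thread count is accepted
    but deliberately unused: a single thread does all the work."""
    return seconds_per_doc


if __name__ == "__main__":
    for threads in (1, 2, 4):
        print(threads, batch_time(8, 10.0, threads), single_doc_time(10.0, threads))
```

    Under those assumptions the batch finishes sooner as threads are added, while the per-document time stays flat. So if one document became slow after the change, more threads will hide that in overall throughput but will not fix it.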
    Matt Harrison
    Director of Strategy
  henryma
    Once I extract my pattern, I use that value to perform a lookup translation against my lexicon to make sure it is a valid value.  Once I added that lookup translation, the performance issue appeared.
