Grooper Help - Version 25.0
25.0.0017 2,127

DedupMode

Grooper.Core

Specifies the mode used to deduplicate overlapping Data Instance results.

Remarks

Deduplication ensures that only a single Data Instance is retained when multiple results overlap in the document content. This is especially useful when multiple extractors or extraction techniques may produce redundant or overlapping results.

Common Deduplication Scenarios

  • Redundant Extraction:
    Multiple extractors may target the same value for redundancy. Deduplication ensures only one result is included in the output.
  • Preference by Confidence:
    When some extractors are preferred, configure them to output higher confidence. Deduplication by confidence will favor these results.
  • Self-Containing Values:
    When one result is a substring of another (e.g., "OWNERSHIP REPORT" vs. "MINERAL OWNERSHIP REPORT"), deduplication by length or area will retain the more specific match.
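The scenarios above can be illustrated with a small sketch. This is not Grooper's implementation or API: the `Instance` class, the `overlaps` and `dedupe` functions, and the character offsets are all hypothetical, and the sketch assumes that "more specific" means the longer match. It only shows the general idea of keeping one result per overlapping group, ranked by a chosen criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical stand-in for a Data Instance (not a Grooper type)."""
    text: str
    start: int         # character offset where the match begins
    end: int           # character offset just past the match
    confidence: float


def overlaps(a, b):
    # Two half-open ranges overlap when each starts before the other ends.
    return a.start < b.end and b.start < a.end


def dedupe(instances, key):
    """Keep the best-ranked instance from each group of overlapping results."""
    kept = []
    # Visit the most preferred candidates first (highest key value).
    for candidate in sorted(instances, key=key, reverse=True):
        if not any(overlaps(candidate, k) for k in kept):
            kept.append(candidate)
    # Return survivors in document order.
    return sorted(kept, key=lambda i: i.start)


results = [
    Instance("OWNERSHIP REPORT", 8, 24, 0.90),
    Instance("MINERAL OWNERSHIP REPORT", 0, 24, 0.80),
    Instance("2024-01-15", 40, 50, 0.95),
]

# Prefer the longer match among overlapping results.
by_length = dedupe(results, key=lambda i: i.end - i.start)

# Prefer the higher-confidence match among overlapping results.
by_confidence = dedupe(results, key=lambda i: i.confidence)

print([i.text for i in by_length])
print([i.text for i in by_confidence])
```

In this sketch, ranking by length keeps "MINERAL OWNERSHIP REPORT" and drops the shorter value it contains, while ranking by confidence keeps "OWNERSHIP REPORT" because it was given the higher score. The non-overlapping date survives in both cases.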

Usage

Set the deduplication mode using the 'Deduplication Mode' property. When the mode is set to any value other than 'Disabled', the 'Compare By' property is also exposed, allowing further control over how duplicates are detected.

Can be one of the following values:

Name    Value    Description

Used By

Notification