Grooper Help - Version 25.0
25.0.0017 2,127

Labeled Value

Value Extractor Grooper.Extract

Extracts a field presented as a label-value pair within a document, associating labels and values based on their spatial relationship.

Remarks

The Labeled Value extractor is designed to capture data fields that appear as label-value pairs in documents, such as "Invoice Total: $1,234.56". It works by matching sets of labels and values, then determining which pairs are grouped together based on their spatial arrangement and proximity.

How It Works

  • The extractor identifies candidate labels and values using the configured extractors or Label Sets.
  • It evaluates geometric clustering: the label and value must be the only occupants of the rectangle that bounds them both, apart from any content allowed by the noise tolerance.
  • The spatial relationship (direction and distance) between label and value must meet specific requirements to be considered a valid pair.

For example, to extract an "Invoice Total" field, configure the label extractor to match variants such as "Invoice Amount", "Invoice Total", or "Amount Due", and the value extractor to match currency values. If multiple labels and values are found, Labeled Value uses their spatial arrangement to determine which value is associated with each label.
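The pairing logic described above can be sketched roughly as follows. This is an illustrative Python model, not Grooper's implementation: the Box type, the gap, noise_count, and pair_labels helpers, the "value sits to the right of or below the label" direction rule, and the max_distance and max_noise parameters (loosely modeled on the 'Maximum Distance' and 'Maximum Noise' properties) are all assumptions made for the example. Coordinates are in arbitrary page units.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical text hit with its bounding rectangle on the page."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def gap(label: Box, value: Box) -> Optional[float]:
    """Distance from label to value when the value sits to the right of or
    below the label; None when neither relationship holds (assumed rule)."""
    v_overlap = min(label.bottom, value.bottom) - max(label.top, value.top)
    if v_overlap > 0 and value.left >= label.right:
        return value.left - label.right      # value on the same line, to the right
    h_overlap = min(label.right, value.right) - max(label.left, value.left)
    if h_overlap > 0 and value.top >= label.bottom:
        return value.top - label.bottom      # value in the same column, below
    return None


def noise_count(label: Box, value: Box, page: List[Box]) -> int:
    """Count other boxes that intersect the rectangle bounding label and value."""
    left, top = min(label.left, value.left), min(label.top, value.top)
    right, bottom = max(label.right, value.right), max(label.bottom, value.bottom)
    return sum(
        1 for o in page
        if o is not label and o is not value
        and o.left < right and o.right > left
        and o.top < bottom and o.bottom > top
    )


def pair_labels(labels: List[Box], values: List[Box], page: List[Box],
                max_distance: float, max_noise: int) -> List[Tuple[str, str]]:
    """For each label, keep the nearest value that passes both checks."""
    pairs = []
    for label in labels:
        best = None
        for value in values:
            d = gap(label, value)
            if d is None or d > max_distance:
                continue                      # wrong direction or too far away
            if noise_count(label, value, page) > max_noise:
                continue                      # too much unrelated content between them
            if best is None or d < best[0]:
                best = (d, value)
        if best is not None:
            pairs.append((label.text, best[1].text))
    return pairs


labels = [Box("Invoice Total:", 100, 500, 200, 515),
          Box("Amount Due:", 100, 540, 180, 555)]
values = [Box("$1,234.56", 220, 500, 290, 515),
          Box("$99.00", 220, 540, 270, 555)]
page = labels + values

pairs = pair_labels(labels, values, page, max_distance=50, max_noise=0)
strict_pairs = pair_labels(labels, values, page, max_distance=10, max_noise=0)
```

With these sample boxes, each label pairs with the value on its own line. Tightening max_distance to 10 rejects both pairs, because the gaps are 20 and 40 units.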

Using Label Sets

Labeled Value is Label Set-aware. To use Label Sets instead of a label extractor, leave the 'Label Extractor' property blank. Ensure that a Labeling Behavior exists on the Content Model and that a Label Set is defined for each document type. The extractor must be defined on a Data Field or one of its descendants; it will not work from a local or global resources folder.

When configuring a Label Set, three field-level labels affect extraction:

  • Header: The text label for the field. If defined, this overrides the label extractor.
  • Footer: An optional label marking the end of the field content. If defined, this overrides the footer extractor and allows extraction of content between header and footer labels, with or without a value extractor.
  • Static: Specifies a static value to capture as the field value. If defined, this overrides the value extractor. Static labels require a match on the document and contribute to classification scoring when Labelset-Based classification is used.
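The override rules in the list above can be summarized as a small resolution function. This is a sketch only: FieldLabels, resolve_sources, and the string placeholders standing in for extractors are hypothetical names invented for illustration and do not correspond to Grooper types.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class FieldLabels:
    """Hypothetical holder for the three field-level labels of a Label Set."""
    header: Optional[str] = None
    footer: Optional[str] = None
    static: Optional[str] = None


def resolve_sources(labels: FieldLabels,
                    label_extractor: Optional[str] = None,
                    footer_extractor: Optional[str] = None,
                    value_extractor: Optional[str] = None
                    ) -> Dict[str, Tuple[str, Optional[str]]]:
    """Pick the source for each role: a defined label wins over the
    corresponding extractor, as described in the list above."""
    def pick(label_text, extractor, label_kind):
        if label_text:
            return (label_kind, label_text)   # Label Set entry overrides
        if extractor:
            return ("extractor", extractor)   # fall back to the configured extractor
        return ("none", None)                 # nothing configured for this role

    return {
        "label": pick(labels.header, label_extractor, "header label"),
        "footer": pick(labels.footer, footer_extractor, "footer label"),
        "value": pick(labels.static, value_extractor, "static label"),
    }


# A Header label is defined, so the label extractor is ignored;
# no Static label is defined, so the value extractor is still used.
sources = resolve_sources(
    FieldLabels(header="Invoice Total"),
    label_extractor="Invoice (Total|Amount)",
    value_extractor=r"\$[\d,]+\.\d{2}",
)
```

Here sources["label"] refers to the Header label, sources["footer"] is empty, and sources["value"] still points to the value extractor pattern.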

Configuration Guidance

  • Assign appropriate label and value extractors, or use Label Sets for more flexible, document-type-specific matching.
  • Adjust the 'Maximum Distance' and 'Maximum Noise' properties to control how far values may be from labels and how much extraneous content is tolerated.
  • Use the 'Match Limit' property to improve performance on large documents with many repeated values.
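To see why a limit on matches can help on large documents, consider the rough cost model below. It assumes, purely for illustration, that every label hit is compared against every value hit and that a match limit caps the number of value hits considered. The exact semantics of the 'Match Limit' property are not described here, so treat the comparison_count function and its match_limit parameter as hypothetical.

```python
from itertools import islice
from typing import Iterable, Optional, Sequence


def comparison_count(label_hits: Sequence[str],
                     value_hits: Iterable[str],
                     match_limit: Optional[int] = None) -> int:
    """Number of label/value comparisons when at most match_limit value
    hits are kept (None keeps all of them). Assumed cost model only."""
    kept = list(islice(value_hits, match_limit))
    return len(label_hits) * len(kept)


label_hits = ["Invoice Total", "Amount Due", "Balance"]
value_hits = [f"${n}.00" for n in range(5000)]   # many repeated currency values

unlimited = comparison_count(label_hits, value_hits)
limited = comparison_count(label_hits, value_hits, match_limit=200)
```

Under this model the unlimited case performs 15,000 comparisons and the limited case 600.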

Diagnostic Artifacts

When extraction is performed, Labeled Value logs diagnostic information. The following artifacts may be generated:

  • Number of header, footer, and value hits found.
  • Details about which labels and values were paired.
  • Information about noise levels and spatial relationships.
  • Notes on extraction logic used (e.g., whether Label Sets or extractors were used).

These diagnostics are useful for troubleshooting extraction results and optimizing configuration.

Properties

Name    Type    Description
General
Layout
Output

See Also

Used By

Notification