Grooper Help - Version 25.0
25.0.0017

Value Lookup

Embedded Object Grooper.Extract

Performs value filtering, normalization, and replacement using vocabularies, exclusions, and fuzzy matching.

Remarks

Overview

The Value Lookup class is used to validate, correct, and normalize values after extraction in Data Fields and other extractors. It provides a flexible mechanism for enforcing allowed values, discarding unwanted values, correcting spelling or OCR errors, and mapping values to standardized forms or abbreviations.

How It Works

  • Vocabulary:
    The 'Vocabulary' property defines the set of allowed values. Any value not present in the vocabulary is discarded unless the 'Allow Miss' option is enabled.
  • Exclusions:
    The 'Exclusions' property defines a set of disallowed values. Any value in this list is always discarded, even if it appears in the vocabulary.
  • Fuzzy Lookup:
    When enabled, values not found in the vocabulary are checked for close matches using fuzzy matching. This can automatically correct minor spelling or OCR errors.
  • Lookup Options:
    Additional options control behavior such as allowing misses, cleaning punctuation, using list case, and translating values using key-value pairs in the vocabulary.
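
The fuzzy lookup idea can be illustrated with a minimal Python sketch. This is not Grooper's implementation: the similarity measure (difflib's `SequenceMatcher.ratio`), the function name `closest_match`, the `min_similarity` parameter, and the case-insensitive comparison are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def closest_match(value, vocabulary, min_similarity=0.8):
    """Return the vocabulary entry most similar to value, or None.

    Hypothetical helper: the similarity score here is difflib's ratio,
    which may differ from the measure Grooper actually uses.
    """
    best, best_score = None, 0.0
    for entry in vocabulary:
        # Compare case-insensitively (an assumption for this sketch).
        score = SequenceMatcher(None, value.lower(), entry.lower()).ratio()
        if score > best_score:
            best, best_score = entry, score
    # Accept the best candidate only if it clears the threshold.
    return best if best_score >= min_similarity else None


states = ["Oklahoma", "Ohio", "Oregon"]
print(closest_match("Oklahorna", states))  # OCR read "m" as "rn"
print(closest_match("Texas", states))      # nothing similar enough
```

In this sketch a lower threshold corrects more errors but also raises the risk of mapping an unrelated value onto a vocabulary entry.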

Workflow

  1. Each value is checked against the exclusions list. If found, it is discarded.
  2. If the value is not excluded, it is checked against the vocabulary.
  3. If not found and fuzzy lookup is enabled, the closest match above the minimum similarity threshold is used.
  4. If 'Translate' is enabled, values are mapped to their replacement using key-value pairs in the vocabulary.
  5. If 'Allow Miss' is enabled, values not found in the vocabulary are still included in the output.
  6. Additional options control case normalization and punctuation handling.
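
Steps 1 through 5 above can be sketched in Python as follows. This is an illustrative approximation, not Grooper's code: the function `value_lookup`, its parameter names, the dictionary representation of the vocabulary, the case-insensitive matching, and the use of difflib for similarity are all assumptions. Step 6 (case normalization and punctuation handling) is omitted.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    # Assumed similarity measure; Grooper's actual scoring may differ.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def value_lookup(values, vocabulary, exclusions=(), fuzzy=False,
                 min_similarity=0.8, translate=False, allow_miss=False):
    """Hypothetical sketch of the workflow.

    vocabulary maps each allowed value to its replacement (or None).
    """
    excluded = {e.lower() for e in exclusions}
    keys = {k.lower(): k for k in vocabulary}
    results = []
    for value in values:
        # Step 1: excluded values are always discarded.
        if value.lower() in excluded:
            continue
        # Step 2: exact (case-insensitive) vocabulary check.
        match = keys.get(value.lower())
        # Step 3: fall back to the closest fuzzy match above the threshold.
        if match is None and fuzzy and vocabulary:
            best = max(vocabulary, key=lambda k: similarity(value, k))
            if similarity(value, best) >= min_similarity:
                match = best
        if match is not None:
            # Step 4: optionally map the match to its replacement.
            replacement = vocabulary[match] if translate else None
            results.append(replacement if replacement is not None else match)
        elif allow_miss:
            # Step 5: keep values that missed the vocabulary.
            results.append(value)
    return results


vocab = {"Oklahoma": "OK", "Texas": "TX"}
found = ["Oklahorna", "texas", "N/A", "Kansas"]
print(value_lookup(found, vocab, exclusions=["N/A"], fuzzy=True, translate=True))
print(value_lookup(found, vocab, exclusions=["N/A"], fuzzy=True, allow_miss=True))
```

With these inputs, the first call returns the translated abbreviations and drops both the excluded "N/A" and the unmatched "Kansas". The second call keeps the vocabulary spellings and passes "Kansas" through because misses are allowed.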

Usage Scenarios

  • Strict Validation:
    Enforce that only values from a controlled list (such as codes, states, or product names) are accepted.
  • Spell Correction:
    Automatically correct common OCR or typographical errors using fuzzy matching.
  • Abbreviation Mapping:
    Map full names to abbreviations or vice versa using the 'Translate' option and key-value pairs in the vocabulary.
  • Exclusion Filtering:
    Discard unwanted values (such as "N/A", "Unknown", or blacklisted terms) using the exclusions list.

Properties

Name    Type    Description
Lookup Lexicons
Lookup Settings

Derived Types

There is 1 implementation of Value Lookup.

Group Options: Defines lookup and filtering options that apply to a named group of values, supporting group-specific vocabularies, exclusions, and confidence thresholds.

See Also

Used By

Notification