Grooper Help - Version 25.0
25.0.0023 2,165

StandardWeightings

Grooper.Core

Specifies a set of pre-configured fuzzy match weightings for common data extraction scenarios.

Remarks

The StandardWeightings enumeration provides built-in weighting presets for fuzzy matching operations. Each preset tunes character-level edit costs for a specific type of data, such as numeric fields, currency values, or label text. Selecting a preset applies a ready-made set of weightings that improves fuzzy match accuracy in typical OCR, data entry, and document processing scenarios, without requiring you to define every weighting rule manually.
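To make "character-level edit costs" concrete, here is a minimal Python sketch of a weighted edit distance. It is not Grooper's implementation: the function names, the pairs in SWAP_COSTS, and every cost value are assumptions chosen for illustration. It only shows the general idea that a likely OCR confusion (such as 'O' read as '0') can be charged less than an arbitrary substitution.

```python
from typing import Dict, Tuple

# Hypothetical weightings for illustration only; not Grooper's actual preset values.
SWAP_COSTS: Dict[Tuple[str, str], float] = {
    ("O", "0"): 0.1,
    ("l", "1"): 0.1,
    ("S", "5"): 0.2,
}


def swap_cost(a: str, b: str) -> float:
    """Return 0 for identical characters, a reduced cost for listed pairs, else 1."""
    if a == b:
        return 0.0
    return SWAP_COSTS.get((a, b), SWAP_COSTS.get((b, a), 1.0))


def weighted_distance(source: str, target: str) -> float:
    """Levenshtein distance with per-pair swap costs; inserts and deletes cost 1."""
    prev = [float(j) for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        curr = [float(i)]
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + 1.0,                  # delete s from the source
                curr[j - 1] + 1.0,              # insert t into the source
                prev[j - 1] + swap_cost(s, t),  # swap s for t (free if equal)
            ))
        prev = curr
    return prev[-1]


def similarity(source: str, target: str) -> float:
    """Scale the distance into a 0..1 score, where 1 means an exact match."""
    longest = max(len(source), len(target)) or 1
    return 1.0 - weighted_distance(source, target) / longest
```

With these assumed costs, similarity("INV0ICE", "INVOICE") scores higher than similarity("INVXICE", "INVOICE"), because the '0' for 'O' swap is treated as a cheap, expected error while 'X' for 'O' is charged the full cost.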

How Presets Work

When a preset is selected, the Fuzzy Match Weightings engine loads a predefined set of swap, insert, delete, and immutable character rules tailored for the chosen data type. These rules are merged with any custom entries you define in the local lexicon, allowing for further customization as needed.

  • Presets are especially useful for handling common OCR errors, such as confusing 'O' with '0' or misreading punctuation in numbers.
  • You can use a preset as a starting point and then add or override specific rules for your unique data or document type.
  • If None is selected, only custom weightings defined in the local lexicon (or included lexicons) are used.
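The merge behavior described above can be sketched in Python as follows. This is an illustrative model under stated assumptions, not Grooper's code: the Weightings class, the PRESETS table and its contents, and the effective_weightings function are all hypothetical. Following the text, custom entries are assumed to win when they conflict with a preset entry, and a preset of None leaves only the custom rules.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class Weightings:
    """Hypothetical container for the four rule types named in the text."""
    swaps: Dict[Tuple[str, str], float] = field(default_factory=dict)
    inserts: Dict[str, float] = field(default_factory=dict)
    deletes: Dict[str, float] = field(default_factory=dict)
    immutable: Set[str] = field(default_factory=set)  # assumed: must match exactly


# Assumed preset contents for illustration; the real presets are not listed here.
PRESETS: Dict[str, Weightings] = {
    "Numeric": Weightings(
        swaps={("O", "0"): 0.1, ("l", "1"): 0.1},
        deletes={",": 0.2},
    ),
    "Currency": Weightings(
        swaps={("O", "0"): 0.1, ("S", "$"): 0.1},
        inserts={",": 0.2, ".": 0.2},
        deletes={",": 0.2, ".": 0.2},
    ),
}


def effective_weightings(preset: Optional[str], custom: Weightings) -> Weightings:
    """Merge a preset with custom rules; custom entries override on conflict."""
    base = PRESETS[preset] if preset is not None else Weightings()
    return Weightings(
        swaps={**base.swaps, **custom.swaps},
        inserts={**base.inserts, **custom.inserts},
        deletes={**base.deletes, **custom.deletes},
        immutable=base.immutable | custom.immutable,
    )
```

For example, calling effective_weightings("Numeric", custom) with a custom rule for ("O", "0") keeps the preset's other swaps but uses the custom cost for that pair, while effective_weightings(None, custom) returns the custom rules alone.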

Choosing a Preset

  • Use Full for general-purpose data extraction where both numeric and alphabetic errors are common.
  • Use Numeric for fields that contain only numbers, such as IDs or codes.
  • Use Currency for monetary values, where punctuation and currency symbols are frequently misrecognized.
  • Use Labels for label or field name extraction, where both numeric and alphabetic confusion is possible.
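The guidance above can be captured as a simple lookup. The Python sketch below mirrors the member names mentioned in this topic; the string values, the field-kind keys, and the suggest_preset helper are hypothetical and are not part of Grooper. Falling back to None for an unrecognized field kind is also an assumption, made so that only custom weightings would apply.

```python
from enum import Enum


class StandardWeightings(Enum):
    # Member names follow the text; the values are placeholders, not Grooper's.
    NONE = "None"
    FULL = "Full"
    NUMERIC = "Numeric"
    CURRENCY = "Currency"
    LABELS = "Labels"


# Hypothetical field kinds mapped to the preset the guidance recommends.
GUIDANCE = {
    "general": StandardWeightings.FULL,       # mixed numeric and alphabetic errors
    "numeric": StandardWeightings.NUMERIC,    # digits only, such as IDs or codes
    "currency": StandardWeightings.CURRENCY,  # monetary values with symbols
    "label": StandardWeightings.LABELS,       # label or field name text
}


def suggest_preset(field_kind: str) -> StandardWeightings:
    """Return the suggested preset, or NONE when the field kind is not recognized."""
    return GUIDANCE.get(field_kind.strip().lower(), StandardWeightings.NONE)
```

A call such as suggest_preset("currency") returns StandardWeightings.CURRENCY, and an unlisted kind such as "date" returns StandardWeightings.NONE.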

Example

To optimize fuzzy matching for a field that contains currency values, select the Currency preset. This applies rules for punctuation and currency symbols in addition to the numeric confusion rules.

For more information on customizing fuzzy match weightings, see the documentation for Fuzzy Match Weightings.

Can be one of the following values:

Name    Value    Description

Used By
