Grooper Help - Version 25.0
25.0.0024 2,166

Correct

Code Activity Grooper.Activities

Performs spell correction and word splitting on OCR text or extracted data fields.

Remarks

The Correct activity enables automated correction of OCR errors and normalization of text content in Grooper. It can operate on the full text of a document or on the values of specific Data Fields, depending on the selected 'Scope'. This activity is typically used after running the Recognize or Extract activities to improve data quality and consistency.

How It Works

  • Spell correction is performed by running a configured extractor over the text and replacing or removing matched segments.
  • Word splitting analyzes long, unbroken text segments and splits them into individual words using a vocabulary and gap analysis.
  • Both correction and splitting can be enabled independently or together.
  • The activity supports advanced scenarios such as search-and-replace, abbreviation, and normalization of field values.
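
The two mechanisms above can be illustrated with a minimal Python sketch. This is not Grooper's implementation: the rule list, the `correct` and `split_words` functions, and the sample vocabulary are all hypothetical names chosen for illustration. Regular expressions stand in for a configured extractor, and the splitter uses only a vocabulary lookup (it does not model the gap analysis that the activity also performs).

```python
import re

# Hypothetical correction rules: (pattern, replacement).
# An empty replacement removes the matched segment.
RULES = [
    (re.compile(r"\bINV0ICE\b"), "INVOICE"),  # zero misread in place of letter O
    (re.compile(r"\bT0TAL\b"), "TOTAL"),
    (re.compile(r"[|]{2,}"), ""),             # stray ruling-line debris
]


def correct(text):
    """Apply each rule in order, replacing or removing matched segments."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


def split_words(segment, vocabulary):
    """Split an unbroken segment into vocabulary words.

    Returns the split with the fewest words, or None when the segment
    cannot be covered entirely by vocabulary entries.
    """
    n = len(segment)
    # best[i] holds a list of words covering segment[:i], or None.
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            word = segment[start:end]
            if best[start] is not None and word.lower() in vocabulary:
                candidate = best[start] + [word]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return best[n]


vocabulary = {"invoice", "total", "due", "in", "voice", "to"}
print(correct("INV0ICE T0TAL ||| 42"))
print(split_words("invoicetotaldue", vocabulary))
```

Preferring the split with the fewest words is one simple way to avoid over-splitting (for example, "in" + "voice" instead of "invoice"); a production splitter would likely weigh additional evidence.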

Configuration Guidance

  • Set 'Scope' to control whether the activity operates on the document's full text or on specific fields.
  • Enable 'Spell Correction' and configure the appropriate extractors and options for your use case.
  • Enable 'Word Splitting' to correct word spacing issues, and provide a vocabulary for accurate splitting.
  • Adjust thresholds and vocabulary settings to balance accuracy and performance.
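
As a rough illustration of how a scope setting changes what gets corrected, the following Python sketch applies a correction function either to a document's full text or to selected field values. The `Document` class, the `apply_correction` function, and the scope values `"full_text"` and `"fields"` are assumptions made for this example; they are not Grooper's actual object model or option names.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a processed document."""
    full_text: str
    fields: dict = field(default_factory=dict)


def apply_correction(doc, scope, corrector, field_names=()):
    """Run `corrector` on the full text or on the named fields only."""
    if scope == "full_text":
        doc.full_text = corrector(doc.full_text)
    elif scope == "fields":
        for name in field_names:
            if name in doc.fields:
                doc.fields[name] = corrector(doc.fields[name])
    else:
        raise ValueError(f"unknown scope: {scope!r}")
    return doc


def fix_zero(text):
    """Toy corrector: repair one specific OCR misread."""
    return text.replace("T0TAL", "TOTAL")


doc = Document(full_text="T0TAL 42", fields={"Label": "T0TAL", "Note": "T0TAL"})
apply_correction(doc, "fields", fix_zero, field_names=["Label"])
print(doc)
```

With the field scope, only the named field changes; the full text and the other field values are left untouched.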

This activity is essential for improving the quality of OCR data and ensuring that extracted values are accurate and standardized before downstream processing or export.

Properties

Name    Type    Description
General
Spell Correction
Word Splitting
Processing Options

See Also

Used By

Notification