Grooper Help - Version 25.0
25.0.0017 2,127

Transym OCR 5

Transym OCR Engine Grooper.OCR

Transym OCR 5 is a commercial OCR engine that provides highly accurate multi-language OCR for machine-printed documents.

Remarks

Transym OCR 5 is a high-performance, on-premises OCR engine fully integrated with Grooper.
It is designed for full-text and zonal OCR of documents containing machine print, supporting a wide range of languages and advanced image cleanup options.

Key Features

  • High Accuracy:
    Delivers reliable recognition for machine-printed text across 28 supported languages.
  • Multi-Language Support:
    Allows selection of one or more languages for each OCR operation, with automatic language detection and localization.
  • Advanced Image Cleanup:
    Includes options for deskew, deshade, noise removal, line removal, and inversion (both zonal and whole-page).
  • Character Set Control:
    Supports configuration of base character sets, whitelists, and blacklists to optimize recognition for specific document types.
  • Document Structure Detection:
    Can perform sectioning to identify and process document regions more effectively.
  • Orientation Detection:
    Automatically detects and corrects page orientation when enabled.
  • Performance Tuning:
    Offers speed/accuracy trade-offs and lexicon modes to balance throughput and recognition quality.

Usage in Grooper

  • Transym OCR 5 is installed with Grooper and requires no additional licensing steps.
  • It is selected and configured within an OCR Profile or as part of a batch process.
  • Suitable for high-volume, on-premises processing where accuracy and control are critical.
  • Not intended for handwriting recognition; best used with machine-printed documents.

Configuration Guidance

  • Use the 'Allowed Languages' property to specify which languages to recognize.
  • Adjust image cleanup options (such as 'Deskew', 'Deshade', 'Noise Removal') to improve results on challenging images.
  • Fine-tune the character set using 'Base Character Set', 'White List', and 'Black List' for specialized documents.
  • Enable 'Perform Sectioning' for documents with complex layouts or multiple regions.
  • Use 'Accuracy Level' and 'Lexicon Mode' to balance speed and recognition quality as needed.
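
The interaction of 'Base Character Set', 'White List', and 'Black List' can be pictured with a small sketch. This is not Grooper's API or its actual implementation: the function names are hypothetical, and the rule used here (white-listed characters are added to the base set, then black-listed characters are removed) is an assumption made for illustration only.

```python
import string


def effective_charset(base, white_list="", black_list=""):
    """Combine a base character set with extra allowed and disallowed characters.

    Assumption: white-listed characters are added to the base set, and
    black-listed characters are then removed. The engine's real precedence
    rules may differ.
    """
    return (set(base) | set(white_list)) - set(black_list)


def filter_text(text, allowed):
    """Drop characters outside the allowed set, keeping whitespace."""
    return "".join(ch for ch in text if ch in allowed or ch.isspace())


# Hypothetical part-number field: digits and uppercase letters, plus a dash,
# minus the letters O and I, which are easily confused with 0 and 1.
allowed = effective_charset(
    string.digits + string.ascii_uppercase,
    white_list="-",
    black_list="OI",
)

cleaned = filter_text("AB-1O2", allowed)  # the letter O is dropped
```

Restricting the character set this way tends to help most on fields with a known format, such as part numbers or dates, where characters outside the expected alphabet are almost certainly misreads.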

Comparison to Other OCR Engines

  • Transym OCR 5 is an evolution of the original Transym OCR engine, offering improved language support, image cleanup, and configuration flexibility.
  • Compared to engines like Azure OCR or Tesseract OCR, Transym OCR 5 is optimized for on-premises, high-speed, and high-accuracy processing of machine print.
  • For handwriting or cloud-based recognition, consider other engines such as Azure OCR.

Properties

Name    Type    Description
Image Cleanup
Document Structure
Processing Options
Language
Character Set

See Also

Used By

Notification