Grooper Help - Version 25.0
25.0.0017 2,127

OCR Engine

Embedded Object Grooper.OCR

OCR Engines extract text from images using optical character recognition.

Remarks

An OCR Engine in Grooper is a component that converts images containing text (such as scanned documents, photographs, or faxes) into machine-readable text.
OCR engines are essential for automating data extraction, document classification, and searchability in document-centric workflows.

How OCR Engines Are Used in Grooper

  • OCR Engines are configured as part of an OCR Profile, which defines how text extraction is performed on images.
  • They are used in activities such as document ingestion, classification, data extraction, and validation.
  • The engine processes an image and returns an OCR Results object containing recognized text, layout, and confidence information.
  • OCR Engines can be selected and configured per batch process, document type, or workflow step, allowing for flexible and optimized recognition strategies.
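
The flow described above (an engine takes an image and returns results carrying text, layout, and confidence) can be illustrated with a minimal Python sketch. Every name here (OcrWord, OcrResults, OcrEngine, FakeEngine, run_ocr) is hypothetical and chosen for illustration; none of it is Grooper's actual API, and the stand-in engine returns canned words rather than performing recognition.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass
class OcrWord:
    """One recognized word (hypothetical shape, not Grooper's actual type)."""
    text: str
    box: Tuple[int, int, int, int]  # left, top, width, height in pixels
    confidence: float               # 0.0 (no confidence) to 1.0 (certain)


@dataclass
class OcrResults:
    """Recognized text plus layout and confidence for a single image."""
    words: List[OcrWord]

    @property
    def text(self) -> str:
        # Join the words in the order the engine reported them.
        return " ".join(w.text for w in self.words)

    @property
    def mean_confidence(self) -> float:
        # An empty result has no evidence, so report zero confidence.
        if not self.words:
            return 0.0
        return sum(w.confidence for w in self.words) / len(self.words)


class OcrEngine(Protocol):
    """Anything with a recognize() method can act as an engine in this sketch."""
    def recognize(self, image: bytes) -> OcrResults: ...


class FakeEngine:
    """Stand-in engine returning canned words; it does no real recognition."""
    def recognize(self, image: bytes) -> OcrResults:
        return OcrResults([
            OcrWord("Invoice", (10, 10, 80, 20), 0.98),
            OcrWord("1042", (100, 10, 40, 20), 0.90),
        ])


def run_ocr(engine: OcrEngine, image: bytes) -> OcrResults:
    """Run whichever engine the caller selected for this step."""
    return engine.recognize(image)


results = run_ocr(FakeEngine(), b"")
print(results.text)
print(round(results.mean_confidence, 2))
```

Because run_ocr depends only on the recognize() method, a different engine could be swapped in per document type or workflow step without changing the calling code, which is the kind of flexibility the last bullet describes.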

Examples of OCR Engines

Grooper supports multiple OCR engines, each with unique capabilities and configuration options.
Common examples include:

  • Azure OCR:
    Integrates with Microsoft Azure Cognitive Services to provide cloud-based OCR.
    Useful for high-accuracy recognition, multi-language support, and scalable processing.

  • Transym OCR 4:
    A high-performance, on-premises OCR engine known for speed and accuracy on structured documents.
    Supports advanced features such as orientation detection and confidence scoring.

  • Tesseract OCR:
    An open-source OCR engine suitable for a wide range of document types and languages.
    Offers flexibility and is widely used for general-purpose OCR tasks.

Derived Types

There are 5 implementations of OCR Engine.

  • Azure OCR:
    Recognizes machine print and handprint using the Microsoft Azure Computer Vision API.

  • Layered OCR:
    Performs OCR using multiple OCR Layers, merging the results into a single output document.

  • Tesseract OCR:
    Provides an OCR engine for Grooper using the open-source Tesseract library.

  • Transym OCR 4:
    Provides an OCR engine for Grooper using the commercial Transym OCR 4 library.

  • Transym OCR 5:
    Transym OCR 5 is a commercial OCR engine that provides highly accurate multi-language OCR for machine-printed documents.
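
To make the idea of merging several OCR layers into one output more concrete, here is a small Python sketch. This page does not describe how Layered OCR actually merges results, so the rule used below (where words from different layers overlap, keep the one with the highest confidence) is purely an assumption for illustration, and the names Word, overlaps, and merge_layers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom in pixels


@dataclass
class Word:
    text: str
    box: Box
    confidence: float


def overlaps(a: Box, b: Box) -> bool:
    """True when two boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def merge_layers(layers: List[List[Word]]) -> List[Word]:
    """Combine layers; where boxes overlap, keep the most confident word.

    Assumed merge rule for illustration only, not Grooper's documented behavior.
    """
    merged: List[Word] = []
    for layer in layers:
        for word in layer:
            rivals = [m for m in merged if overlaps(m.box, word.box)]
            if not rivals:
                # Nothing recognized in this region yet, so accept the word.
                merged.append(word)
            elif all(word.confidence > r.confidence for r in rivals):
                # The new word beats everything it overlaps, so replace them.
                merged = [m for m in merged if m not in rivals]
                merged.append(word)
    # Return words in reading order: top to bottom, then left to right.
    return sorted(merged, key=lambda w: (w.box[1], w.box[0]))


layer_a = [Word("lnvoice", (0, 0, 50, 10), 0.60), Word("Total", (0, 20, 40, 30), 0.95)]
layer_b = [Word("Invoice", (1, 0, 51, 10), 0.92), Word("$12.00", (60, 20, 100, 30), 0.88)]

merged = merge_layers([layer_a, layer_b])
print([w.text for w in merged])  # ['Invoice', 'Total', '$12.00']
```

In this example the second layer corrects a low-confidence misread from the first layer and also contributes a word the first layer missed, which is the general benefit of combining engines.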

Used By

Notification