Grooper Help - Version 25.0
25.0.0017 2,127

Visual

Classify Method Grooper.Core

Classifies documents or pages based on their visual appearance using image-based features, without requiring OCR.

Remarks

The Visual classification method assigns Document Types by analyzing the visual characteristics of document images, rather than relying on text extraction or content rules.

Overview

  • Visual classification uses an IP Profile to extract image-based features from each page or document.
  • Features may include layout, graphics, logos, line structures, or other visual patterns detected by the configured IP Profile.
  • This method does not require OCR, making it suitable for image-only documents, poor-quality scans, or real-time classification during scanning.

How It Works

  1. For each document or page, the configured IP Profile is applied to extract a set of visual features.
  2. These features are compared to trained models for each Document Type, measuring visual similarity.
  3. The Document Type with the highest similarity score is assigned to the document or page.
  4. Classification can be performed at the document or page level, supporting both single- and multi-page scenarios.
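
The steps above can be illustrated with a small conceptual sketch. It is not Grooper's implementation: the feature vectors, the cosine similarity measure, and the names `cosine_similarity`, `classify`, and `trained_models` are all assumptions chosen for illustration. It only shows the general idea of scoring a page's features against one model per Document Type and picking the best match.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def classify(features, trained_models):
    """Return (document_type, score) for the most similar trained model."""
    if not trained_models:
        raise ValueError("at least one trained model is required")
    scores = {
        name: cosine_similarity(features, model)
        for name, model in trained_models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical trained models: one made-up feature vector per Document Type.
trained_models = {
    "Invoice": [0.9, 0.1, 0.3],
    "Check": [0.1, 0.8, 0.2],
}

# Hypothetical features extracted from an incoming page.
page_features = [0.85, 0.15, 0.25]

doc_type, score = classify(page_features, trained_models)
print(doc_type, round(score, 3))
```

In this sketch the page is assigned to whichever Document Type yields the highest score, mirroring step 3. The actual features and similarity measure depend on the configured IP Profile and are not described here.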

Configuration Guidance

  • Assign an IP Profile that includes an 'Extract Features' command tailored to the visual elements most distinctive for your document types.
  • Train each Document Type with representative samples to build accurate visual models.
  • Use in environments where text extraction is unreliable, unavailable, or unnecessary.
  • Combine with other classification methods (such as Rules-Based or Lexical) for hybrid solutions, if needed.
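
The help text does not say how methods are combined in a hybrid solution. One common pattern, sketched below purely as an assumption, is to accept the visual result when its score is high enough and otherwise fall back to a text-based method. The function `hybrid_classify`, the `min_score` threshold, and the stub classifiers are hypothetical and are not part of any Grooper API.

```python
from typing import Callable, Optional, Tuple

# A classifier result: (document type or None, similarity score).
Result = Tuple[Optional[str], float]


def hybrid_classify(
    page: dict,
    visual: Callable[[dict], Result],
    fallback: Callable[[dict], Result],
    min_score: float = 0.8,  # assumed threshold, not a documented setting
) -> Tuple[Optional[str], float, str]:
    """Use the visual result if it is confident enough, else the fallback."""
    doc_type, score = visual(page)
    if doc_type is not None and score >= min_score:
        return doc_type, score, "visual"
    fb_type, fb_score = fallback(page)
    return fb_type, fb_score, "fallback"


def visual_stub(page: dict) -> Result:
    """Stand-in for a visual classifier; reads a precomputed result."""
    return page.get("visual", (None, 0.0))


def lexical_stub(page: dict) -> Result:
    """Stand-in for a text-based classifier; reads a precomputed result."""
    return page.get("lexical", (None, 0.0))


# Hypothetical pages with made-up scores from each method.
clear_form = {"visual": ("Invoice", 0.95), "lexical": ("Invoice", 0.70)}
faded_scan = {"visual": ("Invoice", 0.40), "lexical": ("Check", 0.90)}

print(hybrid_classify(clear_form, visual_stub, lexical_stub))
print(hybrid_classify(faded_scan, visual_stub, lexical_stub))
```

Here the clear form is resolved by the visual stub, while the faded scan falls through to the text-based stub because its visual score is below the assumed threshold.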

When to Use

  • Ideal for classifying forms, templates, or documents with consistent visual layouts but variable or missing text.
  • Useful for real-time document separation and classification during scanning, where speed and independence from OCR are required.
  • Effective for image-only documents, such as faxes, checks, or forms with graphical elements.

Practical Notes

  • The accuracy of visual classification depends on the quality and consistency of the images and the effectiveness of the IP Profile.
  • No OCR is performed, so text content is ignored; only visual features are considered.
  • Ensure the IP Profile contains an 'Extract Features' command, or validation will fail.
  • Visual models must be trained for each Document Type to enable accurate classification.
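
The last two notes can be read as a pre-flight checklist. The sketch below is a hypothetical illustration of such a check, not Grooper's validation logic: the IP Profile is modeled as a plain list of command names, and `check_visual_setup` and its parameters are invented for this example.

```python
def check_visual_setup(ip_profile_commands, document_types, trained_types):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []

    # The method requires an 'Extract Features' command in the IP Profile.
    if "Extract Features" not in ip_profile_commands:
        problems.append("IP Profile has no 'Extract Features' command")

    # Every Document Type should have a trained visual model.
    for doc_type in document_types:
        if doc_type not in trained_types:
            problems.append(
                f"Document Type '{doc_type}' has no trained visual model"
            )

    return problems


# Hypothetical configuration: the command names other than
# 'Extract Features' are placeholders.
commands = ["Deskew", "Extract Features"]
doc_types = ["Invoice", "Check"]
trained = {"Invoice"}

for problem in check_visual_setup(commands, doc_types, trained):
    print(problem)
```

With this made-up configuration the check reports only the untrained "Check" Document Type, since the profile already contains an 'Extract Features' command.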

For more information, see the documentation for IP Profile, Document Type, and training procedures for visual classification.

Properties

Name    Type    Description

See Also

Used By

Notification