Grooper Help - Version 25.0
25.0.0017 2,127

Image Segmentation Options

Embedded Object Grooper.OCR

Configures how images are segmented into distinct regions for independent OCR processing.

Remarks

The Image Segmentation Options class provides advanced controls for dividing an image into separate regions—such as boxes, fields, or columns—so that each can be processed independently by OCR. This is especially useful for structured documents like forms, tables, or documents with boxed layouts, where text is often enclosed or separated by lines.

How Image Segmentation Works

When enabled, segmentation analyzes the image to detect regions that are visually bounded (e.g., by lines or whitespace). Each detected region is then processed as a separate OCR operation, which can improve recognition accuracy and prevent text from different regions from being merged incorrectly.

  • Regions are identified based on boundaries, size, and area, using the configured options.
  • Segmentation is typically used with an OCR Profile and can be enabled via the 'Image Segmentation' property.
  • Detected regions are processed independently, and their results are merged into the final OCR output.
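The flow described above (detect regions, recognize each one separately, merge the results) can be illustrated with a small, self-contained sketch. This is not Grooper's implementation or API: the Region type, the ocr_by_region function, and the toy recognizer are hypothetical names chosen for illustration, and the "image" is just a list of text rows standing in for pixel data.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Region:
    """Hypothetical bounding box, measured in character cells of the toy page."""
    left: int
    top: int
    width: int
    height: int


def ocr_by_region(
    image: Sequence[str],
    regions: Sequence[Region],
    recognize: Callable[[List[str]], str],
) -> List[str]:
    """Run the recognizer once per region and collect results in reading order."""
    results = []
    # Assumed reading order: top to bottom, then left to right.
    for r in sorted(regions, key=lambda r: (r.top, r.left)):
        rows = image[r.top:r.top + r.height]
        crop = [row[r.left:r.left + r.width] for row in rows]
        text = recognize(crop)
        if text:
            results.append(text)
    return results


def fake_recognize(crop: List[str]) -> str:
    """Stand-in for an OCR engine: joins the non-empty rows of the crop."""
    return " ".join(row.strip() for row in crop if row.strip())


# A toy two-column "page": without segmentation, a line-by-line read
# would interleave the left and right columns.
PAGE = [
    "Name: Ann   Date: 1 May",
    "City: Rome  Zip: 00100 ",
]
REGIONS = [
    Region(left=12, top=0, width=11, height=2),  # right column
    Region(left=0, top=0, width=12, height=2),   # left column
]

SEGMENTED_TEXT = ocr_by_region(PAGE, REGIONS, fake_recognize)
```

In this sketch each column is recognized on its own, so the text of the left column stays together instead of being mixed line by line with the right column.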

Configuration Guidance

  • Minimum Size / Area: Use 'Minimum Size' and 'Minimum Area' to filter out small or irrelevant regions.
  • Maximum Height / Width Ratio: Limit the size of detected regions to avoid merging unrelated content.
  • Maximum Merge Height: Controls when adjacent regions are merged horizontally, which is useful for multi-line fields or table rows.
  • Adjust these options based on the typical structure of your documents for optimal results.
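To make the effect of these options more concrete, the following sketch filters and merges candidate regions. It is only an illustration under stated assumptions, not Grooper's actual logic: the Box and Settings types and all function names are hypothetical, the maximum height and width ratios are assumed to be fractions of the page dimensions, and Maximum Merge Height is assumed to mean that horizontally touching regions on the same row are merged when their height does not exceed the limit. All measurements are in pixels.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Box:
    """Hypothetical candidate region, in pixels."""
    left: int
    top: int
    width: int
    height: int


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-ins for the options described above."""
    min_size: int = 20              # smallest allowed width and height
    min_area: int = 1000            # smallest allowed width * height
    max_height_ratio: float = 0.5   # assumed: fraction of page height
    max_width_ratio: float = 1.0    # assumed: fraction of page width
    max_merge_height: int = 60      # assumed: tallest region eligible for merging


def keep_region(box: Box, page_w: int, page_h: int, s: Settings) -> bool:
    """Return True if the box passes the size, area, and ratio filters."""
    return (
        box.width >= s.min_size
        and box.height >= s.min_size
        and box.width * box.height >= s.min_area
        and box.height <= s.max_height_ratio * page_h
        and box.width <= s.max_width_ratio * page_w
    )


def merge_rows(boxes: Sequence[Box], s: Settings) -> List[Box]:
    """Merge horizontally touching boxes that share a row and are short enough."""
    merged: List[Box] = []
    for box in sorted(boxes, key=lambda b: (b.top, b.left)):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.top == box.top
            and prev.height == box.height
            and box.height <= s.max_merge_height
            and box.left <= prev.left + prev.width  # touching or overlapping
        ):
            right = max(prev.left + prev.width, box.left + box.width)
            merged[-1] = Box(prev.left, prev.top, right - prev.left, prev.height)
        else:
            merged.append(box)
    return merged


def segment(boxes: Sequence[Box], page_w: int, page_h: int, s: Settings) -> List[Box]:
    """Filter candidate regions, then merge what remains."""
    kept = [b for b in boxes if keep_region(b, page_w, page_h, s)]
    return merge_rows(kept, s)


CANDIDATES = [
    Box(0, 0, 200, 50),      # table cell
    Box(200, 0, 300, 50),    # adjacent cell on the same row
    Box(0, 100, 10, 10),     # speck: below the minimum size
    Box(0, 200, 900, 700),   # too tall for a 1000 px page at ratio 0.5
    Box(0, 950, 30, 25),     # large enough in size, but below the minimum area
]
RESULT = segment(CANDIDATES, page_w=1000, page_h=1000, s=Settings())
```

With these example values, only the two adjacent cells survive filtering, and they are merged into a single row-wide region. Raising or lowering the thresholds changes which candidates are kept, which is why testing against representative documents matters.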

Usage Scenarios

  • Forms and Tables: Segmenting boxed fields or table cells to ensure each is recognized separately.
  • Documents with Columns: Preventing text from different columns or regions from being combined.
  • Complex Layouts: Handling documents with a mix of text, graphics, and lines.

Best Practices

  • Enable segmentation for documents where text is frequently enclosed or separated by lines.
  • Use diagnostic tools to visualize detected regions and fine-tune segmentation settings.
  • Combine segmentation with synthesis and filtering options in your OCR Profile for best results on structured documents.
  • Test on representative samples to ensure regions are detected and processed as intended.

Properties

Name    Type    Description

Used By
