Grooper Help - Version 25.0
25.0.0017 2,127

Box Removal

Box Detection Grooper.IP

Removes checkboxes from an image and generates Layout Data for OMR extraction.

Remarks

The Box Removal command is designed to detect and remove checkboxes (OMR marks) from document images, typically as part of pre-OCR image cleanup. In addition to removing boxes that may interfere with OCR, this command generates the Layout Data required for optical mark recognition (OMR) workflows using Value Extractors such as Labeled OMR.

How Box Removal Works

  1. The input image is binarized and analyzed to detect rectangular regions matching the expected size and aspect ratio for checkboxes.
  2. Detected boxes are classified as free-standing or welded (attached to other objects, such as lines).
  3. Depending on configuration, either all boxes or only free-standing boxes are selected for removal.
  4. A mask is generated to cover the selected boxes, and the mask is applied to the original image using the chosen dropout method.
  5. Layout Data for all detected boxes is stored for downstream OMR extraction.
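The five steps above can be illustrated with a small, pure-Python sketch. This is not Grooper's implementation: every function name, parameter, and default here (`binarize`, `ring`, `detect_boxes`, `remove_boxes`, `min_size`, `max_aspect`, and so on) is hypothetical. It assumes an 8-bit grayscale image stored as a list of rows, boxes drawn with one-pixel borders, a brute-force search, and white fill as the dropout.

```python
def binarize(gray, threshold=128):
    """Step 1a: mark dark pixels as ink (1) and light pixels as paper (0)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def ring(top, left, height, width):
    """Coordinates (y, x) on the one-pixel border of a rectangle."""
    cells = set()
    for x in range(left, left + width):
        cells.update({(top, x), (top + height - 1, x)})
    for y in range(top, top + height):
        cells.update({(y, left), (y, left + width - 1)})
    return cells


def detect_boxes(ink, min_size=4, max_size=8, max_aspect=1.5):
    """Steps 1b-2: find rectangles of plausible size and classify them.

    Simplification: only the border is checked, so a solid filled block
    would also match. A real detector needs stricter tests.
    """
    rows, cols = len(ink), len(ink[0])
    boxes = []
    for top in range(rows):
        for left in range(cols):
            for height in range(min_size, max_size + 1):
                for width in range(min_size, max_size + 1):
                    if top + height > rows or left + width > cols:
                        continue
                    if max(height, width) / min(height, width) > max_aspect:
                        continue
                    if not all(ink[y][x] for y, x in ring(top, left, height, width)):
                        continue
                    # Welded: ink touches the box from the outside.
                    outside = ring(top - 1, left - 1, height + 2, width + 2)
                    welded = any(
                        ink[y][x]
                        for y, x in outside
                        if 0 <= y < rows and 0 <= x < cols
                    )
                    boxes.append({"top": top, "left": left, "height": height,
                                  "width": width, "welded": welded})
    return boxes


def remove_boxes(gray, remove_welded=False, **detect_args):
    """Steps 3-5: select boxes, build the mask, drop out, return layout."""
    layout = detect_boxes(binarize(gray), **detect_args)
    selected = [b for b in layout if remove_welded or not b["welded"]]
    mask = set()
    for b in selected:
        mask |= ring(b["top"], b["left"], b["height"], b["width"])
    # Dropout by white fill, applied to the original (non-binarized) image.
    cleaned = [[255 if (y, x) in mask else px for x, px in enumerate(row)]
               for y, row in enumerate(gray)]
    return cleaned, layout  # layout covers ALL detected boxes


# Demo page: one free-standing box and one box welded to a horizontal line.
page = [[255] * 20 for _ in range(12)]
for y, x in ring(2, 2, 5, 5) | ring(2, 10, 5, 5):
    page[y][x] = 0
for x in range(15, 20):
    page[4][x] = 0

cleaned, layout = remove_boxes(page)
```

In the demo, both boxes are recorded in `layout`, but with `remove_welded=False` only the free-standing box is erased from `cleaned`. This mirrors the distinction between what is detected (step 5) and what is removed (step 3).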

Configuration and Usage

  • Add Box Removal to the IP Profile used for pre-OCR image cleanup in an OCR Profile.
  • Use the 'Dropout Method' property to control how masked regions are removed (e.g., fill with white or background color).
  • Use the 'Remove Welded Boxes' property to control whether boxes attached to lines or other objects are removed. For best results, remove only free-standing boxes here and use Line Removal for boxes welded to line structures.
  • Configure the Recognize activity to use the OCR Profile containing this command.
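To make the idea of a dropout method concrete, here is a minimal Python sketch of two fill strategies. It is an illustration only: the function `apply_dropout`, the method names `"white"` and `"background"`, and the median-based background estimate are all assumptions, not Grooper property values or behavior.

```python
from statistics import median


def apply_dropout(gray, mask, method="white"):
    """Return a copy of `gray` with every masked pixel replaced.

    gray   : 8-bit grayscale image as a list of rows
    mask   : set of (y, x) coordinates to drop out
    method : "white" fills with 255; "background" fills with the
             median of the unmasked pixels (hypothetical strategies)
    """
    if method == "white":
        fill = 255
    elif method == "background":
        unmasked = [px
                    for y, row in enumerate(gray)
                    for x, px in enumerate(row)
                    if (y, x) not in mask]
        fill = int(median(unmasked)) if unmasked else 255
    else:
        raise ValueError(f"unknown dropout method: {method}")
    return [[fill if (y, x) in mask else px for x, px in enumerate(row)]
            for y, row in enumerate(gray)]
```

On a clean white scan the two strategies give the same result. On a tinted or gray background, a background-colored fill avoids leaving bright patches where boxes used to be.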

Supported Pixel Formats

All common pixel formats are supported, including Pixel8bppGrayscale, Pixel24bppBgr, and Pixel1bppIndexed. Images are automatically converted as needed for box detection and removal.
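As a rough illustration of what converting as needed can involve, the Python sketch below normalizes the three formats named above to 8-bit grayscale. Only the format names come from this page. The function `to_grayscale`, the in-memory layouts, the BT.601 luma weights, and the default two-entry palette are assumptions, and Grooper's actual conversion may differ.

```python
def to_grayscale(pixels, pixel_format, palette=(0, 255)):
    """Convert an image (list of rows) to 8-bit grayscale values.

    Assumed layouts:
      Pixel8bppGrayscale : each pixel is an int 0-255
      Pixel24bppBgr      : each pixel is a (blue, green, red) tuple
      Pixel1bppIndexed   : each pixel is a palette index, 0 or 1
    """
    if pixel_format == "Pixel8bppGrayscale":
        return [list(row) for row in pixels]
    if pixel_format == "Pixel24bppBgr":
        # Integer form of 0.114*B + 0.587*G + 0.299*R, rounded to nearest.
        return [[(114 * b + 587 * g + 299 * r + 500) // 1000
                 for b, g, r in row]
                for row in pixels]
    if pixel_format == "Pixel1bppIndexed":
        return [[palette[index] for index in row] for row in pixels]
    raise ValueError(f"unsupported pixel format: {pixel_format}")
```

Normalizing to a single working format first means the detection and masking logic only has to be written once.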

Diagnostics

When run in diagnostic mode, Box Removal generates diagnostic images such as Binarized, Dropout Mask, Before Line Repair, and After Line Repair. These help you verify which boxes are being detected, which are being removed, and the effect of line repair when welded boxes are dropped out.

Notes

  • Removing welded boxes before line removal can leave lines broken and interfere with the detection of lines and bound regions. Consider your workflow and document structure when configuring the 'Remove Welded Boxes' property.
  • The Layout Data generated by Box Removal is required for downstream OMR extraction and can be inspected using Grooper's layout tools.

Properties

Name    Type    Description
General
Dropout Options
Image Preprocessing
Command Info
