Grooper Help - Version 25.0
25.0.0017 2,127

Extract Page

IP Command Grooper.IP

Extracts a page from a carrier image by detecting its edges and de-warping the result.

Remarks

The Extract Page command locates and extracts a document page from a larger carrier image, such as a flatbed scan of a page or a camera-captured image. It detects the four edges of the page, even if the page is skewed, sheared, or subject to perspective distortion, and then applies a Warp operation to produce a new, de-warped image of the page.

This command is especially useful in scenarios where pages are scanned or photographed on a contrasting background, or where the page outline is visible but not perfectly aligned with the image axes. Each edge is detected independently, allowing for robust extraction even when the page is not perfectly rectangular in the image.
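
Because each edge is located independently, one way to picture the result is as four separate lines whose pairwise intersections give the page corners. The following is a minimal Python sketch of that geometric idea only. The `Line` type and the `line_through` and `intersect` helpers are hypothetical names chosen for illustration; they are not part of Grooper, and Grooper's internal representation may differ.

```python
from typing import NamedTuple, Tuple

Point = Tuple[float, float]


class Line(NamedTuple):
    """A line written as a*x + b*y = c (hypothetical representation)."""
    a: float
    b: float
    c: float


def line_through(p: Point, q: Point) -> Line:
    """Build the line that passes through two points on a detected edge."""
    (x1, y1), (x2, y2) = p, q
    a = y2 - y1
    b = x1 - x2
    return Line(a, b, a * x1 + b * y1)


def intersect(l1: Line, l2: Line) -> Point:
    """Intersect two lines with Cramer's rule; raises if they are parallel."""
    det = l1.a * l2.b - l2.a * l1.b
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel and have no single intersection")
    x = (l1.c * l2.b - l2.c * l1.b) / det
    y = (l1.a * l2.c - l2.a * l1.c) / det
    return x, y


# Two sample points on a slightly rotated top edge and on a slanted left edge.
top = line_through((30, 22), (90, 28))
left = line_through((9, 40), (6, 100))

# The top-left corner is where the two independently found edges meet.
corner = intersect(top, left)
print(corner)  # (10.0, 20.0)
```

Repeating the intersection for the top/right, bottom/right, and bottom/left pairs would give the four corners of a quadrilateral that need not be a rectangle.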

Supported Scenarios and Examples

Light Page on Dark Background

In this scenario, the page is placed on a black or significantly darker background. The algorithm detects the strong contrast at the page edges to accurately locate the quadrilateral region of the page.
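
As a rough illustration of why strong contrast makes this case easy, the sketch below binarizes a tiny grayscale image with a fixed threshold and reports where the white region lies. The fixed threshold of 128, the `binarize` and `page_bounds` names, and the axis-aligned bounding box are all simplifying assumptions for this example; Grooper's 'Binarization' settings and its quadrilateral detection are not reproduced here.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of 8-bit grayscale values (0 = black, 255 = white)


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Map each pixel to 1 (page) or 0 (background) using a fixed threshold."""
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def page_bounds(binary: Image) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) of the white region, inclusive."""
    rows = [y for y, row in enumerate(binary) if any(row)]
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    if not rows:
        raise ValueError("no page pixels found")
    return rows[0], cols[0], rows[-1], cols[-1]


# A 6x8 carrier image: dark background (20) with a light page (230) inside.
carrier = [[20] * 8 for _ in range(6)]
for y in range(1, 5):
    for x in range(2, 7):
        carrier[y][x] = 230

binary = binarize(carrier)
print(page_bounds(binary))  # (1, 2, 4, 6)
```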

Light Page on Light Background

In this case, the page is on a background similar in color to the page itself, but the outline is still visible. The algorithm detects subtle edge features to extract the page region.
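
To show why this case needs a different approach, the sketch below takes one row of pixels where the page and background are both light and only a thin, slightly darker outline separates them. A fixed mid-gray threshold sees nothing, while looking at differences between neighbouring pixels reveals the outline. The `edge_positions` helper and the `min_step` value are hypothetical and only illustrate the general idea of gradient-based edge detection, not Grooper's algorithm.

```python
from typing import List


def edge_positions(row: List[int], min_step: int = 10) -> List[int]:
    """Return indices i where the jump between row[i] and row[i + 1] is at least min_step."""
    return [i for i in range(len(row) - 1) if abs(row[i + 1] - row[i]) >= min_step]


# Light background (240), a faint outline (215), then the light page (245).
row = [240, 240, 240, 215, 245, 245, 245, 245, 215, 240, 240]

# A fixed threshold of 128 treats every pixel as white, so the page is invisible.
print(all(px >= 128 for px in row))  # True

# Neighbour differences expose both sides of each outline pixel.
print(edge_positions(row))  # [2, 3, 7, 8]
```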

How Extract Page Works

  1. The input image is binarized using the configured 'Binarization' settings to enhance edge contrast.
  2. Edge detection is performed in the border regions of the image to locate the top, right, bottom, and left page edges.
  3. Each edge is detected independently, allowing for accurate extraction even if the page is skewed, rotated, or subject to perspective distortion.
  4. The four detected edges define a quadrilateral, which is then de-warped using a Warp operation to produce a new, rectangular image of the page.
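
The final step can be pictured as resampling the quadrilateral onto a rectangular grid. The sketch below is a simplified stand-in, not Grooper's Warp operation: it uses bilinear interpolation of the four corner points with nearest-neighbour sampling, which removes skew and shear but does not model true perspective exactly (a projective transform would be needed for that). The `dewarp` function and its parameters are hypothetical names for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Image = List[List[int]]


def dewarp(src: Image, quad: Sequence[Point], out_w: int, out_h: int) -> Image:
    """Resample the quadrilateral `quad` from `src` into an out_w x out_h image.

    `quad` lists corners as (x, y) in the order
    top-left, top-right, bottom-right, bottom-left.
    """
    tl, tr, br, bl = quad
    src_h, src_w = len(src), len(src[0])
    out: Image = []
    for j in range(out_h):
        v = j / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for i in range(out_w):
            u = i / (out_w - 1) if out_w > 1 else 0.0
            # Bilinear weights of the four corners for this output pixel.
            w_tl, w_tr, w_br, w_bl = (1 - u) * (1 - v), u * (1 - v), u * v, (1 - u) * v
            x = w_tl * tl[0] + w_tr * tr[0] + w_br * br[0] + w_bl * bl[0]
            y = w_tl * tl[1] + w_tr * tr[1] + w_br * br[1] + w_bl * bl[1]
            # Nearest-neighbour sampling, clamped to the source image.
            xi = min(max(int(round(x)), 0), src_w - 1)
            yi = min(max(int(round(y)), 0), src_h - 1)
            row.append(src[yi][xi])
        out.append(row)
    return out


# A 5x5 test image whose pixel value encodes its position as 10*y + x.
src = [[10 * y + x for x in range(5)] for y in range(5)]

# A sheared "page": each row sits one pixel further left than the row above.
sheared_quad = [(2, 0), (4, 0), (2, 2), (0, 2)]
page = dewarp(src, sheared_quad, 3, 3)
print(page)  # [[2, 3, 4], [11, 12, 13], [20, 21, 22]]
```

Each output row starts one source column further left than the previous one, so the sheared region comes out as an upright 3x3 block.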

Configuration and Usage

  • Use the 'Binarization' property to tune how the image is converted to black and white. For dark backgrounds, aim for a solid black border and a mostly white page. For light backgrounds, ensure page edges are visible as lines.
  • Adjust the 'Border Size' to control the region where edge detection is performed. Larger values may help with pages that are not centered or have wide margins.
  • Use the 'Angle Precision' and 'Threshold' properties to fine-tune edge detection sensitivity and accuracy.
  • Review diagnostic images such as 'Binarized', 'Edges', and 'Zoning' to verify and optimize detection results.
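
The effect of 'Border Size' can be illustrated with a toy search that only looks for the left page edge within a limited band of columns. This is a sketch under stated assumptions: it treats the border size as a pixel count measured from the image edge and works on a single binarized row. The `find_left_edge` helper is hypothetical, and how Grooper actually measures and applies 'Border Size' is not specified here.

```python
from typing import List, Optional


def find_left_edge(row: List[int], border_size: int) -> Optional[int]:
    """Return the first page column found within the left border band, or None.

    `row` is a binarized row (0 = background, 1 = page). Only transitions that
    start in the first `border_size` columns are considered.
    """
    limit = min(border_size, len(row) - 1)
    for x in range(limit):
        if row[x] == 0 and row[x + 1] == 1:
            return x + 1
    return None


# A page that starts at column 6, leaving a wide left margin of background.
row = [0] * 6 + [1] * 10

print(find_left_edge(row, border_size=4))  # None: the edge lies outside the band
print(find_left_edge(row, border_size=8))  # 6: a wider band reaches the edge
```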

Supported Pixel Formats

All common pixel formats are supported, including Pixel8bppGrayscale, Pixel24bppBgr, and Pixel1bppIndexed. Images are automatically converted as needed for edge detection and warping.
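
As an example of the kind of conversion involved, the sketch below turns 24-bit BGR pixel data into 8-bit grayscale using the widely used ITU-R BT.601 luma weights. The byte order (blue, green, red) follows the Pixel24bppBgr name; the choice of weights and the `bgr_to_gray` and `convert_row` helpers are assumptions for illustration, and Grooper's own conversion may use a different formula.

```python
from typing import List


def bgr_to_gray(b: int, g: int, r: int) -> int:
    """Convert one BGR pixel to 8-bit gray with BT.601 weights (integer math, rounded)."""
    return (114 * b + 587 * g + 299 * r + 500) // 1000


def convert_row(data: bytes) -> List[int]:
    """Convert a row of packed BGR triplets to a list of gray values."""
    if len(data) % 3 != 0:
        raise ValueError("row length must be a multiple of 3 bytes")
    return [bgr_to_gray(data[i], data[i + 1], data[i + 2]) for i in range(0, len(data), 3)]


# Three pixels: pure blue, pure green, pure red (stored in B, G, R order).
row = bytes([255, 0, 0, 0, 255, 0, 0, 0, 255])
print(convert_row(row))  # [29, 150, 76]
```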

Diagnostics

When run in diagnostic mode, Extract Page generates output images showing the binarized input, detected edges, and the quadrilateral region used for extraction. These diagnostics are essential for tuning the command for your specific document types.

Properties

Name    Type    Description
General
Line Detection
Warp Settings
Command Info

See Also

Used By

Notification