Grooper Help - Version 25.0
25.0.0024

Burst Book

Code Activity Grooper.Microform

Extracts individual page images from a photograph of a book, automatically splitting, cropping, and dewarping pages for downstream processing.

Remarks

The Burst Book activity transforms photographs of open books into clean, individual page images suitable for extraction, OCR, and review. It is designed for use in Batch Process workflows where books are photographed rather than scanned, and pages must be separated and corrected for curvature and perspective.

Overview

BurstBook can be executed at either the Batch Folder or Batch Page level:

  • Folder-Level:
    Processes a photograph of an open book (typically showing two pages), splitting it into separate page images. The input must be a photograph of a book on a solid black background. The photograph may show both pages, or one page may be covered with a black sheet.

  • Page-Level:
    Processes a single page photograph, dewarping and cropping the image so the page edges align with the image boundaries. The input should be a photograph of a single book page on a solid black background.

How It Works

BurstBook performs a series of image processing steps to extract high-quality page images:

  1. Preprocessing:
    The input image is resized to the configured resolution and optionally cropped to remove inconsistent borders using the 'Border Crop' property.

  2. Background Detection:
    The black background is analyzed to determine the bounds of the book. The 'Max Background Noise' property controls tolerance for non-black pixels.

  3. Page Splitting:
    For two-page images, the photograph is split into left and right pages. Margin detection and gutter analysis are used to accurately separate pages, even when a visible gutter is present.

  4. Dewarping and Curvature Correction:
    Each page image is dewarped using edge and contour detection, correcting for perspective and page curvature. The 'Edge Threshold' property controls sensitivity for edge detection, and the 'Corner Radius' property can be set to handle rounded page corners.

  5. Output:
    The resulting page images are saved as new Batch Pages (folder-level) or replace the original image (page-level), with resolution and size adjusted to match the configured 'Page Size'.
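
The background detection and page splitting steps above can be pictured with a small sketch. This is not Grooper's implementation. It is a minimal illustration, assuming a grayscale image stored as a list of rows (0 = black, 255 = white), a simple mean-brightness projection profile, and a gutter that appears as the darkest column near the middle of the book. The function names and the `max_noise` parameter are hypothetical stand-ins for the idea behind 'Max Background Noise'.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale rows, 0 = black, 255 = white


def column_profile(img: Image) -> List[float]:
    """Mean brightness of each column (a vertical projection profile)."""
    height = len(img)
    return [sum(row[x] for row in img) / height for x in range(len(img[0]))]


def row_profile(img: Image) -> List[float]:
    """Mean brightness of each row (a horizontal projection profile)."""
    return [sum(row) / len(row) for row in img]


def content_span(profile: List[float], max_noise: float) -> Tuple[int, int]:
    """First and last index whose mean brightness exceeds the noise tolerance."""
    hits = [i for i, value in enumerate(profile) if value > max_noise]
    if not hits:
        raise ValueError("no content brighter than the background tolerance")
    return hits[0], hits[-1]


def find_bounds(img: Image, max_noise: float = 16.0) -> Tuple[int, int, int, int]:
    """Return inclusive (left, top, right, bottom) of the bright book region."""
    left, right = content_span(column_profile(img), max_noise)
    top, bottom = content_span(row_profile(img), max_noise)
    return left, top, right, bottom


def find_gutter(img: Image, bounds: Tuple[int, int, int, int]) -> int:
    """Darkest column in the middle third of the book, assumed to be the gutter."""
    left, top, right, bottom = bounds
    cropped = [row[left:right + 1] for row in img[top:bottom + 1]]
    profile = column_profile(cropped)
    third = len(profile) // 3
    middle = range(third, len(profile) - third)
    return left + min(middle, key=lambda x: profile[x])


def split_pages(img: Image, max_noise: float = 16.0) -> Tuple[Image, Image]:
    """Crop to the book bounds and split into left and right pages at the gutter."""
    bounds = find_bounds(img, max_noise)
    left, top, right, bottom = bounds
    gutter = find_gutter(img, bounds)
    rows = img[top:bottom + 1]
    left_page = [row[left:gutter] for row in rows]
    right_page = [row[gutter + 1:right + 1] for row in rows]
    return left_page, right_page
```

In this sketch, raising `max_noise` makes the bounds detection ignore brighter background pixels, which mirrors the tuning advice for a background that is not perfectly black.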

Configuration Guidance

  • Input Requirements:

    • Use a solid black background for best results.
    • Ensure the book is fully visible and not too close to the image edge.
    • For folder-level processing, one page may be covered with a black sheet if only one page should be extracted.
  • Tuning Properties:

    • Adjust 'Resolution' to match the original photograph's DPI.
    • Set 'Page Size' to the physical dimensions of a single book page.
    • Use 'Border Crop' to remove inconsistent or non-black edges.
    • Set 'Corner Radius' if the book pages have rounded corners.
    • Lower 'Edge Threshold' for more sensitive edge detection; raise it to ignore faint edges.
    • Increase 'Max Background Noise' if the background is not perfectly black.
  • Diagnostics:
    Enable diagnostic mode to review intermediate images and logs, which assist in tuning and troubleshooting.
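
The relationship between 'Resolution' and 'Page Size' is ordinary DPI arithmetic, sketched below. The sketch assumes the page size is expressed in inches and the resolution in dots per inch. The units Grooper uses for these properties are not stated here, and both helper functions are hypothetical, not part of any Grooper API.

```python
from typing import Tuple


def output_pixels(page_width_in: float, page_height_in: float,
                  resolution_dpi: float) -> Tuple[int, int]:
    """Pixel dimensions of a page of the given physical size at the given DPI."""
    return round(page_width_in * resolution_dpi), round(page_height_in * resolution_dpi)


def estimate_dpi(page_width_px: int, page_width_in: float) -> float:
    """Effective DPI of a photograph, from a page's measured pixel and physical widths."""
    return page_width_px / page_width_in


# A 6 x 9 inch page at 300 DPI corresponds to 1800 x 2700 pixels.
print(output_pixels(6.0, 9.0, 300))
# A page that measures 1800 pixels across a 6 inch width was captured at about 300 DPI.
print(estimate_dpi(1800, 6.0))
```

Measuring a page's width in pixels in one representative photograph and dividing by its physical width is one way to pick a starting value for 'Resolution'.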

Typical Workflow

  1. Photograph the book on a solid black background.
  2. Attach the photograph to a Batch Folder or Batch Page.
  3. Configure BurstBook properties as needed.
  4. Run the activity at the folder or page level.
  5. Review diagnostic images to verify page extraction and dewarping.

Diagnostic Artifacts

When diagnostics are enabled, BurstBook generates the following artifacts to assist with configuration and troubleshooting:

  • Border Crop:
    Shows the effect of the 'Border Crop' property on the input image.
  • Crop Profile:
    Displays the projection profile used for background detection and cropping.
  • Bounds:
    Highlights the detected book bounds on the original image.
  • Cropped:
    Shows the image after cropping to the detected book area.
  • Page Extraction:
    Logs the timing and steps of page extraction.
  • Margin Detection:
    Visualizes detected margins used for splitting two-page images.
  • Gutter:
    Annotates the detected gutter (if present) between pages.
  • Warp1, WarpY, WarpX:
    Show intermediate and final dewarped images during curvature correction.
  • Contours Y:
    Displays the detected top and bottom contours of the page(s).
  • Filled:
    Shows the binarized image with gaps filled for contour detection.
  • Output image(s):
    The final extracted and dewarped page images.

Review these artifacts in the diagnostics viewer to fine-tune property values and ensure optimal extraction results.
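
To help interpret the Contours Y and WarpY artifacts, the sketch below shows one simple way vertical curvature can be removed once the top and bottom contours of a page are known: each column is resampled so that its top contour lands on the first output row and its bottom contour on the last. This is an illustrative assumption, not Grooper's algorithm. The function name, the nearest-neighbor sampling, and the list-of-rows image format are all hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, 0 = black, 255 = white


def dewarp_columns(img: Image, top: List[int], bottom: List[int],
                   out_height: int) -> Image:
    """Stretch each column between its top and bottom contour to a fixed height.

    top[x] and bottom[x] are the inclusive source rows of the page edge in
    column x. Sampling is nearest-neighbor to keep the sketch short.
    """
    width = len(img[0])
    out = [[0] * width for _ in range(out_height)]
    for x in range(width):
        span = bottom[x] - top[x]
        for y in range(out_height):
            # Fraction of the way down the output column, from 0.0 to 1.0.
            t = y / (out_height - 1) if out_height > 1 else 0.0
            src_y = top[x] + round(t * span)
            out[y][x] = img[src_y][x]
    return out


# A page whose edges drift downward from left to right across three columns.
warped = [
    [10, 0, 0],
    [20, 10, 0],
    [30, 20, 10],
    [0, 30, 20],
    [0, 0, 30],
]
flat = dewarp_columns(warped, top=[0, 1, 2], bottom=[2, 3, 4], out_height=3)
print(flat)  # [[10, 10, 10], [20, 20, 20], [30, 30, 30]]
```

If the contours shown in Contours Y do not follow the real page edges, a remapping of this kind would stretch the wrong region, which is why the contour artifacts are worth checking before the warped output.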

Usage Notes

  • BurstBook is intended for use with photographs, not scanned images.
  • For best results, use consistent lighting and avoid shadows or reflections.
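  • If extraction fails or produces poor results, review the diagnostic images to find the step that went wrong, then adjust the property that controls it: 'Max Background Noise' or 'Border Crop' if the detected book bounds are off, 'Edge Threshold' or 'Corner Radius' if the dewarped pages are distorted.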
  • This activity is typically used as part of a Batch Process Step in automated workflows, but can also be run manually for testing and tuning.

Properties

Name    Type    Description
General
Processing Options

See Also

Used By

Notification