Grooper Help - Version 25.0
25.0.0017 2,127

EPI Separation

Extractor Based Provider Grooper.Capture

Performs document separation using Embedded Page Information (EPI) extracted from each Batch Page.

Remarks

Overview

The EPI Separator class implements separation logic that uses page numbers, and optionally page counts, embedded in document content to determine document boundaries. It is best suited to document sets where every page carries a page number (and optionally a total page count) that an extractor can read, such as "Page 1 of 4".


Purpose and Role

  • EPI-Driven Separation:
    Automates the process of splitting Batch Pages into documents based on extracted page numbers and page counts.
  • Sequence Validation:
    Ensures that pages are grouped into documents only if their page numbers are in sequence, reducing the risk of mis-separation due to missing or out-of-order pages.
  • Loose Page Handling:
    Pages with missing or out-of-sequence EPI are left as loose pages for manual review or custom handling.
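
The grouping behavior described above can be illustrated with a short Python sketch. This is not Grooper's implementation. It assumes that a document starts on a page numbered 1, continues while each page number increases by one, and closes when the page number reaches the page count (if one was extracted). The function name `separate_by_epi` and the tuple layout are hypothetical, and keeping a partial run as a document is a choice made for this sketch only.

```python
from typing import List, Optional, Tuple

# EPI for one page: (page_no, page_count), or None when nothing was extracted.
# page_count may itself be None when only a page number was found.
Epi = Optional[Tuple[int, Optional[int]]]


def separate_by_epi(pages: List[Epi]) -> Tuple[List[List[int]], List[int]]:
    """Return (documents, loose_pages) as lists of 0-based page indexes."""
    documents: List[List[int]] = []
    loose: List[int] = []
    current: List[int] = []      # indexes of the document being built
    expected = 1                 # page number the next page must carry
    count: Optional[int] = None  # page count taken from the first page

    for index, epi in enumerate(pages):
        page_no, page_count = epi if epi is not None else (None, None)
        continues = bool(current) and page_no == expected

        if page_no != 1 and not continues:
            # Missing or out-of-sequence EPI: close any open run
            # (assumption of this sketch) and leave the page loose.
            if current:
                documents.append(current)
                current = []
            loose.append(index)
            continue

        if page_no == 1:
            # A page numbered 1 always starts a new document.
            if current:
                documents.append(current)
            current, count = [], page_count

        current.append(index)
        expected = page_no + 1

        # Close the document once the stated page count is reached.
        if count is not None and page_no >= count:
            documents.append(current)
            current = []

    if current:
        documents.append(current)
    return documents, loose


batch = [(1, 3), (2, 3), (3, 3), None, (1, 2), (2, 2), (5, 9), (1, None), (2, None)]
documents, loose = separate_by_epi(batch)
print(documents)  # [[0, 1, 2], [4, 5], [7, 8]]
print(loose)      # [3, 6]
```

In this sample batch, the page with no EPI (index 3) and the page numbered 5 with no preceding run (index 6) end up loose, while the three in-sequence runs become documents.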

Usage Scenarios

  • Multi-Page Forms:
    Separate documents where each form is printed with sequential page numbers, such as "Page 1 of 3", "Page 2 of 3", etc.
  • Faxed or Scanned Batches:
    Use EPI to reconstruct original documents from a continuous stream of scanned or faxed pages.
  • Quality Assurance:
    Identify and flag loose pages when EPI is missing or out of sequence, supporting robust QA workflows.

Best Practices

  • Configure the Value Extractor to reliably extract both the page number (group "PageNo") and, if available, the page count (group "PageCount") from each page.
  • Ensure that the extractor pattern matches the format used in your documents (e.g., "Page (?<PageNo>\d+) of (?<PageCount>\d+)").
  • Test separation logic on representative batches to verify correct grouping and handling of edge cases.
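
Before configuring the extractor, it can help to try the pattern against sample page text outside the product. The Python sketch below does this with a variation on the pattern quoted above. Two assumptions apply: Python's `re` module spells named groups `(?P<PageNo>...)` rather than `(?<PageNo>...)`, and the "of N" part is made optional here so that pages carrying only a page number still match. The helper name `extract_epi` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Python spelling of the "Page N of M" pattern; the page count is optional
# in this sketch, and matching is case-insensitive.
EPI_PATTERN = re.compile(
    r"Page\s+(?P<PageNo>\d+)(?:\s+of\s+(?P<PageCount>\d+))?",
    re.IGNORECASE,
)


def extract_epi(text: str) -> Optional[Tuple[int, Optional[int]]]:
    """Return (page_no, page_count) from page text, or None if no match."""
    match = EPI_PATTERN.search(text)
    if match is None:
        return None
    count = match.group("PageCount")
    return int(match.group("PageNo")), (int(count) if count else None)


print(extract_epi("Claim Form    Page 2 of 4"))  # (2, 4)
print(extract_epi("page 7"))                     # (7, None)
print(extract_epi("No footer on this page"))     # None
```

Running a handful of representative footers through a check like this makes format mismatches (different wording, missing counts, OCR noise) visible before a full batch is separated.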
