Grooper Help - Version 25.0
25.0.0017 2,127

Multi-Column

Collation Provider Grooper.Extract

Outputs a single instance in which the text has been reordered to follow the reading flow of a multi-column document.

Remarks

The MultiColumnProvider collation provider reconstructs the reading order of documents with multiple columns, outputting a single instance that reflects the intended flow of text across columns.

How it works:

  • Analyzes the spatial arrangement of extracted instances to detect columns based on configurable gap and size thresholds.
  • Reorders and merges content from detected columns and normal (single-column) regions to produce a single, linear output instance.
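
The two steps above can be illustrated with a minimal Python sketch. This is not Grooper's implementation: the `Box` type, the functions `group_rows`, `split_cells`, and `reading_order`, and the parameters `min_gap` and `min_lines` are all hypothetical names chosen for illustration. They are only loosely inspired by the 'MinimumGap' and 'MinimumLines' properties, whose exact semantics are not described on this page. The sketch assumes each box is one line fragment and that a column row always has text on every column.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Box:
    """A line fragment with its bounding box (hypothetical type, not a Grooper class)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def group_rows(boxes):
    """Group boxes into rows, top to bottom; each row is sorted left to right."""
    rows = []
    for box in sorted(boxes, key=lambda b: (b.top, b.left)):
        # A box joins the current row if it starts above the bottom of that row's first box.
        if rows and box.top < rows[-1][0].bottom:
            rows[-1].append(box)
        else:
            rows.append([box])
    return [sorted(row, key=lambda b: b.left) for row in rows]


def split_cells(row, min_gap):
    """Split one row into cell texts wherever the horizontal gap is at least min_gap."""
    cells = [[row[0]]]
    for prev, box in zip(row, row[1:]):
        if box.left - prev.right >= min_gap:
            cells.append([box])       # gap is wide enough: start a new column cell
        else:
            cells[-1].append(box)     # ordinary word spacing: same cell
    return [" ".join(b.text for b in cell) for cell in cells]


def reading_order(boxes, min_gap, min_lines=2):
    """Return the text as one linear string, reading column regions column by column."""
    rows = [split_cells(row, min_gap) for row in group_rows(boxes)]
    lines = []
    # Consecutive rows with the same number of cells are treated as one region.
    for cell_count, region in groupby(rows, key=len):
        region = list(region)
        if cell_count > 1 and len(region) >= min_lines:
            # Column region: read each column top to bottom, leftmost column first.
            for col in range(cell_count):
                lines.extend(r[col] for r in region)
        else:
            # Normal region (or too few lines to count as columns): keep row order.
            lines.extend(" ".join(r) for r in region)
    return "\n".join(lines)


page = [
    Box("Daily News", 0, 0, 200, 10),                          # full-width heading
    Box("A1", 0, 20, 90, 30), Box("B1", 110, 20, 200, 30),    # two-column rows
    Box("A2", 0, 40, 90, 50), Box("B2", 110, 40, 200, 50),
    Box("Page 1", 0, 60, 200, 70),                            # full-width footer
]
print(reading_order(page, min_gap=15))
# Daily News
# A1
# A2
# B1
# B2
# Page 1
```

A known limitation of this simplification: a column whose last lines have no neighbor in the other column falls out of the column region, which is the kind of case that additional thresholds (column width, height, parity) would need to handle.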

Configuration:

  • The properties 'MinimumGap', 'MaximumGap', 'MinimumHeight', 'MinimumColumnWidth', 'MinimumParity', and 'MinimumLines' control how columns are detected and how content is merged.

Use cases:

  • Extracting and reconstructing the logical reading order from documents with two or more columns, such as newspapers, reports, or forms with side-by-side content.
  • Ensuring that extracted data reflects the intended flow for downstream processing or export.

Properties

Name    Type    Description

Used By

Notification