Grooper Help - Version 25.0
25.0.0017 2,127

Split

Collation Provider Grooper.Extract

Splits the input at each match found by an extractor.

Remarks

The SplitProvider collation provider divides the input into multiple segments based on the positions of matches found by a child extractor.

How it works:

  • For each match found by the extractor, the input is split according to the selected mode.
  • The resulting segments are returned as separate output instances.

Split modes:

  • Begin: Splits at the beginning of each match. Each segment starts at a match and ends at the next match or the end of input.
  • End: Splits at the end of each match. Each segment starts where the previous match ended and runs through the end of the current match, so the matched content is included at the end of its segment.
  • Around: Splits around each match, excluding the matched content. The number of segments is always the number of matches plus one.
  • Between: Splits between matches, returning only the content between each pair of matches. At least two matches are required for output.
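
The four modes can be compared side by side with a short Python sketch. This is only an illustration of the behavior described above, not Grooper's implementation: the function name split_at_matches is hypothetical, a regular expression stands in for the child extractor, and the sketch assumes that in End mode the first segment starts at the beginning of the input.

```python
import re


def split_at_matches(text, pattern, mode):
    """Hypothetical sketch of the four split modes (not Grooper code)."""
    # (start, end) offsets of every match, in document order.
    spans = [m.span() for m in re.finditer(pattern, text)]

    if mode == "Begin":
        # Each segment starts at a match and runs to the next match or end of input.
        starts = [s for s, _ in spans]
        ends = starts[1:] + [len(text)]
    elif mode == "End":
        # Each segment ends at the end of a match.
        # Assumption: the first segment starts at the beginning of the input.
        ends = [e for _, e in spans]
        starts = [0] + ends[:-1]
    elif mode == "Around":
        # Matched content is excluded; yields (number of matches + 1) segments.
        starts = [0] + [e for _, e in spans]
        ends = [s for s, _ in spans] + [len(text)]
    elif mode == "Between":
        # Only the content between consecutive matches; needs two or more matches.
        starts = [e for _, e in spans[:-1]]
        ends = [s for s, _ in spans[1:]]
    else:
        raise ValueError(f"unknown mode: {mode}")

    return [text[a:b] for a, b in zip(starts, ends)]


sample = "a;b;c"
for mode in ("Begin", "End", "Around", "Between"):
    print(mode, split_at_matches(sample, ";", mode))
```

With the input "a;b;c" and ";" as the match, Begin keeps each delimiter at the start of its segment, End keeps it at the end, Around drops the delimiters and keeps everything else, and Between keeps only "b", the one piece that lies between two matches.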

Use cases:

  • Extracting paragraphs, sections, or records separated by a specific pattern (such as headings, page breaks, or delimiters).
  • Isolating content between repeated markers or extracting all content except for matched regions.

Example: Suppose you have a document with section headers like "Section 1", "Section 2", etc. By configuring an extractor to match these headers and using SplitProvider with the "Begin" mode, you can split the document into sections at each header.
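
The section-header example can be approximated outside Grooper with a few lines of Python. This is a minimal sketch under stated assumptions: the regular expression stands in for the configured extractor, and split_begin and the sample document are hypothetical names and data used only for illustration.

```python
import re

# Hypothetical sample document with two section headers and a preamble.
document = (
    "Preamble text\n"
    "Section 1\nFirst body\n"
    "Section 2\nSecond body\n"
)

# Stand-in for the child extractor: matches a header on a line of its own.
header = re.compile(r"^Section \d+$", re.MULTILINE)


def split_begin(text, extractor):
    """Sketch of "Begin" mode: one segment per match, starting at the match."""
    starts = [m.start() for m in extractor.finditer(text)]
    # Each segment ends where the next one starts, or at the end of input.
    ends = starts[1:] + [len(text)]
    return [text[a:b] for a, b in zip(starts, ends)]


sections = split_begin(document, header)
for section in sections:
    print(repr(section))
```

In this sketch each returned segment begins with its own header. The preamble before "Section 1" is not returned, which follows from the description of Begin mode: every segment starts at a match.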

Properties

Name    Type    Description

Used By

Notification