Grooper Help - Version 25.0
25.0.0017 2,127

Divider

Section Extract Method Grooper.Extract

Splits input content into section instances at each occurrence of a specified text pattern.

Remarks

The Divider section extraction method extracts repeating sections from a document by splitting the input at each match of a configured pattern. It is especially useful when similar records or blocks of data appear one after another, such as lists of employees or transactions.

The pattern used to divide the content is defined by the 'Divider Extractor' property, which is typically configured with a regular expression or other Value Extractor. Each time the pattern is matched, a new section instance is created according to the selected split position.

Example Scenario

Suppose you have a document containing multiple employee records, each beginning with the word EMPLOYEE: at the start of a line. You can configure the Divider to split the document at each occurrence of this marker using Pattern Match as the extractor.

Sample document:
ACTIVE EMPLOYEE REPORT

EMPLOYEE: Doe, Jane SSN: 000-00-0000 PHONE: (000) 000-0000 ADDRESS: 1234 S Main CITY: Anytown STATE: AA ZIP: 00000 DOB: 00/00/0000 EMAIL: janedoe@acme.com DEPARTMENT: IT

EMPLOYEE: Doe, John SSN: 000-00-0000 PHONE: (000) 000-0000 ADDRESS: 2233 N Elm CITY: Anytown STATE: AA ZIP: 00000 DOB: 00/00/0000 EMAIL: johndoe@acme.com DEPARTMENT: AP

EMPLOYEE: Smith, Kim SSN: 000-00-0000 PHONE: (000) 000-0000 ADDRESS: 8888 W 24th CITY: Anytown STATE: AA ZIP: 00000 DOB: 00/00/0000 EMAIL: kimsmith@acme.com DEPARTMENT: Admin

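Sample regular expression for the divider: \r\nEMPLOYEE:

The leading \r\n matches a carriage return and line feed, so the pattern only matches EMPLOYEE: when it immediately follows a Windows-style line break, that is, at the start of a line.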
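To see how this divider behaves on the sample report, here is a minimal Python sketch using the standard `re` module. It is an illustration, not Grooper's implementation: the function name `split_at_dividers`, the abbreviated sample text, and the choice to drop everything before the first match are all assumptions made for the example.

```python
import re

# Abbreviated version of the sample report, assuming CRLF line endings.
SAMPLE = (
    "ACTIVE EMPLOYEE REPORT\r\n"
    "\r\n"
    "EMPLOYEE: Doe, Jane SSN: 000-00-0000 DEPARTMENT: IT\r\n"
    "\r\n"
    "EMPLOYEE: Doe, John SSN: 000-00-0000 DEPARTMENT: AP\r\n"
    "\r\n"
    "EMPLOYEE: Smith, Kim SSN: 000-00-0000 DEPARTMENT: Admin\r\n"
)

# The sample divider pattern: a line break followed by the marker text.
DIVIDER = re.compile(r"\r\nEMPLOYEE:")


def split_at_dividers(text, pattern):
    """Return one section per divider match, each starting at its match.

    Hypothetical helper: text before the first match is discarded here,
    which is an assumption and may differ from the product's behavior.
    """
    starts = [m.start() for m in pattern.finditer(text)]
    ends = starts[1:] + [len(text)]
    return [text[s:e] for s, e in zip(starts, ends)]


sections = split_at_dividers(SAMPLE, DIVIDER)
for number, section in enumerate(sections, start=1):
    # strip() removes the line break that the pattern itself consumed.
    print(number, section.strip())
```

Each of the three employee records becomes its own section. The report title is not part of any section in this sketch because it precedes the first divider match.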

Configuration Guidance

  • Set the 'Divider Extractor' property to a Value Extractor that matches the start of each section (e.g., a regular expression for EMPLOYEE:).
  • Use the 'Split Position' property to control whether the divider is included at the start or end of each section, or whether only the content between dividers is returned.
  • Optionally, adjust the 'Line Offset' property to move the split point up or down by a specified number of lines relative to each match.
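The interaction between the split position and the line offset can be sketched in Python as follows. This is a simplified, line-based model and not Grooper's implementation: the function `divide`, the mode names `"start"`, `"end"`, and `"between"`, the sign convention (negative moves the split up, positive moves it down), and the treatment of text outside the first and last dividers are all assumptions chosen for illustration.

```python
import re


def divide(lines, pattern, position="start", line_offset=0):
    """Split a list of lines into sections around divider matches.

    Hypothetical model:
      "start"   - each section begins at a split line
      "end"     - each section ends with a split line
      "between" - only the lines strictly between consecutive split lines
    """
    last = len(lines) - 1
    # Line indices of divider matches, shifted by the offset and clamped.
    points = [
        min(max(i + line_offset, 0), last)
        for i, line in enumerate(lines)
        if pattern.search(line)
    ]
    if not points:
        return []
    if position == "start":
        bounds = zip(points, points[1:] + [len(lines)])
    elif position == "end":
        bounds = zip([0] + [p + 1 for p in points[:-1]], [p + 1 for p in points])
    elif position == "between":
        bounds = zip([p + 1 for p in points[:-1]], points[1:])
    else:
        raise ValueError(f"unknown position: {position!r}")
    return [lines[a:b] for a, b in bounds]


LINES = [
    "Record 1",
    "EMPLOYEE: Doe, Jane",
    "DEPARTMENT: IT",
    "Record 2",
    "EMPLOYEE: Doe, John",
    "DEPARTMENT: AP",
]
MARKER = re.compile(r"^EMPLOYEE:")

at_start = divide(LINES, MARKER, position="start")
at_end = divide(LINES, MARKER, position="end")
between = divide(LINES, MARKER, position="between")
shifted = divide(LINES, MARKER, position="start", line_offset=-1)
```

In this model, `at_start` begins each section on its EMPLOYEE: line, so the "Record 2" label ends up at the tail of the first section. With `line_offset=-1` the split moves up one line, and `shifted` keeps each "Record N" label together with the record it introduces.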

This method is ideal for extracting structured, repeating data from unstructured or semi-structured documents, enabling downstream processing and validation of each section as a discrete record.

