Grooper Help - Version 25.0
25.0.0017 2,127

Text Document - Split

Text Document Command Grooper.Extract

Splits a text document into smaller documents, either at a fixed line count or at split positions that an extractor identifies within the text content.

Remarks

The Split command divides a plain text file into multiple child documents within a Batch Folder, based on configurable rules.

Overview

  • Supports splitting by a fixed number of lines or by detecting start tags using a Value Extractor.
  • Allows skipping header lines, excluding regions, and assigning a Content Type to each child document.
  • Handles various text encodings and byte order marks (BOM).
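
The fixed-size mode can be pictured with a short Python sketch. This is only an illustration of the idea, not Grooper's implementation: the function name and the parameters lines_per_document and lines_to_skip are hypothetical stand-ins for the LinesPerDocument and LinesToSkip properties, and the returned strings stand in for child documents.

```python
from typing import List


def split_by_line_count(text: str, lines_per_document: int,
                        lines_to_skip: int = 0) -> List[str]:
    """Split text into chunks of at most lines_per_document lines.

    Hypothetical helper: parameter names mirror the documented property
    names but are not part of any Grooper API.
    """
    if lines_per_document <= 0:
        raise ValueError("lines_per_document must be positive")
    # Drop the header lines first, then cut the rest into equal chunks.
    lines = text.splitlines()[lines_to_skip:]
    return [
        "\n".join(lines[i:i + lines_per_document])
        for i in range(0, len(lines), lines_per_document)
    ]
```

In this sketch the last chunk may be shorter than the others when the line count does not divide evenly.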

Workflow

  1. Optionally skip a specified number of header lines.
  2. For each line:
    • If an exclusion tag is detected, ignore lines until the next start tag.
    • If a start tag is detected (or a line count is reached), split and create a new child document.
    • Optionally, split at an offset above the start tag.
  3. Assign the specified Content Type to each child document, if configured.
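
The start-tag path of the workflow above can be sketched in Python as follows. This is a rough illustration under several assumptions, not Grooper's implementation: regular expressions stand in for the Value Extractor, the offset is assumed to be counted in lines, the exclusion tag line itself is assumed to be dropped, and any text before the first start tag is kept as its own document. All function and parameter names are hypothetical. Content Type assignment is omitted.

```python
import re
from typing import List, Optional


def split_by_start_tag(text: str, start_tag: str,
                       exclusion_tag: Optional[str] = None,
                       lines_to_skip: int = 0,
                       start_offset: int = 0) -> List[List[str]]:
    """Split text into documents (lists of lines) at start-tag matches.

    Hypothetical helper: regex patterns stand in for the extractor, and
    start_offset is assumed to be a number of lines above the tag.
    """
    start_re = re.compile(start_tag)
    excl_re = re.compile(exclusion_tag) if exclusion_tag else None

    documents: List[List[str]] = []
    current: List[str] = []
    excluding = False

    # Step 1: skip header lines. Step 2: walk the remaining lines.
    for line in text.splitlines()[lines_to_skip:]:
        if start_re.search(line):
            excluding = False  # a start tag ends any excluded region
            # Move the last start_offset lines into the new document so
            # the split lands above the start tag.
            carried: List[str] = []
            if start_offset > 0 and current:
                carried = current[-start_offset:]
                current = current[:-start_offset]
            if current:
                documents.append(current)
            current = carried + [line]
        elif excl_re is not None and excl_re.search(line):
            excluding = True  # ignore lines until the next start tag
        elif not excluding:
            current.append(line)

    if current:
        documents.append(current)
    return documents
```

For example, with start_tag set to "^REPORT" and start_offset set to 1, a date line sitting directly above each "REPORT" line would travel with the document that follows it rather than the one before it.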

Configuration

  • Use LinesPerDocument for fixed-size splits, or StartTag to split based on content.
  • Use LinesToSkip to skip header lines.
  • Use ExclusionTag to ignore regions of the document.
  • Use StartOffset to split above the detected start tag.
  • Set AssignContentType to classify child documents.
  • Configure Encoding and DetectBOM for correct text interpretation.
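
To show why Encoding and DetectBOM matter, here is a minimal Python sketch of BOM-aware decoding. It is an assumption about the general technique, not a description of how Grooper reads files: the function name and the encoding and detect_bom parameters are hypothetical, and only the UTF-8, UTF-16, and UTF-32 byte order marks from the standard codecs module are checked.

```python
import codecs

# UTF-32 marks are checked before UTF-16 because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_text(data: bytes, encoding: str = "utf-8",
                detect_bom: bool = True) -> str:
    """Decode bytes, preferring a detected BOM over the configured encoding.

    Hypothetical helper: names are illustrative, not a Grooper API.
    """
    if detect_bom:
        for bom, name in _BOMS:
            if data.startswith(bom):
                # Strip the mark and decode with the encoding it implies.
                return data[len(bom):].decode(name)
    # No BOM found (or detection disabled): use the configured encoding.
    return data.decode(encoding)
```

In this sketch a detected BOM overrides the configured encoding, so a wrong Encoding setting only affects files that carry no mark.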

Usage Notes

  • Ideal for processing reports, logs, or multi-record text files.
  • Combine with Content Type and Data Model for downstream extraction.

Properties

Name    Type    Description
General
Encoding

See Also

Notification