Grooper Help - Version 25.0
25.0.0017 2,127

Section Extract Method

Embedded Object Grooper.Core

Defines an abstract extraction method for use with Data Section objects, enabling flexible subdivision of document content into section instances.

Remarks

The Section Extract Method class serves as the foundation for all extraction strategies used by Data Section objects in Grooper. It provides the interface and base functionality for identifying and extracting logical sections from document content, supporting a wide range of document structures and extraction scenarios.

How It Works

When a Data Section is configured, a specific Section Extract Method is assigned to control how section instances are located and extracted during the Extract activity. Derived classes implement concrete extraction logic, such as delimiter-based, pattern-based, or positional extraction, to match the structure of the target document.

The extraction method is responsible for analyzing the document content, identifying the boundaries of each section, and returning a collection of section instances for further processing and validation.
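
The relationship between the abstract method and its derived classes can be sketched in a few lines of Python. This is only a conceptual illustration, not Grooper's actual API: the names `SectionInstance`, `SectionExtractMethod`, `FullPageSketch`, `run_extract`, and the `pages_per_section` parameter are all hypothetical, and a document is reduced to a list of page texts.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class SectionInstance:
    """Hypothetical stand-in for one extracted section (not Grooper's type)."""
    first_page: int  # 1-based page number where the section starts
    last_page: int   # 1-based page number where the section ends
    text: str


class SectionExtractMethod(ABC):
    """Hypothetical abstract base: subclasses decide where sections begin and end."""

    @abstractmethod
    def extract(self, pages: list[str]) -> list[SectionInstance]:
        """Return the section instances found in the given page texts."""


class FullPageSketch(SectionExtractMethod):
    """Illustrative strategy: one section per group of `pages_per_section` pages."""

    def __init__(self, pages_per_section: int = 1) -> None:
        if pages_per_section < 1:
            raise ValueError("pages_per_section must be at least 1")
        self.pages_per_section = pages_per_section

    def extract(self, pages: list[str]) -> list[SectionInstance]:
        sections = []
        for start in range(0, len(pages), self.pages_per_section):
            chunk = pages[start:start + self.pages_per_section]
            sections.append(
                SectionInstance(
                    first_page=start + 1,
                    last_page=start + len(chunk),
                    text="\n".join(chunk),
                )
            )
        return sections


def run_extract(method: SectionExtractMethod, pages: list[str]) -> list[SectionInstance]:
    """The caller depends only on the abstract interface, not on a concrete strategy."""
    return method.extract(pages)
```

In this sketch the base class cannot be instantiated directly, which mirrors the note that an abstract method must be replaced by a concrete implementation before extraction can run.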

Configuration and Usage

  • Assign a Section Extract Method to each Data Section to match the layout and requirements of your documents.
  • Choose or implement a method that best fits the way sections are organized (e.g., by headers, patterns, or fixed positions).
  • The selected method runs automatically during extraction. To extend or customize its behavior, derive a new class from this base class.

Notes

  • Section Extract Method is an abstract class; use one of its concrete implementations for actual extraction.
  • Proper selection and configuration of the extraction method are critical for accurate section detection and data extraction.
  • For more information, see the documentation for Data Section, Section Instance, and related extraction classes.

Derived Types

There are 11 implementations of Section Extract Method.

  • AI Collection Reader: Extracts a Section Instance Collection from a document using generative AI.
  • AI Section Reader: Extracts a Section Instance from a document using generative AI.
  • AI Transaction Detection: Identifies the boundaries between transactions in a document.
  • Clause Detection: Detects and extracts clauses in natural language documents using semantic similarity.
  • Divider: Splits input content into section instances at each occurrence of a specified text pattern.
  • Fixed: Identifies section instances using a fixed rectangular region on a specific page.
  • Full Page: Creates a section instance from one or more pages in a document.
  • Geometric: Defines a rectangular region using anchors and extracts the character data bounded by the region as a Section Instance.
  • Nested Table: Splits a hierarchical table into sections for extraction as Section Instances.
  • Simple: Identifies section instances by matching contiguous segments of text using a configured extractor.
  • Transaction Detection: Detects periodic transactions in a document and generates a Section Instance for each transaction.
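
To make the idea of a divider-style method concrete, the sketch below splits plain text at each match of a regular expression. It illustrates the general technique only and is not how Grooper's Divider is implemented or configured: the function name `split_at_dividers` is hypothetical, and the choices to keep the divider text at the start of the section it opens and to keep any text before the first divider as its own section are assumptions.

```python
import re


def split_at_dividers(text: str, divider_pattern: str) -> list[str]:
    """Split text into sections, each starting at a match of divider_pattern.

    Assumptions (not confirmed Grooper behavior): the divider text stays at
    the start of the section it opens, and text before the first divider is
    kept as a leading section.
    """
    # Character offsets where each divider match begins.
    starts = [m.start() for m in re.finditer(divider_pattern, text, flags=re.MULTILINE)]
    # Make sure the first section begins at the start of the text.
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(text)]
    # Slice between consecutive boundaries and drop empty sections.
    sections = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [s for s in sections if s]


sample = "Intro\nInvoice 1\nitem A\nInvoice 2\nitem B\n"
parts = split_at_dividers(sample, r"^Invoice \d+")
# parts == ["Intro", "Invoice 1\nitem A", "Invoice 2\nitem B"]
```

A production method would also need to decide how to treat a divider that appears mid-line or spans a page break; those cases are left out of this sketch.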

Used By

Notification