Grooper Help - Version 25.0
25.0.0017 2,127

Extract

Code Activity Grooper.Activities

Performs data extraction on a document and saves the resulting Document Instance.

Remarks

The Extract activity loads a document's character and Layout Data into a Document Instance, then executes the extraction logic defined in the associated Data Model. This populates the Document Instance with Section Instances, Table Instances, and Field Instances that correspond to the Data Sections, Data Tables, and Data Fields defined in the Data Model.

Overview

Workflow Placement

  • Run this activity after Recognize so that the document has character and Layout Data.
  • Documents should have a Content Type assigned before extraction. If a document may arrive without one, set the 'Default Content Type' property to provide a fallback.
  • If the Classify activity is used to assign content types, it should run before extraction.
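
The fallback described above can be pictured with a short Python sketch. It is illustrative only and is not Grooper code: the function name resolve_content_type and its parameters are hypothetical, and it assumes that an assigned Content Type always takes precedence over 'Default Content Type'.

```python
from typing import Optional


def resolve_content_type(assigned: Optional[str], default: Optional[str]) -> Optional[str]:
    """Return the Content Type extraction would use, or None if there is none.

    Hypothetical helper that models the documented fallback only.
    """
    # Assumption: a Content Type already assigned to the document
    # (for example, by Classify) takes precedence.
    if assigned:
        return assigned
    # Otherwise fall back to the 'Default Content Type' setting, if one is set.
    return default
```

In this model, a document with no assigned type and no default resolves to None, which corresponds to the case where there is no Data Model to extract with.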

Multi-Type Extraction

  • If a document has multiple Content Types assigned (primary and secondary), extraction runs for all of them unless a 'Content Type Filter' is specified.
  • Use 'Content Type Filter' to restrict extraction to specific types or their descendants.
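
As a rough illustration of filtering by type or descendant, the Python sketch below models the selection logic. It is not Grooper code: the PARENTS hierarchy, the Content Type names, and both function names are hypothetical, and the sketch assumes that an empty filter means "extract for every assigned type".

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical Content Type hierarchy: child name -> parent name (None for a root).
PARENTS: Dict[str, Optional[str]] = {
    "Documents": None,
    "Invoice": "Documents",
    "Utility Invoice": "Invoice",
    "Remittance": "Documents",
}


def is_same_or_descendant(content_type: str, ancestor: str) -> bool:
    """True if content_type is ancestor itself or sits below it in PARENTS."""
    current: Optional[str] = content_type
    while current is not None:
        if current == ancestor:
            return True
        current = PARENTS.get(current)
    return False


def types_to_extract(assigned: Iterable[str],
                     type_filter: Optional[Iterable[str]] = None) -> List[str]:
    """Return the assigned Content Types that extraction would run for."""
    assigned_list = list(assigned)
    allowed = list(type_filter) if type_filter is not None else []
    # Assumption: no filter means extraction runs for every assigned type.
    if not allowed:
        return assigned_list
    # Keep a type when it matches a filter entry or descends from one.
    return [t for t in assigned_list
            if any(is_same_or_descendant(t, a) for a in allowed)]
```

With the hypothetical hierarchy above, a document assigned "Utility Invoice" and "Remittance" and filtered on "Invoice" would be extracted only as "Utility Invoice".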

Selective Extraction

  • Use 'Data Element Filter' to extract only specific Data Elements, preserving values for all others.
  • Use 'Rules' to apply post-processing logic, such as normalization, transformation, or validation, after extraction.
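
The effect of a 'Data Element Filter' followed by 'Rules' can be sketched in Python as a merge step and a post-processing step. This is a simplified model, not Grooper code: field values are plain strings in a dictionary, apply_extraction and trim_whitespace are hypothetical names, and the sketch assumes that an unfiltered run replaces all previous values.

```python
from typing import Callable, Dict, Iterable, Optional

Values = Dict[str, Optional[str]]
Rule = Callable[[Values], None]


def apply_extraction(existing: Values,
                     fresh: Values,
                     element_filter: Optional[Iterable[str]] = None,
                     rules: Iterable[Rule] = ()) -> Values:
    """Merge freshly extracted values into existing ones, then run rules."""
    if element_filter is None:
        # Assumption: without a filter, the new results replace everything.
        merged = dict(fresh)
    else:
        # With a filter, start from the existing values so that every
        # element outside the filter keeps the value it already had.
        merged = dict(existing)
        for name in element_filter:
            merged[name] = fresh.get(name)
    # Rules run after extraction, on the merged result.
    for rule in rules:
        rule(merged)
    return merged


def trim_whitespace(values: Values) -> None:
    """Example normalization rule: strip surrounding whitespace in place."""
    for name, value in values.items():
        if value is not None:
            values[name] = value.strip()
```

For example, re-extracting only a hypothetical "Total" field leaves an existing "Invoice Number" value untouched, and the trim_whitespace rule then normalizes the merged values.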

Post-Processing Options

  • 'Flag Invalid Items' flags folders whose extracted data contains validation errors.
  • 'Purge Alternate Candidates' removes alternate field value candidates before saving.
  • 'Purge Empty Fields' removes empty fields before saving.
  • 'Stats Logging' controls the level of extraction statistics recorded.
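
The two purge options can be illustrated with a small Python model. It is a sketch under stated assumptions rather than Grooper's implementation: FieldResult and post_process are hypothetical names, and "empty" is assumed to mean a missing or whitespace-only value.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class FieldResult:
    """Hypothetical stand-in for one extracted field."""
    name: str
    value: Optional[str]
    alternates: List[str] = field(default_factory=list)


def post_process(fields: Iterable[FieldResult],
                 purge_alternates: bool = False,
                 purge_empty: bool = False) -> List[FieldResult]:
    """Return copies of the fields as they would look just before saving."""
    result: List[FieldResult] = []
    for f in fields:
        # Assumption: a field is "empty" when it has no value or only whitespace.
        is_empty = f.value is None or f.value.strip() == ""
        if purge_empty and is_empty:
            continue
        # Drop alternate candidates when requested; otherwise copy them.
        alternates = [] if purge_alternates else list(f.alternates)
        result.append(FieldResult(f.name, f.value, alternates))
    return result
```

Both options reduce what is stored with the document; in this model, neither changes the primary value of a field that is kept.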

Example Scenarios

  • Standard Extraction: Run after Recognize and Classify to populate data for review and export.
  • Fallback Classification: Use 'Default Content Type' to ensure extraction proceeds even if classification is missing.
  • Selective Extraction: Use 'Content Type Filter' and 'Data Element Filter' to extract only relevant data for a specific process.
  • Data Normalization: Use 'Rules' to apply Data Rules that standardize or validate extracted data before export.

For more information, see Data Model, Content Type, Value Extractor, and Data Rule.

Properties

Name    Type    Description
General
Post Processing
Processing Options

See Also

Used By

Notification