Grooper Help - Version 25.0
25.0.0017 2,127

Extractor Builder

Control GrooperReview.Pages.Design

Provides an interactive user interface for building, configuring, and testing extractors within the Grooper web client.

Remarks

The Extractor Builder is a comprehensive UI component for designing, validating, and refining extraction logic for Extractor Nodes, Value Extractors, and Collation Providers in Grooper. It brings together property editing, test source selection, document viewing, result review, diagnostics, and AI-assisted authoring in a single, integrated workspace. The control is designed for both novice and advanced users, supporting rapid iteration and immediate feedback as extractor settings are changed.

Purpose and Functionality

  • Extractor Configuration:
    Edit extractor properties (such as regular expressions, options, and settings) using a categorized property grid with contextual help and validation.
  • Test Source Selection:
    Choose the document or data instance to use as input for extraction tests, using a tree viewer and action buttons.
  • Real-Time Testing:
    Run extraction tests automatically or on demand as extractor settings are modified, with results updated in real time.
  • Result Review:
    View all matches or hits produced by the extractor in a sortable, filterable result list. Select results to review details, highlight matches in the document, and inspect extraction metadata.
  • Document Viewer Integration:
    Preview the currently selected document, with support for zoom, navigation, and visual verification of extraction results. Selecting a result highlights the corresponding region in the document.
  • Paging and Navigation:
    Use the page navigator to browse through large sets of extraction results efficiently.
  • Diagnostics and Troubleshooting:
    Access detailed diagnostics, error messages, and performance metrics for each extraction test.
  • AI-Assisted Authoring:
    Use the AI-powered build button to generate or modify regular expressions from natural language instructions (when enabled).
  • Advanced Actions:
    Train results as positive or negative examples, visualize regular expressions, inspect data in detail, and define context zones interactively.
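
The configure, test, and review cycle described above can be illustrated outside Grooper. The following is a minimal sketch, assuming Python's standard `re` module as a stand-in for the extractor's pattern engine. The `Hit` type, the `run_test` function, the sample text, and the date pattern are all hypothetical; none of them is part of Grooper's API, and Grooper's own pattern syntax may differ from Python's.

```python
import re
from dataclasses import dataclass


@dataclass
class Hit:
    """One match, loosely analogous to a row in the result list (hypothetical type)."""
    value: str
    start: int  # character offset where the match begins
    end: int    # character offset just past the last matched character


def run_test(pattern: str, text: str) -> list[Hit]:
    """Apply the pattern to the test text and collect every match in document order."""
    return [Hit(m.group(0), m.start(), m.end()) for m in re.finditer(pattern, text)]


# Hypothetical test source: a snippet of document text.
sample = "Invoice 1001 dated 2024-01-15, due 2024-02-14."

# Hypothetical extractor setting: a pattern for ISO-style dates.
hits = run_test(r"\d{4}-\d{2}-\d{2}", sample)

# Each hit carries its offsets, which is the kind of information a viewer
# would need in order to highlight the matched region.
for hit in hits:
    print(f"{hit.value!r} at offsets {hit.start}-{hit.end}")
```

Changing the pattern and calling `run_test` again mirrors the edit-and-retest loop that the builder performs against the selected test source.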

UI Components

  • Toolbar:
    • Action buttons for testing, toggling auto-test, viewing diagnostics, launching the Data Inspector, visualizing regular expressions, training, and more.
    • AI build button for natural language regex authoring (if enabled).
  • Tab List:
    • Switch between "Expressions" and "Properties" views for extractor configuration.
  • Property Grid:
    • Edit extractor properties with contextual help and validation.
  • Test Source Panel:
    • Select the document or data instance to use as input for extraction tests.
  • Document Viewer:
    • Preview the selected document and highlight extraction results.
  • Result List:
    • Review all extraction results, with support for selection, sorting, and filtering.
  • Page Navigator:
    • Navigate through multiple pages of extraction results.
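
To show what a page navigator does with a large result set, here is a minimal sketch in Python. It assumes 1-based page numbers and a fixed page size; the function names and the page size of 10 are illustrative assumptions, not documented Grooper behavior.

```python
import math


def page_count(total_results: int, page_size: int) -> int:
    """Number of pages needed to show every result."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    return math.ceil(total_results / page_size)


def get_page(results: list, page_number: int, page_size: int) -> list:
    """Return the slice of results shown on a 1-based page number."""
    if page_size < 1 or page_number < 1:
        raise ValueError("page_size and page_number must be at least 1")
    start = (page_number - 1) * page_size
    return results[start:start + page_size]


# Hypothetical result set: 25 results, shown 10 per page.
results = list(range(1, 26))
pages = page_count(len(results), 10)
last_page = get_page(results, pages, 10)
print(pages, last_page)
```

A page number past the end yields an empty list rather than an error, which keeps the caller's navigation logic simple.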

Interactive Features

  • Automatic and Manual Testing:
    • Enable or disable auto-test mode. When enabled, tests run automatically after each change; otherwise, use the test button.
  • Result Selection and Highlighting:
    • Selecting a result in the list highlights the corresponding region in the document viewer.
  • Diagnostics and Inspection:
    • View detailed diagnostics for the last test, or launch the Data Inspector for in-depth review.
  • Training and Visualization:
    • Mark results as positive/negative for training, or visualize regular expressions and extraction logic.
  • Context Zone Definition:
    • Use the rubberband tool to define context zones by drawing rectangles on the document image.
  • AI Regex Generation:
    • Use the build button to generate or modify regular expressions from natural language instructions (if AI tools are enabled).
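
The difference between automatic and manual testing can be modeled in a few lines. This is a minimal sketch in Python under stated assumptions: `BuilderSession`, `set_property`, and the `"Pattern"` property name are hypothetical and do not correspond to Grooper classes or settings. It only illustrates the rule that a change triggers a test when auto-test is on, and waits for an explicit test request when it is off.

```python
from typing import Callable


class BuilderSession:
    """Hypothetical model of the auto-test toggle; not a Grooper class."""

    def __init__(self, run_test: Callable[[dict], None]) -> None:
        self.settings: dict = {}
        self.auto_test = True   # state of the auto-test toggle
        self.test_runs = 0      # how many tests have been executed
        self._run_test = run_test

    def set_property(self, name: str, value: object) -> None:
        """Record a settings change and retest only if auto-test is enabled."""
        self.settings[name] = value
        if self.auto_test:
            self.test()

    def test(self) -> None:
        """Run a test on demand, as the test button would."""
        self._run_test(dict(self.settings))  # pass a copy of the current settings
        self.test_runs += 1


session = BuilderSession(run_test=lambda settings: None)
session.set_property("Pattern", r"\d+")    # auto-test on: a test runs
session.auto_test = False
session.set_property("Pattern", r"\d{4}")  # auto-test off: no test runs
session.test()                             # explicit manual test
print(session.test_runs)
```

Turning auto-test off is useful when a test is slow or when several settings need to change together before the results are meaningful.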

Example

The following diagram illustrates the layout of the Extractor Builder:

┌────────────────────────────────────────────────────────────────────────────────────────────┐
│ Toolbar: {build} {test} {toggle} {diagnostics}                                             │
├────────────────────────────────────────────────────────────────────────────────────────────┤
│ ┌───────────── Tab List ─────────────┐ ┌──────────────── Document Viewer ────────────────┐ │
│ │ {Expressions} {Properties}         │ │                                                 │ │
│ │------------------------------------│ │                                                 │ │
│ │  Expression Grid or Property Grid  │ │                                                 │ │
│ │                                    │ └─────────────────────────────────────────────────┘ │
│ └────────────────────────────────────┘ ┌────────────────── Result List ──────────────────┐ │
│ ┌───────────── Test Source ──────────┐ │ {Pager} {inspect} {visualize} {weightings}      │ │
│ │                                    │ │ {train_positive} {train_negative} {rubberband}  │ │
│ └────────────────────────────────────┘ └─────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────────────────────────────┘
  • {build}: AI regex authoring (if enabled)
  • {test}: Run extraction test
  • {toggle}: Enable/disable auto-test
  • {diagnostics}: View diagnostics for last test
  • Expression Grid: Edit regular expressions
  • Property Grid: Edit extractor properties
  • Test Source: Select test document or data instance
  • Document Viewer: Preview and highlight extraction results
  • {Pager}: Navigate result pages (Page Navigator)
  • {inspect}: Launch Data Inspector
  • {visualize}: Visualize regex/extraction logic
  • {weightings}: View scoring details (Field Class only)
  • {train_positive}/{train_negative}: Train result as positive/negative (Field Class only)
  • {rubberband}: Define context zone (Field Class only)
  • Result List: Review and select extraction results

Usage Tips

  • Use the property grid and tab list to configure extractor logic and options.
  • Select a test source document to validate extraction patterns in real time.
  • Enable auto-test for rapid iteration, or use the test button for manual control.
  • Review results in the result list and use the pager for large result sets.
  • Use the document viewer to visually confirm extraction accuracy and context.
  • Access diagnostics, training, and visualization tools for advanced troubleshooting and optimization.
  • Use the AI build button to generate or refine regular expressions from natural language instructions (if available).

The Extractor Builder streamlines the process of developing, validating, and optimizing extraction logic in Grooper, providing immediate feedback, advanced diagnostics, and powerful authoring tools for solution designers and data engineers.

Command Buttons

Button    Shortcut Key    Summary

Child Controls

Name    Type    Summary

Used By

Notification