Grooper Help - Version 25.0
25.0.0040 2,257

VLM OCR

Code Activity Grooper.GPT

Analyzes pages using a Vision Language Model (VLM) and saves structured JSON analysis for downstream data extraction.

Remarks

Purpose:
VLM_OCR runs a multimodal VLM against a single page, sending the page image, a JSON schema, and instructions to the configured model. The resulting structured JSON is saved to the processed item for use by other Grooper components.
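As a rough illustration of the three inputs named above (page image, instructions, JSON schema), the following Python sketch bundles them into one JSON-serializable object. It is not Grooper's wire format: the function `build_request`, the key names, and the invoice schema are all hypothetical and exist only to make the shape of the inputs concrete.

```python
import base64
import json


def build_request(image_bytes: bytes, instructions: str, schema: dict) -> dict:
    """Bundle the page image, instructions, and schema into one dict.

    The key names are hypothetical; they are not Grooper's actual format.
    """
    return {
        "instructions": instructions,
        "schema": schema,
        # Binary image data is base64-encoded so the bundle stays valid JSON.
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
    }


# Hypothetical schema describing the structure the VLM is asked to return.
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}

# Placeholder bytes stand in for a real page image.
request = build_request(b"\x89PNG...", "Extract the invoice fields.", schema)
print(json.dumps(request, indent=2))
```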
How it works:

  • Prepares an image stream for the page and passes it, along with the instructions and the JSON schema, to the configured model.
  • Receives a JSON object containing the VLM analysis results.
  • Saves the result to the processed item using File Name (default: VLMAnalyze_{Model}.json).
  • The overwrite setting controls whether existing results are replaced.
  • If a JSON schema is provided, it is added to diagnostics as Schema.json.

Integration:
The output JSON file can be consumed by the JSON File quoting method for prompt injection in downstream LLM operations. Use ContentSelector (JSONPath) to select nodes for quoting, and RemovalSelector to redact sensitive data.
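To make the select-then-redact idea concrete, here is a minimal Python sketch. It is not Grooper's implementation and supports only a small JSONPath subset (`$`, `.name`, and `[*]`). The functions `parse`, `select`, and `redact`, the sample data, and the choice to redact before selecting are all assumptions made for illustration.

```python
import copy
import re

# One path step: ".name" or "[*]". Only this small JSONPath subset is handled.
_TOKEN = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[\*\]")


def parse(path: str) -> list:
    """Split a path such as '$.pages[*].text' into ['pages', '*', 'text']."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    steps, pos = [], 1
    while pos < len(path):
        match = _TOKEN.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported syntax at position {pos}")
        steps.append(match.group(1) if match.group(1) is not None else "*")
        pos = match.end()
    return steps


def _walk(doc, steps: list) -> list:
    """Follow the steps from doc and return every node reached."""
    nodes = [doc]
    for step in steps:
        matched = []
        for node in nodes:
            if step == "*" and isinstance(node, list):
                matched.extend(node)
            elif step != "*" and isinstance(node, dict) and step in node:
                matched.append(node[step])
        nodes = matched
    return nodes


def select(doc, path: str) -> list:
    """Return every node matched by path (the content-selection role)."""
    return _walk(doc, parse(path))


def redact(doc, path: str):
    """Return a deep copy of doc with matched nodes removed (the removal role)."""
    steps = parse(path)
    if not steps:
        raise ValueError("cannot redact the document root")
    result = copy.deepcopy(doc)
    last = steps[-1]
    for parent in _walk(result, steps[:-1]):
        if last == "*" and isinstance(parent, list):
            parent.clear()
        elif last != "*" and isinstance(parent, dict):
            parent.pop(last, None)
    return result


# Hypothetical analysis output; the field names are invented for this example.
analysis = {
    "pages": [
        {"text": "Invoice 1001", "ssn": "000-00-0000"},
        {"text": "Total due: 42.00", "ssn": "000-00-0001"},
    ]
}

redacted = redact(analysis, "$.pages[*].ssn")
quoted = select(redacted, "$.pages[*].text")
print(quoted)  # ['Invoice 1001', 'Total due: 42.00']
```

Because `redact` works on a deep copy, the original analysis data is left unchanged, so a selector can be tried out without altering the saved JSON.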
Configuration:

  • Provide a JSON schema matching the expected VLM output structure.
  • Set the image resolution to balance quality and throughput.
  • Use Custom File Name or rely on the default File Name for consistent file references.
  • For per-page quoting, run VLM_OCR on pages and set quoting methods accordingly.

Diagnostics:

  • Adds Schema.json to diagnostics if a JSON schema is set.
  • Includes converted response JSON and annotated images for troubleshooting.

Notes:

  • The JSON File quoting method expects the JSON file on the same Batch Object.
  • Use narrowly scoped JSONPath expressions to avoid injecting irrelevant data.
  • Use RemovalSelector to redact private data before quoting.

Properties:

  • The model used for VLM analysis.
  • Instructions for the VLM.
  • A JSON schema that defines the expected output structure.
  • JSONPath expressions for coordinate normalization.
  • An overwrite setting: existing results are overwritten if true.
  • The image resolution used for analysis.
  • Custom File Name: optional custom output file name.
  • File Name: the actual output file name.
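The file naming and overwrite behavior described in these remarks can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Grooper code: `resolve_file_name` and `save_result` are hypothetical helpers, an in-memory dict stands in for the files on the processed item, and the sketch assumes that `{Model}` in the default pattern is replaced by the model name as-is.

```python
def resolve_file_name(model: str, custom_file_name: str = "") -> str:
    """Use the custom name when one is given, else the documented default pattern.

    Assumption: {Model} is replaced with the model name without further changes.
    """
    if custom_file_name.strip():
        return custom_file_name.strip()
    return f"VLMAnalyze_{model}.json"


def save_result(files: dict, name: str, data: str, overwrite: bool) -> bool:
    """Store data under name.

    Returns True if the data was written, False if an existing result was kept.
    """
    if name in files and not overwrite:
        return False
    files[name] = data
    return True


# A dict stands in for the files attached to the processed item.
files = {}
name = resolve_file_name("example-model")

first = save_result(files, name, '{"total": 42}', overwrite=False)
second = save_result(files, name, '{"total": 43}', overwrite=False)
third = save_result(files, name, '{"total": 44}', overwrite=True)

print(name, first, second, third, files[name])
```

With overwrite off, the second call leaves the first result in place; only the third call, with overwrite on, replaces it.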

Properties

Name               Type     Description
General
Options
Custom File Name   String   An optional custom file name for saving analysis data.
File Name          String   The name of the file where the analysis data will be saved.

Processing Options

See Also

Used By

Notification