Grooper Help - Version 25.0
25.0.0040 2,257

Azure DI OCR

OCR Engine Grooper.Cloud

Recognizes machine print and hand print using the Azure Document Intelligence service.

Remarks

This OCR engine uses Azure Document Intelligence in combination with Grooper's internal OCR engines to generate detailed and highly accurate OCR results. It leverages the high accuracy of AI-powered recognition without sacrificing the detailed character-level positional data captured by traditional OCR engines.

Usage

Azure DI OCR is typically used in conjunction with the DI Analyze activity. In a batch processing workflow, the DI Analyze activity is executed first, at either the document or page level. This performs the Azure analysis and saves the results to the document or page as a JSON file.
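To make the saved analysis more concrete, the following is a minimal Python sketch of reading word-level results from such a JSON file. It assumes the file follows the general shape of an Azure Document Intelligence analyze response (an analyzeResult object containing pages, each with a words list). The exact file Grooper writes is not documented here, so treat the sample data and the iter_words helper as hypothetical.

```python
import json

# Trimmed, made-up sample shaped like an Azure Document Intelligence
# analyze response. Assumption: the saved JSON uses this general layout.
SAMPLE_RESULT = """
{
  "analyzeResult": {
    "modelId": "prebuilt-layout",
    "pages": [
      {
        "pageNumber": 1,
        "words": [
          {"content": "Invoice",
           "polygon": [1.0, 1.0, 2.0, 1.0, 2.0, 1.2, 1.0, 1.2],
           "confidence": 0.99},
          {"content": "1042",
           "polygon": [2.1, 1.0, 2.6, 1.0, 2.6, 1.2, 2.1, 1.2],
           "confidence": 0.97}
        ]
      }
    ]
  }
}
"""


def iter_words(result):
    """Yield (page_number, text, polygon, confidence) for every word."""
    # Accept either the full response or a bare analyzeResult object.
    analyze_result = result.get("analyzeResult", result)
    for page in analyze_result.get("pages", []):
        for word in page.get("words", []):
            yield (
                page.get("pageNumber"),
                word.get("content", ""),
                word.get("polygon", []),
                word.get("confidence"),
            )


# Hypothetical usage: list every word with its position and confidence.
words = list(iter_words(json.loads(SAMPLE_RESULT)))
for page_number, text, polygon, confidence in words:
    print(page_number, text, polygon, confidence)
```

Each word carries a bounding polygon and a confidence score but no per-character positions, which is the gap the subsequent Recognize step fills.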

For most use cases, the model used for analysis should be prebuilt-layout, which includes paragraphs, tables, figures and other layout information. This provides the word recognition necessary for Azure DI OCR to generate character-level OCR results during the subsequent Recognize activity, as well as the paragraph and table data needed for injecting HTML / markdown into data extraction operations. (See the DI Layout quoting method for more information on injecting layout data.)

After DI Analyze, a Recognize activity runs the Azure DI OCR engine, which uses the stored Azure DI analysis data combined with traditional OCR engine results to generate character-level OCR data.

Standalone Use as OCR Engine

Azure DI OCR can also be used as a standalone OCR engine. If no existing Azure data is found on the page or document, the engine sends the image to Azure and retrieves OCR results in real time.

In this mode, use the prebuilt-read model, which is the low-cost "OCR-only" option.
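The choice between stored and real-time analysis can be sketched as follows. This is an illustrative Python outline of the decision described above, not Grooper's implementation: get_analysis, fake_analyze, and the "stored" / "real-time" labels are hypothetical names, and only the two model IDs come from the text.

```python
from typing import Callable, Optional, Tuple

LAYOUT_MODEL = "prebuilt-layout"  # model suggested for DI Analyze workflows
READ_MODEL = "prebuilt-read"      # low-cost model suggested for standalone OCR


def get_analysis(
    stored: Optional[dict],
    analyze_now: Callable[[str], dict],
    standalone_model: str = READ_MODEL,
) -> Tuple[dict, str]:
    """Return (analysis, source), preferring results saved by an earlier step."""
    if stored is not None:
        # Data saved by a prior analysis step exists, so no Azure call is made.
        return stored, "stored"
    # Nothing saved: request an analysis in real time (hypothetical callable).
    return analyze_now(standalone_model), "real-time"


def fake_analyze(model_id: str) -> dict:
    """Stand-in for a real Azure request; returns an empty result."""
    return {"analyzeResult": {"modelId": model_id, "pages": []}}


# Case 1: an earlier step already saved layout results.
saved = {"analyzeResult": {"modelId": LAYOUT_MODEL, "pages": []}}
result_a, source_a = get_analysis(saved, fake_analyze)

# Case 2: standalone use, nothing saved yet.
result_b, source_b = get_analysis(None, fake_analyze)

print(source_a, result_a["analyzeResult"]["modelId"])
print(source_b, result_b["analyzeResult"]["modelId"])
```

Passing the Azure call in as a function keeps the sketch runnable without credentials; a real integration would replace fake_analyze with an actual request to the service.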
