Grooper Help - Version 25.0
25.0.0017 2,127

Deduplicate

Code Activity Grooper.Activities

Detects and processes duplicate child documents using configurable comparison and disposition options.

Remarks

The Deduplicate activity identifies groups of duplicate documents within the current processing scope and applies a user-configurable action to each group. Duplicates are detected by comparing documents using the selected 'Compare By' mode, which can be based on text similarity or data values. Once duplicate groups are identified, the activity processes each group according to the selected 'Disposition', such as moving, deleting, flagging, or stacking duplicates.
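As a rough illustration of comparing by data values, the sketch below groups documents whose selected field values match exactly. It is not Grooper's implementation. The function name `group_by_data_values`, the field names, and the exact-match rule are all assumptions made for this example.

```python
from collections import defaultdict


def group_by_data_values(docs, key_fields):
    """Group document names that share identical values for key_fields.

    docs maps a document name to a dict of extracted field values.
    Hypothetical helper: exact matching is an assumption, not Grooper behavior.
    """
    buckets = defaultdict(list)
    for name, fields in docs.items():
        # Build a comparison key from the chosen fields (missing fields become None).
        key = tuple(fields.get(field) for field in key_fields)
        buckets[key].append(name)
    # Only buckets holding more than one document are duplicate groups.
    return [names for names in buckets.values() if len(names) > 1]


docs = {
    "doc_a": {"invoice_no": "1001", "total": "250.00"},
    "doc_b": {"invoice_no": "1001", "total": "250.00"},
    "doc_c": {"invoice_no": "1002", "total": "250.00"},
}
duplicate_groups = group_by_data_values(docs, ["invoice_no", "total"])
print(duplicate_groups)
```

Here `doc_a` and `doc_b` form one group because both key fields match, while `doc_c` differs in `invoice_no` and stays unique.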

How It Works

  • All documents in the current scope are compared to each other using the configured 'Compare By' mode.
  • The 'Minimum Similarity' property sets the threshold at which two documents are treated as duplicates when comparing by text.
  • Each group of detected duplicates is processed according to the selected 'Disposition', which determines what happens to the duplicates (e.g., move, delete, flag, or append).
  • When using the 'Move' disposition, you can specify a target content type for the folder that will contain the duplicates.
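The steps above can be sketched in Python as a pairwise comparison followed by grouping. This is a minimal sketch under stated assumptions, not how Grooper performs the comparison internally: it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, assumes a 0.0 to 1.0 similarity scale, and the names `text_similarity` and `find_duplicate_groups` are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def text_similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 (stand-in measure)."""
    return SequenceMatcher(None, a, b).ratio()


def find_duplicate_groups(docs: dict, minimum_similarity: float) -> list:
    """Compare every pair of documents and group those meeting the threshold.

    docs maps a document name to its text. Grouping is transitive here
    (if A matches B and B matches C, all three share a group); that is an
    assumption of this sketch.
    """
    parent = {name: name for name in docs}

    def find(name):
        # Follow parent links to the group's representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(docs, 2):
        if text_similarity(docs[a], docs[b]) >= minimum_similarity:
            parent[find(b)] = find(a)  # merge the two groups

    groups = {}
    for name in docs:
        groups.setdefault(find(name), []).append(name)
    # Keep only groups that actually contain duplicates.
    return [members for members in groups.values() if len(members) > 1]


docs = {
    "doc_a": "Invoice 1001 total due 250.00",
    "doc_b": "Invoice 1001 total due 250.00",
    "doc_c": "Shipping manifest for pallet 7",
}
groups = find_duplicate_groups(docs, minimum_similarity=0.9)
print(groups)
```

Each returned group would then be handed to whatever disposition is configured. Which member of a group counts as the original is not specified in this topic, so the sketch stops at grouping.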

Configuration Guidance

  • Choose the appropriate 'Compare By' mode based on your document set and processing requirements.
  • Adjust 'Minimum Similarity' to control the strictness of duplicate detection when comparing by text.
  • Select the desired 'Disposition' to determine how duplicates are handled in your workflow.
  • If using 'Move', configure the 'Target Folder Content Type' to control the type of folder created for duplicates.
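To make the effect of the threshold concrete, the sketch below scores two near-identical texts and checks them against a looser and a stricter setting. It uses `difflib.SequenceMatcher` as an assumed stand-in for the text comparison and a 0.0 to 1.0 scale. The scale Grooper uses for 'Minimum Similarity' is not stated in this topic, and `is_duplicate` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def is_duplicate(a: str, b: str, minimum_similarity: float) -> bool:
    """Return True when the similarity ratio meets the threshold (illustrative only)."""
    return SequenceMatcher(None, a, b).ratio() >= minimum_similarity


original = "Invoice 1001 total due 250.00"
rescanned = "Invoice 1001 total due 250.08"  # differs by a single character

loose_match = is_duplicate(original, rescanned, 0.95)   # tolerates small differences
strict_match = is_duplicate(original, rescanned, 0.99)  # requires a near-exact match
print(loose_match, strict_match)
```

A lower threshold catches rescans and OCR variations at the cost of more false positives. A higher threshold keeps only near-exact matches.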

This activity is typically used in batch processing scenarios to ensure that only unique documents are retained, or to organize duplicates for review or downstream processing.

Properties

Name    Type    Description
General
Processing Options

See Also

Used By

Notification