Duplicate detection compares documents by content coverage rather than exact hash, so different versions of the same packet cluster together. Detection runs once and persists the resulting groups; re-run it after large uploads. New documents that join a duplicate group are excluded from retrieval until you promote one as canonical.
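To illustrate why coverage-based comparison groups versions that an exact hash would treat as unrelated, here is a minimal Python sketch. It is not Dewey's actual algorithm, which this page does not describe: the word-shingle approach, the function names (`shingles`, `coverage`, `same_hash`), and the shingle size `k=3` are all assumptions chosen for illustration.

```python
from __future__ import annotations

import hashlib
import re


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles in text (lowercased, punctuation ignored)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if len(words) < k:
        # Very short texts yield a single shingle, or none if the text is empty.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def coverage(a: str, b: str, k: int = 3) -> float:
    """Fraction of the smaller document's shingles that also appear in the other.

    Hypothetical metric for illustration only, not Dewey's scoring function.
    """
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / min(len(sa), len(sb))


def same_hash(a: str, b: str) -> bool:
    """Exact-match check: True only if the two texts are byte-for-byte identical."""
    digest_a = hashlib.sha256(a.encode("utf-8")).hexdigest()
    digest_b = hashlib.sha256(b.encode("utf-8")).hexdigest()
    return digest_a == digest_b


# Two versions of the same packet: v2 is v1 plus an added appendix note.
v1 = "The board packet includes the agenda, the budget summary, and the audit report."
v2 = v1 + " Appendix B was added after the March meeting."

print(same_hash(v1, v2))  # False: the hashes differ after any edit
print(coverage(v1, v2))   # 1.0: every shingle of v1 also appears in v2
```

In a scheme like this, documents whose coverage exceeds some threshold would land in the same group, which is why an appended or lightly revised version clusters with its original even though the hashes differ.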
Pick a canonical document and exclude the rest, or break up a duplicate group.
Each group is a cluster of documents that share substantial content. Promote one document as canonical to make it the version that retrieval and research return; the others stay in the collection but are excluded from query results. Dismiss the group entirely if the dedup engine clustered the documents incorrectly.
# Promote a specific doc as the canonical for its group
dewey duplicates resolve grp_xyz doc_a1
# Or break up a mis-clustered group
dewey duplicates dismiss grp_xyz