Using Remove Duplicates

The Remove Duplicates feature ensures that pages with the same content are not indexed multiple times, which helps improve search efficiency and result quality.

When enabled, the system indexes only one instance of any set of pages whose content is 100% identical and skips the other copies.

This feature is disabled by default. To filter out duplicate content during indexing, enable it manually in Collection Settings.

Two pages count as duplicates only if all of the following are identical:

  • Page title – the main heading or title of the page
  • Keywords – meta keywords specified for the page
  • Description – meta description of the page
  • All meta fields – any other custom or predefined meta fields
  • Main page content – the actual text and content of the page
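
The comparison above can be pictured as an exact-match fingerprint over those fields. The following Python sketch is only an illustration of that idea, not the product's actual implementation: the `Page` class, the `fingerprint` and `remove_duplicates` functions, and the choice of SHA-256 are all assumptions made for this example. It also assumes the page URL is not part of the comparison and that the first page encountered is the one that gets indexed.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Page:
    # Hypothetical page record; field names mirror the list above.
    url: str
    title: str = ""
    keywords: str = ""
    description: str = ""
    meta: dict = field(default_factory=dict)  # other meta fields
    content: str = ""


def fingerprint(page: Page) -> str:
    """Hash every compared field; the URL is deliberately left out."""
    parts = [page.title, page.keywords, page.description]
    # Sort meta fields so their order on the page does not matter.
    for name, value in sorted(page.meta.items()):
        parts.extend([name, value])
    parts.append(page.content)

    digest = hashlib.sha256()
    for part in parts:
        data = part.encode("utf-8")
        # Length-prefix each part so field boundaries stay unambiguous.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def remove_duplicates(pages: list) -> list:
    """Keep the first page seen for each fingerprint, skip the rest."""
    seen = set()
    unique = []
    for page in pages:
        fp = fingerprint(page)
        if fp in seen:
            continue  # 100% identical to a page already kept
        seen.add(fp)
        unique.append(page)
    return unique
```

Because the match is exact, a difference in any single field (for example, a different meta description) is enough for two pages to be treated as distinct in this sketch.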

Enabling this feature produces cleaner search results, removes repetitive entries, and helps users find unique content more easily.