Product Discovery Collection

A Product Discovery Collection is used to index information from a product dataset. In simple terms, each product in your dataset becomes one document in the collection.

Creating a Product Discovery Collection

Follow these steps to create a new Product Discovery Collection:

  1. Log in to the Admin Console
  2. Go to the Collections tab
  3. Click "Create a New Collection" or the "+" button
  4. Choose "Product Discovery Collection" as the type
  5. Enter a unique name for your collection (e.g., "ecommerce")
  6. Choose whether the collection should be Private or Public
  7. Set the Encryption level based on your security needs
  8. Select the content language (if not English)
  9. Click "Save" to finish creating your collection

  • After you create the Product Discovery Collection, you are automatically taken to the Data Settings tab.

Data Settings

  • In the Data Settings tab, you can import test data in three ways: Platforms (select an ecommerce platform), Upload (upload a product file), or Connect (use CDATA Connectors to pull records).


  • For Product Discovery Collection Data Settings, you need a product data file and a unique field inside that file.

  • To upload your product data file, click the Upload button. You can upload a CSV or JSON file.

  • After you upload a CSV file, the file path and other details will appear as shown in the image.

  • If you upload a JSON file, the folder path and unique field will appear after upload, as shown in the image.

  • Click Save to store your settings.

  • After saving, you can start indexing the uploaded CSV or JSON file.

  • The table below shows all the settings available in the Product Discovery collection.

Field | Description
Folder Path | The location inside the SearchBlox folder where your uploaded CSV/JSON file is stored. It is set when you upload the file.
Unique Field | The CSV column or JSON attribute that has a unique value in every record. This helps in indexing and searching.
Field Separator | The symbol used to separate values in a CSV file. (Default: ,)
Escape Character | The character used to handle special characters in the file. (Default: ;)
Quote Character | The character used to wrap text values. (Default: ’)
Use first record as header | Select this if the first row of the CSV file should be used as the header.
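
Before uploading, it can help to confirm that the column you plan to use as the Unique Field really has a distinct value in every record. The following is a minimal sketch in Python and is not part of SearchBlox. The function name `find_duplicate_values`, the sample data, and the column name `sku` are hypothetical. It assumes the first record is the header, uses the "," field separator listed above, and relies on Python's own default quote handling rather than the SearchBlox defaults.

```python
import csv
import io
from collections import Counter


def find_duplicate_values(csv_text, unique_field, separator=","):
    """Return values of `unique_field` that are empty or appear more than once.

    Assumes the first record of the CSV text is the header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=separator)
    if reader.fieldnames is None or unique_field not in reader.fieldnames:
        raise ValueError(f"column {unique_field!r} not found in header")
    # Missing cells come back as None, so normalize them to an empty string.
    counts = Counter((row[unique_field] or "").strip() for row in reader)
    return sorted(value for value, n in counts.items() if n > 1 or value == "")


# Hypothetical sample data: the SKU "A100" appears twice.
sample = (
    "sku,title,price\n"
    "A100,Red Mug,9.99\n"
    "A101,Blue Mug,9.99\n"
    "A100,Red Mug (copy),9.99\n"
)
print(find_duplicate_values(sample, "sku"))  # ['A100']
```

An empty result means every record has a distinct, non-empty value in that column, so it is a reasonable candidate for the Unique Field.
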
  • If you choose Connect, select the database type, enter the database URL, and provide the query to pull records from the database.


Field | Description
Database Type | Choose the type from the drop-down list. The default is Shopify.
Database URL String | CDATA connection string. Example: jdbc:shopify:AppId=xyxcvg;Password=scab6d19db4c47769f3240d;ShopUrl=https://xyz1-.myshopify.com;
SQL Query | Query to fetch the records from the database.
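
The connection string in the example above follows a `jdbc:<driver>:Key=Value;...;` pattern. As a rough illustration of that pattern only, the Python sketch below assembles and re-parses such a string. The helper names and placeholder values are hypothetical, the property names (AppId, Password, ShopUrl) are taken from the example above, and which properties your connector actually requires is something to confirm in the connector's own documentation.

```python
def build_connection_string(driver, properties):
    """Join key/value pairs into 'jdbc:<driver>:Key=Value;...;' form."""
    for key, value in properties.items():
        # ';' and '=' are separators in this format, so reject them here.
        if ";" in str(value) or "=" in key:
            raise ValueError(f"unsupported character in property {key!r}")
    body = "".join(f"{key}={value};" for key, value in properties.items())
    return f"jdbc:{driver}:{body}"


def parse_connection_string(conn):
    """Split a 'jdbc:<driver>:Key=Value;...' string back into its parts."""
    # Limit the split so that ':' inside values (such as URLs) is preserved.
    prefix, driver, body = conn.split(":", 2)
    if prefix != "jdbc":
        raise ValueError("expected a string starting with 'jdbc:'")
    pairs = (item.split("=", 1) for item in body.split(";") if item)
    return driver, {key: value for key, value in pairs}


# Placeholder values only; substitute your own credentials and shop URL.
conn = build_connection_string(
    "shopify",
    {
        "AppId": "YOUR_APP_ID",
        "Password": "YOUR_PASSWORD",
        "ShopUrl": "https://example.myshopify.com",
    },
)
print(conn)
# jdbc:shopify:AppId=YOUR_APP_ID;Password=YOUR_PASSWORD;ShopUrl=https://example.myshopify.com;
```

Parsing the string back with `parse_connection_string` is a quick way to spot a missing semicolon or a mistyped property name before pasting it into the Database URL String field.
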

Settings

Index-related settings for the Product Discovery collection are shown in this tab.

Field | Description
Relevance - Remove Duplicate | Prevents indexing of documents that have the exact same content. Default: NO
Relevance - Stemming | Matches different word forms of the same root word (e.g., run, running, ran). Default: YES
Relevance - Spelling Suggestions | Gives spelling suggestions during search. Default: YES
Keyword-in-Context Display | Shows search results with snippets taken from the part of the content where the keyword appears.
Enable Detailed Log Settings | When debug mode is on, detailed indexing logs are saved in index.log, including status, timestamps, and indexing time. Default: NO
Enable Content API | Allows indexing of document content that includes special characters.
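
To make the Remove Duplicate setting more concrete, here is a minimal Python sketch of exact-content duplicate filtering. It only illustrates the general idea of skipping documents whose content is identical to one already seen. How SearchBlox implements this internally is not described here, and the function name, the `content` key, and the sample products are hypothetical.

```python
import hashlib


def remove_exact_duplicates(documents):
    """Keep the first document for each distinct content string."""
    seen = set()
    kept = []
    for doc in documents:
        # Hash the content so that only a short digest is kept per document.
        digest = hashlib.sha256(doc["content"].encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(doc)
    return kept


# Hypothetical products: documents 1 and 3 have identical content.
docs = [
    {"id": "1", "content": "Red ceramic mug, 350 ml"},
    {"id": "2", "content": "Blue ceramic mug, 350 ml"},
    {"id": "3", "content": "Red ceramic mug, 350 ml"},
]
print([doc["id"] for doc in remove_exact_duplicates(docs)])  # ['1', '2']
```
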

Synonyms

Synonyms help find related documents even when the exact search term isn’t used. For example, if someone searches for “global,” results containing “world” or “international” will also appear.

You can also load synonyms from your existing collections.
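
The effect of synonyms can be pictured as query expansion: each search term is supplemented with its configured synonyms before matching. The Python sketch below is a simplified illustration under that assumption, not SearchBlox's implementation. The `expand_query` function and the `SYNONYMS` mapping are hypothetical, and the mapping reuses the "global" example from above.

```python
def expand_query(terms, synonyms):
    """Return the search terms followed by their synonyms, without repeats."""
    expanded = []
    for term in terms:
        # Look up synonyms case-insensitively; terms with none pass through.
        for candidate in [term, *synonyms.get(term.lower(), [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


SYNONYMS = {"global": ["world", "international"]}
print(expand_query(["global", "shipping"], SYNONYMS))
# ['global', 'world', 'international', 'shipping']
```
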

Schedule and Index

You can set how often indexing should run and the start date/time for indexing the collection based on the selected folder path. SearchBlox supports the following schedule frequencies:

  1. Once
  2. Hourly
  3. Daily
  4. Every 48 Hours
  5. Every 96 Hours
  6. Weekly
  7. Monthly
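
To see how a start date/time and a frequency translate into run times, consider the following Python sketch. It is an illustration only: the `next_run` function is hypothetical, and the treatment of Monthly (same day of the month, clamped to the last day of shorter months) is an assumption, since the exact rule SearchBlox applies is not stated here.

```python
import calendar
from datetime import datetime, timedelta

# Frequencies that repeat after a fixed amount of time.
FIXED_INTERVALS = {
    "Hourly": timedelta(hours=1),
    "Daily": timedelta(days=1),
    "Every 48 Hours": timedelta(hours=48),
    "Every 96 Hours": timedelta(hours=96),
    "Weekly": timedelta(weeks=1),
}


def next_run(start, frequency, runs_completed):
    """Return the time of the next run, or None if no run remains."""
    if frequency == "Once":
        return start if runs_completed == 0 else None
    if frequency == "Monthly":
        # Assumption: keep the same day of month, clamped for short months.
        month_index = start.month - 1 + runs_completed
        year = start.year + month_index // 12
        month = month_index % 12 + 1
        day = min(start.day, calendar.monthrange(year, month)[1])
        return start.replace(year=year, month=month, day=day)
    return start + FIXED_INTERVALS[frequency] * runs_completed


start = datetime(2025, 1, 31, 10, 0)
print(next_run(start, "Every 48 Hours", 2))  # 2025-02-04 10:00:00
print(next_run(start, "Monthly", 1))         # 2025-02-28 10:00:00
```
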

The Product Discovery collection allows you to perform the following operations.

Operation | Description
Enable Scheduler for Indexing | Turn this on to set when indexing should start and how often it should run.
Save | Saves your scheduler settings for the collection so indexing can run automatically.
View all Schedules | Opens the Schedules section where you can see all indexing schedules for all collections.


Models

Embedding

  • Provider specifies the embedding provider used to generate vector representations of documents.
  • Model defines the embedding model used to convert document content into vectors for semantic search.

Reranker

  • Provider specifies the reranker provider used for improving search result relevance.
  • Model defines the reranker model used to re-score and reorder search results based on relevance.

LLM

  • Provider specifies the Large Language Model provider used for AI-powered features.

  • Model defines the LLM used for tasks such as document enrichment, summaries, and SmartFAQs.

  • These settings override global configurations and apply only to the current collection.
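
The override behavior can be sketched as a simple merge in which collection-level values take precedence over global ones. The Python example below is a conceptual illustration, not SearchBlox code. The `effective_models` function, the dictionary layout, and the provider and model names are all hypothetical, and the fallback to global values for fields left empty at the collection level is an assumption.

```python
def effective_models(global_config, collection_config):
    """Collection-level values win; empty ones fall back to global values."""
    result = {}
    for section in ("embedding", "reranker", "llm"):
        merged = dict(global_config.get(section, {}))
        overrides = collection_config.get(section, {})
        # Only non-empty collection values replace the global ones.
        merged.update({key: value for key, value in overrides.items() if value})
        result[section] = merged
    return result


# Hypothetical provider and model names, for illustration only.
global_config = {
    "embedding": {"provider": "provider-a", "model": "embed-small"},
    "llm": {"provider": "provider-a", "model": "chat-large"},
}
collection_config = {
    "embedding": {"provider": "provider-b", "model": "embed-large"},
    "llm": {"model": ""},
}
settings = effective_models(global_config, collection_config)
print(settings["embedding"])  # {'provider': 'provider-b', 'model': 'embed-large'}
print(settings["llm"])        # {'provider': 'provider-a', 'model': 'chat-large'}
```
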

Viewing Search Results for Product Discovery Collections