CSV Collection

A CSV collection indexes records from CSV files.

Creating CSV Collection

Follow these steps to create a CSV Collection:

  1. Log in to the Admin Console
  2. Go to the Collections tab
  3. Click "Create a New Collection" or the "+" button
  4. Choose "CSV Collection" as the collection type
  5. Enter a unique name for your collection (e.g., "CSV")
  6. Set the Access (Private or Public)
  7. Choose Encryption based on your security needs
  8. Pick the content language (if not English)
  9. Click Save to create the collection

  • After creating the CSV collection, you will be automatically redirected to the CSV Settings tab.

CSV Collection Settings

  • CSV collection settings must be configured manually.

  • The required settings for a CSV collection are:

    • Folder Path
    • Unique Field
  • You must map a unique field from your CSV file. All records in the CSV file are indexed only when the selected field has a unique value in every row.

  • SearchBlox also provides some default settings when a new CSV collection is created. These settings can be changed as needed.

  • The following table lists all the available CSV collection settings.

Field | Description
--- | ---
Folder Path | Location of your CSV files. You can upload files or provide the direct folder path.
Field Separator | The character that separates values in the CSV. Default is a comma (,).
Escape Character | Character used to escape special characters. Default is ;.
Quote Character | Character used to wrap text values. Default is a single quote (').
Use first record as header | Enable this if the first row of your CSV contains column names.
Unique Field | The column name in the CSV that has unique values for every row. This is required for correct indexing and searching.
Relevance - Remove Duplicate | Prevents indexing of duplicate rows that have the same content. Default is NO.
Relevance - Stemming | Matches different forms of a word (run, running, ran). Default is YES.
Relevance - Spelling Suggestions | Gives spelling correction suggestions for searches. Default is YES.
Keyword-in-Context Display | Shows search results with a short description taken from the part of the content where the searched word appears.
Enable Detailed Log Settings | When enabled, detailed indexing logs are saved (timestamps, status, time taken). Default is NO.
Enable Content API | Allows indexing of document content that includes special characters.
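
To see how the Field Separator, Quote Character, Escape Character, and "Use first record as header" settings interact, it can help to parse a small sample outside SearchBlox first. The following is a minimal sketch using Python's standard csv module, not SearchBlox's own parser; the function name `read_records`, its parameters, and the sample data are assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data that uses the defaults listed in the table:
# comma separator and single-quote quote character.
SAMPLE = "id,title,notes\n1,'Widgets, large','in stock'\n2,'Gadgets','back-ordered'\n"


def read_records(text, separator=",", quote="'", escape=";", first_record_is_header=True):
    """Parse CSV text into a list of dicts keyed by column name."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=separator,
        quotechar=quote,
        escapechar=escape,
    )
    rows = list(reader)
    if not rows:
        return []
    if first_record_is_header:
        header, data = rows[0], rows[1:]
    else:
        # Without a header row, fall back to generated column names.
        header = [f"column{i + 1}" for i in range(len(rows[0]))]
        data = rows
    return [dict(zip(header, row)) for row in data]


records = read_records(SAMPLE)
for record in records:
    print(record)
```

Because the title in the first row is wrapped in the quote character, the comma inside it is kept as part of the value instead of being treated as a separator. If the quote character in the settings does not match the one used in the file, the same row would split into more columns than the header has.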

  • After entering the Folder Path and other fields, or after uploading the CSV file, click the Save button.
  • Once you click Save, a pop-up will appear. In this pop-up, you can either view the saved file using CSV Preview or start indexing by clicking Index.
  • Search-related settings can be found under the Settings tab.

❗️

Important Note:

If the Unique Field values are not unique, the number of results in the CSV collection will not match the number of records in the CSV file.
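
Before indexing, you can check a file for repeated values in the column you plan to map as the Unique Field. This is a minimal sketch that runs outside SearchBlox; the helper name `find_duplicate_keys`, the column name `id`, and the sample data are assumptions for illustration only.

```python
import csv
import io
from collections import Counter


def find_duplicate_keys(csv_text, unique_field, separator=",", quote="'"):
    """Return a dict of values that appear more than once in unique_field."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=separator, quotechar=quote)
    counts = Counter(row[unique_field] for row in reader)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample: the value 2 appears twice in the "id" column.
SAMPLE = "id,name\n1,alpha\n2,beta\n2,gamma\n"

duplicates = find_duplicate_keys(SAMPLE, "id")
if duplicates:
    print("Values that are not unique:", duplicates)
else:
    print("All values are unique.")
```

An empty result means every row has its own value in that column, which is the condition the note above describes for the indexed count to match the record count.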

Schedule and Index

The schedule sets how often and when indexing starts for the CSV collection, based on the folder path. SearchBlox supports the following schedule options:

  • Once
  • Hourly
  • Daily
  • Every 48 Hours
  • Every 96 Hours
  • Weekly
  • Monthly

The following actions can be performed in a CSV collection.

Activity | Description
--- | ---
Enable Scheduler for Indexing | Turn on scheduling and choose the Start Date and Frequency.
Save | Saves the schedule settings for the collection.
View all Schedules | Opens the Schedules page to see all collection schedules.
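
To reason about when a collection will next be indexed, the frequencies listed above can be modeled as intervals added to the previous start time. The sketch below is only an illustration of that arithmetic; how SearchBlox computes run times internally is not documented here, and the function `next_run` and its handling of "Monthly" (same day of the next month, clamped to the month's last day) are assumptions.

```python
import calendar
from datetime import datetime, timedelta

# Fixed-length frequencies from the list of schedule options.
FIXED_INTERVALS = {
    "Hourly": timedelta(hours=1),
    "Daily": timedelta(days=1),
    "Every 48 Hours": timedelta(hours=48),
    "Every 96 Hours": timedelta(hours=96),
    "Weekly": timedelta(weeks=1),
}


def next_run(previous_run, frequency):
    """Return the next start time, or None when the schedule runs only once."""
    if frequency == "Once":
        return None
    if frequency == "Monthly":
        # Assumption: same day of the following month, clamped to its last day.
        year = previous_run.year + previous_run.month // 12
        month = previous_run.month % 12 + 1
        last_day = calendar.monthrange(year, month)[1]
        return previous_run.replace(year=year, month=month, day=min(previous_run.day, last_day))
    return previous_run + FIXED_INTERVALS[frequency]


start = datetime(2024, 1, 31, 2, 0)
for option in ["Once", "Hourly", "Every 48 Hours", "Weekly", "Monthly"]:
    print(option, "->", next_run(start, option))
```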

Viewing Search Results for CSV Collections


Models

The Models page allows you to configure and override AI models used for embeddings, reranking, and LLM-based features within the collection.


Embedding

  • Provider specifies the embedding provider used to generate vector representations of documents.
  • Model defines the embedding model used to convert document content into vectors for semantic search.

Reranker

  • Provider specifies the reranker provider used for improving search result relevance.
  • Model defines the reranker model used to re-score and reorder search results based on relevance.

LLM

  • Provider specifies the Large Language Model provider used for AI-powered features.

  • Model defines the LLM used for tasks such as document enrichment, summaries, and SmartFAQs.

  • These settings override global configurations and apply only to the current collection.

Best Practices

👍
  • Make sure the CSV file has a unique field, so it can be mapped correctly for indexing all records.

  • If the folder has multiple CSV files, keep the same column structure in all files.
  • Check that quotes are properly opened and closed, and ensure the quote character is correctly set in the settings.
  • If a quote is opened but not closed, remove the quote character from that field.
  • If you have multiple collections, stagger their schedules so that no more than 2–3 collections index at the same time.
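
Two of the checks above, matching column structure across files and balanced quotes, can be run on a folder before you point a collection at it. The following is a rough sketch under stated assumptions: the helper names are hypothetical, the files are assumed to be UTF-8 with a header row, and the quote check is a simple heuristic (an odd number of quote characters on a line) that will also flag apostrophes in text and quoted values that span several lines.

```python
import csv
import tempfile
from pathlib import Path


def read_header(path, separator=",", quote="'"):
    """Return the first row of a CSV file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle, delimiter=separator, quotechar=quote), [])


def find_mismatched_headers(folder, separator=",", quote="'"):
    """Compare every *.csv header in the folder against the first file's header."""
    files = sorted(Path(folder).glob("*.csv"))
    if not files:
        return {}
    expected = read_header(files[0], separator, quote)
    mismatched = {}
    for path in files[1:]:
        header = read_header(path, separator, quote)
        if header != expected:
            mismatched[path.name] = header
    return mismatched


def lines_with_unbalanced_quotes(path, quote="'"):
    """Heuristic: report line numbers that contain an odd number of quote characters."""
    with open(path, encoding="utf-8") as handle:
        return [n for n, line in enumerate(handle, start=1) if line.count(quote) % 2 == 1]


# Demonstration with two hypothetical files in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    Path(folder, "a.csv").write_text("id,name\n1,'alpha'\n", encoding="utf-8")
    Path(folder, "b.csv").write_text("id,title\n2,'beta\n", encoding="utf-8")
    mismatched = find_mismatched_headers(folder)
    unbalanced = lines_with_unbalanced_quotes(Path(folder, "b.csv"))

print("Files with a different header:", mismatched)
print("Lines with an unclosed quote in b.csv:", unbalanced)
```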