Collections
Logical grouping of documents that share common indexing rules and storage paths.
In SearchBlox, a collection is the basic organizational unit for grouping and managing documents that are indexed under a shared set of rules and file paths. These rules define how documents are crawled, processed, and indexed. Collections keep data organized and let search queries run against a well-defined subset of documents.
The following types of collections can be created in SearchBlox:
- Web Collection
- Dynamic Auto Collection
- File System Collection
- Email Collection
- Database Collection
- MongoDB Collection
- CSV Collection
- Amazon S3 Collection
- Custom Collection
- AEM Collection
- Combined Collection
- SharePoint Online Collection
- SharePoint Server Collection
- Confluence Collection
- Google Drive Collection
Creating a Collection
- After logging in to the Admin Console, select the Create tab on the header panel and click the Collection icon.
- Choose a Collection Type.


- Choose a descriptive and unique name to easily identify the collection. The name cannot be changed after creation.
- Enable or disable RAG. When RAG is enabled, documents are chunked into paragraphs, embedded into vectors, and made available for Hybrid RAG search.
- Choose Private/Public Collection Access and Collection Encryption as per the requirements. Use private access for sensitive content and public access for general information.
- Choose the language of the content if it is other than English. Matching the language to the content improves search accuracy and relevance.
- Click Save to create the collection.
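
The choices made in the steps above can be pictured as a small settings record. The following Python sketch is only an illustration of the form fields and is not the SearchBlox API: the class name, field names, default values, and the validate_unique helper are all hypothetical.

```python
from dataclasses import dataclass

# Collection types as listed in this document.
COLLECTION_TYPES = {
    "Web", "Dynamic Auto", "File System", "Email", "Database", "MongoDB",
    "CSV", "Amazon S3", "Custom", "AEM", "Combined", "SharePoint Online",
    "SharePoint Server", "Confluence", "Google Drive",
}


# frozen=True is a simplification: the source only says the name is fixed
# after creation, while this sketch makes every field read-only.
@dataclass(frozen=True)
class CollectionSettings:
    name: str
    collection_type: str
    rag_enabled: bool = False   # assumed default, not taken from the source
    access: str = "public"      # "private" or "public"; assumed default
    encrypted: bool = False     # assumed default
    language: str = "English"   # hypothetical label for the content language

    def __post_init__(self):
        if not self.name.strip():
            raise ValueError("collection name must not be empty")
        if self.collection_type not in COLLECTION_TYPES:
            raise ValueError(f"unknown collection type: {self.collection_type}")
        if self.access not in ("private", "public"):
            raise ValueError("access must be 'private' or 'public'")


def validate_unique(settings, existing_names):
    """Reject a name that is already used by another collection."""
    if settings.name in existing_names:
        raise ValueError(f"collection name already in use: {settings.name}")


# Example: a private, RAG-enabled web collection for sensitive content.
settings = CollectionSettings(
    name="hr-policies", collection_type="Web", rag_enabled=True, access="private"
)
validate_unique(settings, existing_names={"public-site"})
```

Modeling the name as read-only mirrors the rule that it cannot be changed once the collection has been saved.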

Note
- Private collections can be accessed only through secure search authentication.
- Public collections can be accessed with or without secure search authentication.
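
The two access rules in this note reduce to a simple check. The sketch below is illustrative only; the function name and the string values are assumptions, and it does not reflect how SearchBlox implements authentication.

```python
def can_search(access: str, authenticated: bool) -> bool:
    """Return True if a search against the collection should be allowed."""
    if access == "private":
        # Private collections require secure search authentication.
        return authenticated
    if access == "public":
        # Public collections are reachable with or without authentication.
        return True
    raise ValueError("access must be 'private' or 'public'")
```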
Collection Dashboard Items
The Collections page displays the following headers:
- ID (Collection ID number)
- Type (Collection Type)
- RAG (Retrieval Augmented Generation)
- Collection Name (Unique Collection Name)
- Status (Indexing or Ready)
- Last Updated (Date and time the index was last updated)
- Documents (Number of documents currently in the index)
- Queries (Number of queries that each collection has processed)
- Language (Language used in the indexed data)
- Hybrid (Hybrid Search Type)
- Actions (Actions that can be performed on the collection: index, refresh, clear, configure, search, delete, and clone)
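
For readers who want to reason about the dashboard columns programmatically, one row of the Collections page could be modeled as below. This is a hypothetical sketch: the class, field names, helper functions, and sample values are invented for illustration and do not come from a SearchBlox export or API.

```python
from dataclasses import dataclass


@dataclass
class CollectionRow:
    collection_id: int
    collection_type: str
    rag: bool
    name: str
    status: str        # "Indexing" or "Ready"
    last_updated: str  # kept as text; the timestamp format is not specified here
    documents: int
    queries: int
    language: str
    hybrid: str


def ready_collections(rows):
    """Return only the rows whose index is ready."""
    return [row for row in rows if row.status == "Ready"]


def total_documents(rows):
    """Sum the Documents column across the given rows."""
    return sum(row.documents for row in rows)


# Illustrative data only.
rows = [
    CollectionRow(1, "Web", True, "public-site", "Ready",
                  "2024-01-01 10:00", 1200, 340, "English", "Hybrid"),
    CollectionRow(2, "CSV", False, "price-list", "Indexing",
                  "2024-01-02 09:30", 80, 12, "English", "None"),
]
ready = ready_collections(rows)
```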
