OneDrive Collection

A OneDrive Collection enables SearchBlox to connect to your Microsoft OneDrive or SharePoint environment using Azure AD authentication and index files and folders for search.

Once configured, users can search OneDrive and SharePoint content directly from the SearchBlox search interface without needing to switch between applications.

Before You Begin

Before configuring a OneDrive Collection, ensure you have the following:

  • An Azure AD application registration with the following details:

    • Client ID
    • Client Secret
    • Tenant ID
  • An Access Token and a Refresh Token obtained through the OAuth 2.0 authorization flow of your Azure AD application.

  • The Drive ID of the OneDrive or SharePoint document library you want to index. This can be retrieved using the Microsoft Graph Explorer.

  • A defined Root Path:

    • Use / to index all files and folders within the drive.
    • Specify a folder path such as /Documents/Reports to index only a specific folder and its contents.
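
The following minimal Python sketch shows one way to find a Drive ID and to check what a Root Path resolves to in Microsoft Graph. It assumes you have already saved the JSON returned by a Graph "list drives" request (for example GET /sites/{site-id}/drives, run in Graph Explorer). The helper names and the sample IDs are hypothetical, and the sketch does not describe how SearchBlox itself crawls the drive.

```python
import json
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def drive_ids_from_response(body: str) -> dict[str, str]:
    """Map drive name -> drive ID from a Graph 'list drives' JSON response."""
    payload = json.loads(body)
    return {drive["name"]: drive["id"] for drive in payload.get("value", [])}


def children_url(drive_id: str, root_path: str) -> str:
    """Build the Graph URL that lists the items directly under root_path."""
    path = root_path.strip().strip("/")
    if not path:
        # "/" means the root of the drive.
        return f"{GRAPH_BASE}/drives/{drive_id}/root/children"
    # Graph path-based addressing: root:/{path}:/children
    return f"{GRAPH_BASE}/drives/{drive_id}/root:/{quote(path)}:/children"


# Hypothetical sample response; real drive IDs are longer.
sample = '{"value": [{"id": "b!abc123", "name": "Documents"}]}'
drives = drive_ids_from_response(sample)
print(drives)
print(children_url(drives["Documents"], "/Documents/Reports"))
```

Opening the generated URL in Graph Explorer is a quick way to confirm that the Drive ID and Root Path point at the folder you expect before you enter them in SearchBlox.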

Creating a OneDrive Collection

To create a OneDrive Collection, follow these steps:

  • Log in to the Admin Console, go to the Collections tab, and click Create or the “+” icon.

  • Select OneDrive Collection as the collection type.

  • Enter a unique name for the collection. The name must be 3–36 characters long and may contain only alphanumeric characters and underscores (_).

  • Enable or disable RAG (Retrieval Augmented Generation) depending on your requirement. Enable it if the collection will be used for AI-powered search or chatbot responses.

  • Enable Knowledge Graph if you want SearchBlox to extract entities and relationships from the documents in the collection.

  • Choose whether the collection should be Private or Public. Enable Private Collection Access to restrict the collection to authenticated users only.

  • Configure Collection Encryption if you want to encrypt document content or specific metadata fields.

  • Select the Collection Language based on the language used in the documents. The default language is English.

  • Click Create to create the OneDrive Collection.

  • After the collection is created, you will be redirected to the OneDrive Settings / Authentication section to configure the connection and access details.
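
If you create collections from a script or a checklist, the naming rule above can be checked before you open the Admin Console. The sketch below is one reading of that rule (3 to 36 characters, letters, digits, and underscores only). The function name is hypothetical, and the check does not cover uniqueness, which only SearchBlox can verify.

```python
import re

# Assumed interpretation of the rule: 3-36 characters,
# ASCII letters, digits, and underscores only.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{3,36}")


def is_valid_collection_name(name: str) -> bool:
    """Return True if the name satisfies the assumed naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ("onedrive_reports", "ab", "my-drive"):
    print(candidate, is_valid_collection_name(candidate))
```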


Configuring OneDrive Settings

To configure OneDrive or SharePoint integration for your collection, follow these steps:

  1. Go to the OneDrive Settings tab within the collection.

  2. Enter the Client ID.
    This is the Application (Client) ID obtained from your Azure Active Directory (Azure AD) app registration. It uniquely identifies your application.

  3. Enter the Client Secret.
    Provide the client secret generated in your Azure AD app registration. This is used for secure authentication.

  4. Enter the Access Token.
    This is the OAuth access token used to access Microsoft Graph APIs on behalf of the application. It allows temporary access to OneDrive or SharePoint resources.

  5. Enter the Refresh Token.
    The refresh token is used to generate a new access token when the current one expires, ensuring continuous access without re-authentication.

  6. Enter the Drive ID.
    Specify the unique ID of the OneDrive or SharePoint document library that you want to crawl and index.

  7. Enter the Tenant ID.
    This is the Directory (Tenant) ID from your Azure AD. It identifies your organization within Microsoft’s cloud services.

  8. Enter the Root Path.
    Define the starting path within the drive for crawling.

    • Use / for the root directory
    • Or specify a folder path like /Documents/Reports
  9. Click Save to store the configuration and enable the system to access and index files from OneDrive or SharePoint.
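
To illustrate how the Client ID, Client Secret, Tenant ID, and Refresh Token fit together, the sketch below builds a standard OAuth 2.0 refresh-token request against the Microsoft identity platform token endpoint. It is not SearchBlox's implementation. The function name, the placeholder credentials, and the default scope value are assumptions, so use the scopes actually granted to your app registration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_refresh_request(
    tenant_id: str,
    client_id: str,
    client_secret: str,
    refresh_token: str,
    scope: str = "Files.Read.All offline_access",  # assumed scope, adjust as needed
) -> Request:
    """Build (but do not send) a refresh-token request for the v2.0 token endpoint."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "scope": scope,
    }
    return Request(
        url,
        data=urlencode(form).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values for illustration only.
request = build_refresh_request("tenant-123", "client-abc", "s3cret", "rt-xyz")
print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen(request) returns a JSON document that includes a new access_token and, typically, a new refresh_token. Keep real secrets out of scripts and source control.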



Schedule and Index

Use this section to set the frequency and the start date/time for indexing the collection. SearchBlox supports the following schedule frequencies:

  • Once
  • Hourly
  • Daily
  • Every 48 Hours
  • Every 96 Hours
  • Weekly
  • Monthly

The following operations can be performed in OneDrive collections:

Activity | Description
Enable Scheduler for Indexing | Once enabled, you can set the Start Date and Frequency.
Schedule | For each collection, indexing can be scheduled using the frequency options listed above.
View all Schedules | Redirects to the Schedules section, where all collection schedules are listed.
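
To get a feel for how a Start Date and a Frequency translate into run times, the sketch below computes the next run after a given one. It is only an illustration and not SearchBlox's scheduler. In particular, the handling of "Monthly" (same day next month, clamped to the month's last day) is an assumption, and the function name is hypothetical.

```python
import calendar
from datetime import datetime, timedelta
from typing import Optional

# Frequencies with a fixed interval between runs.
FIXED_INTERVALS = {
    "Hourly": timedelta(hours=1),
    "Daily": timedelta(days=1),
    "Every 48 Hours": timedelta(hours=48),
    "Every 96 Hours": timedelta(hours=96),
    "Weekly": timedelta(weeks=1),
}


def next_run(previous: datetime, frequency: str) -> Optional[datetime]:
    """Return the run time that follows `previous`, or None for a one-off run."""
    if frequency == "Once":
        return None
    if frequency == "Monthly":
        # Assumed rule: same day of the next month, clamped to its last day.
        year = previous.year + previous.month // 12
        month = previous.month % 12 + 1
        last_day = calendar.monthrange(year, month)[1]
        return previous.replace(year=year, month=month, day=min(previous.day, last_day))
    return previous + FIXED_INTERVALS[frequency]


start = datetime(2025, 1, 31, 9, 0)
for freq in ("Hourly", "Every 48 Hours", "Monthly"):
    print(freq, next_run(start, freq))
```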


Manage Documents Tab

  • Use the Manage Documents tab to perform the following operations:

    1. Filter
    2. View content
    3. View metadata
    4. Refresh
    5. Delete
  • To delete a file from your collection, enter the file path and click "Delete".

  • To see the status of an indexed file, click "View Metadata".

OneDrive Collection Models

The Models page allows you to configure and override AI models used for embeddings, reranking, and LLM-based features within the collection.


Embedding

  • Provider specifies the embedding provider used to generate vector representations of documents.
  • Model defines the embedding model used to convert document content into vectors for semantic search.

Reranker

  • Provider specifies the reranker provider used for improving search result relevance.
  • Model defines the reranker model used to re-score and reorder search results based on relevance.

LLM

  • Provider specifies the Large Language Model provider used for AI-powered features.

  • Model defines the LLM used for tasks such as document enrichment, summaries, and SmartFAQs.

  • These settings override global configurations and apply only to the current collection.
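
The override behaviour can be pictured as a per-feature merge in which any value set on the collection wins over the global value. The sketch below is a conceptual model only. The dictionary layout, the function name, and the provider and model names are hypothetical placeholders, not SearchBlox configuration keys.

```python
FEATURES = ("embedding", "reranker", "llm")


def effective_models(global_cfg: dict, collection_cfg: dict) -> dict:
    """Resolve provider/model per feature; collection values override global ones."""
    resolved = {}
    for feature in FEATURES:
        merged = dict(global_cfg.get(feature, {}))
        # Only non-empty collection-level values count as overrides.
        overrides = {k: v for k, v in collection_cfg.get(feature, {}).items() if v}
        merged.update(overrides)
        resolved[feature] = merged
    return resolved


# Placeholder names for illustration only.
global_cfg = {
    "embedding": {"provider": "provider-a", "model": "embed-small"},
    "reranker": {"provider": "provider-a", "model": "rerank-base"},
    "llm": {"provider": "provider-b", "model": "chat-general"},
}
collection_cfg = {"llm": {"provider": "provider-c", "model": "chat-docs"}}

print(effective_models(global_cfg, collection_cfg))
```

In this model, a collection that sets only the LLM keeps the global embedding and reranker settings, and other collections are unaffected.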