Azure DevOps Collection
SearchBlox includes a crawler that indexes project folders and files in an Azure DevOps organization. To create an Azure DevOps Collection, follow the steps below.
Creating Azure DevOps Collection
You can create an Azure DevOps Collection with the following steps:
- After logging in to the Admin Console, select the Collections tab and click Create a New Collection or the "+" icon.
- Choose Azure DevOps Collection as the Collection Type.
- Enter a unique name for the collection (for example, Project).
- Enable or disable RAG; enable it to use ChatBot and Hybrid RAG search.
- Choose Private or Public Collection Access and Collection Encryption as required.
- Choose the language of the content (if the language is other than English).
- Click Save to create the collection.
- Once the Azure DevOps collection is created, you will be taken to the Azure DevOps Settings tab.
Azure DevOps Settings Tab
| Field | Description |
|---|---|
| Organization | Name of the organization under which the Azure DevOps account was created. It appears in the Azure DevOps endpoint URL, https://dev.azure.com/<Organization-name>/, and is also shown while generating the PAT Token. |
| Project | Name of the project whose documents need to be indexed. It is listed in the Projects section of the Azure DevOps login or home page. |
| Scope | Scope of the given project. It commonly takes the form $/<Project-Name> and can also be found in the Repos section of the Azure DevOps account. |
| PAT Token | A Personal Access Token (PAT) is a unique token generated from your Azure DevOps dashboard. Generate it at https://dev.azure.com/<Organization>/_usersSettings/tokens (replace <Organization> with your organization name), or click the User Settings tab and select Personal Access Token in the Azure DevOps account. |
- Provide the Organization, Project, Scope and PAT Token.
- Click Save and Test Connection.
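Before entering the values in SearchBlox, it can help to confirm that the Organization, Project and PAT Token work together. The following is a minimal sketch, not part of SearchBlox: it assumes the public Azure DevOps REST endpoint for reading a single project and API version 7.0, and the helper names (`build_auth_header`, `project_url`, `default_scope`, `check_connection`) are hypothetical.

```python
import base64
import urllib.error
import urllib.request
from urllib.parse import quote

API_VERSION = "7.0"  # assumption: any API version supported by your organization works


def build_auth_header(pat: str) -> dict:
    """Build an HTTP Basic auth header from a PAT (empty user name, PAT as password)."""
    token = base64.b64encode(f":{pat}".encode("ascii")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


def project_url(organization: str, project: str) -> str:
    """URL of the Azure DevOps REST call that returns a single project."""
    return (
        f"https://dev.azure.com/{quote(organization, safe='')}"
        f"/_apis/projects/{quote(project, safe='')}?api-version={API_VERSION}"
    )


def default_scope(project: str) -> str:
    """The common scope form described in the table above: $/<Project-Name>."""
    return f"$/{project}"


def check_connection(organization: str, project: str, pat: str, timeout: float = 10) -> bool:
    """Return True if the project is readable with the given PAT."""
    request = urllib.request.Request(
        project_url(organization, project), headers=build_auth_header(pat)
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Anything other than 200 (for example a sign-in page) counts as a failure.
            return response.status == 200
    except urllib.error.URLError:
        # Covers HTTP errors such as 401/404 as well as network failures.
        return False
```

Calling `check_connection("<Organization>", "<Project>", "<PAT>")` with your own values should return `True` when the token can read the project. The PAT is best read from an environment variable rather than hard-coded.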
Settings Tab
The Settings sub-tab holds tunable parameters for the Azure DevOps crawler and the indexer. SearchBlox comes preconfigured with default parameters when a new collection is created. The settings that can be configured are listed below.
Generate Title, Description and Topics using SearchAI PrivateLLM and Enable Hybrid Search:
- Choose and enable Generate Using LLM and Auto Relevance.
- Clicking Compare Keyword Search with Hybrid redirects to the Comparison Plugin.
| Settings | Description |
|---|---|
| Title | Generates concise and relevant titles for the indexed documents using LLM. |
| Description | Generates the description for indexed documents using LLM. |
| Topic | Generates relevant topics for indexed documents using LLM, based on the document's content. |
| Auto Relevance | Enables or disables Hybrid Search for automatic relevance ranking. |

| Setting | Description |
|---|---|
| Remove Duplicates | When enabled, prevents indexing of duplicate documents. |
| Stemming | When stemming is enabled, inflected words are reduced to their root form. For example, "running", "runs", and "ran" are inflected forms of "run". |
| Spelling Suggestions | Provide spelling suggestions for the collection. The default is YES. |
| Keyword-in-Context Display | Returns search results with the description taken from the content areas where the search term occurs. |
| HTML Parser Setting | The setting configures the HTML parser to read the description for a document from one of the HTML tags: META, H1, H2, H3, H4, H5, H6. |
| Maximum Document Age | Specifies the maximum allowable age in days of a document in the collection. |
| Maximum Document Size | Specifies the maximum allowable size in kilobytes of a document in the collection. |
| Logging | Provides detailed indexer activity in <SearchBlox_installation_dir>/webapps/ROOT/logs/index.log. When logging or debug logging mode is enabled, index.log records: the list of files that are crawled; the processing done on each file, with a timestamp of when processing starts, whether indexing takes place or the URL is skipped, and whether the file gets indexed; the timestamp of when indexing completed and the time taken to index each page; the last modified date of the file; whether the file was skipped and why. All data is available as separate entries in index.log. |
| Enable Content API | Provides the ability to crawl the content with special characters included. |
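To illustrate how the Maximum Document Age and Maximum Document Size settings in the table above interact, here is a minimal sketch of such a filter. It is not SearchBlox's implementation: it assumes age is measured from the file's last modified date, that 1 kilobyte is 1024 bytes, and that a document must satisfy both limits. The names `Limits` and `within_limits` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Limits:
    max_age_days: int  # Maximum Document Age, in days
    max_size_kb: int   # Maximum Document Size, in kilobytes


def within_limits(last_modified: datetime, size_bytes: int,
                  limits: Limits, now: datetime) -> bool:
    """Return True if a document passes both the age and the size limit."""
    age = now - last_modified
    size_kb = size_bytes / 1024  # assumption: 1 KB = 1024 bytes
    return age <= timedelta(days=limits.max_age_days) and size_kb <= limits.max_size_kb
```

For example, with `Limits(max_age_days=365, max_size_kb=10240)`, a 2 MB file modified last month passes, while a file modified three years ago does not, regardless of its size.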
Schedule and Index
Sets the frequency and the start date/time for indexing a collection. SearchBlox supports the following schedule frequencies:
- Once
- Hourly
- Daily
- Every 48 Hours
- Every 96 Hours
- Weekly
- Monthly
The following operations can be performed in Azure DevOps collections:
| Activity | Description |
|---|---|
| Enable Scheduler for Indexing | Once enabled, you can set the Start Date and Frequency |
| Schedule | For each collection, indexing can be scheduled based on the above options. |
| View all Schedules | Redirects to the Schedules section, where all the Collection Schedules are listed. |
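The relationship between a Start Date and a Frequency can be sketched as follows. This is an illustration only, not SearchBlox's scheduler: it assumes each run happens at a fixed interval counted from the start date, and it approximates Monthly as 30 days. The names `FREQUENCY_INTERVALS` and `next_run` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

# Interval for each repeating frequency listed above.
FREQUENCY_INTERVALS = {
    "Hourly": timedelta(hours=1),
    "Daily": timedelta(days=1),
    "Every 48 Hours": timedelta(hours=48),
    "Every 96 Hours": timedelta(hours=96),
    "Weekly": timedelta(weeks=1),
    "Monthly": timedelta(days=30),  # assumption: a month is approximated as 30 days
}


def next_run(start: datetime, frequency: str, now: datetime) -> Optional[datetime]:
    """Return the next scheduled run strictly after `now`, or None if there is none."""
    if now < start:
        return start  # the first run happens at the start date/time
    if frequency == "Once":
        return None   # a one-time schedule has already run
    interval = FREQUENCY_INTERVALS[frequency]
    completed_periods = (now - start) // interval
    return start + (completed_periods + 1) * interval
```

With a start of 1 January at 09:00 and a Daily frequency, a check made on 3 January at 10:00 yields 4 January at 09:00 as the next run.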
Manage Documents Tab
- Using the Manage Documents tab, you can perform the following operations:
  - Filter
  - View content
  - View metadata
  - Refresh
  - Delete
- To delete a file from your collection, enter the file path and click "Delete".
- To see the status of an indexed file, click "View Metadata".
Data Fields Tab
Using the Data Fields tab, you can create custom fields for search and view the Default Data Fields for a non-encrypted collection. SearchBlox supports four types of Data Fields:
- Keyword
- Number
- Date
- Text
Once the Data Fields are configured, the collection must be cleared and re-indexed for the changes to take effect.
To learn more about Data Fields, refer to Data Fields Tab.

