Confluence Connector

Configuring SearchBlox

Before using the Confluence connector, install SearchBlox and create a Custom Collection.

Configuring Confluence Connector

Important points to note before running the connector

  • Run the fresh run Confluence connector first, before the refresh run connector.
  • Extract all files from the connector archive into the same folder.
  • Create a data folder on your drive where files will be stored temporarily, and specify its path in the yml files.
  • Download the zoneinfo file and create a system variable as described in the Prerequisites section below.
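
As a minimal sketch of the data-folder step on Linux, the shell function below creates the folder and checks that it is writable. The function name `ensure_data_dir` is an illustrative assumption and is not part of the connector.

```shell
# Create the connector's data folder if needed and confirm it is writable.
# $1: folder that will be given as data-directory in the yml files
ensure_data_dir() {
    dir=$1
    mkdir -p "$dir" || return 1
    if [ ! -w "$dir" ]; then
        echo "data folder is not writable: $dir" >&2
        return 1
    fi
    echo "data folder ready: $dir"
}
```

A call such as `ensure_data_dir /opt/goconfluence/data` (a hypothetical Linux counterpart of the sample Windows path) can be made before the path is entered in the yml files.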

Steps for encrypting password

  • Encrypt the password using the encrypt.exe utility included in the downloaded archive.
  • Run the exe file in a command prompt as administrator and enter your password; the encrypted password is then generated.
  • Copy the encrypted password into your yml file in place of the actual password (see the sample yml files in the downloaded archive).

Prerequisites

For Windows:
1) Download zoneinfo.zip into any directory (preferably on the C drive).
https://sbgoclient.s3.amazonaws.com/confluence/confluence_refresh/zoneinfo.zip
2) Add a system variable named ZONEINFO whose value is the path to zoneinfo.zip.
For example: variable name ZONEINFO, value C:\zoneinfo.zip
3) Save the environment variable.

For Linux:
1) Download zoneinfo.zip into /usr.
https://sbgoclient.s3.amazonaws.com/confluence/confluence_refresh/zoneinfo.zip
2) Open the .bash_profile file using the command below:
vi ~/.bash_profile
3) Add the following line to the file and save:
export ZONEINFO=/usr/zoneinfo.zip
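
For scripted setups, the profile edit can also be done without an editor. The sketch below appends the export line only if it is not already present, so it can be run more than once; `add_zoneinfo_export` is a hypothetical helper name, and the paths passed to it are the ones used in the steps above.

```shell
# Append the ZONEINFO export to a profile file unless it is already there.
# $1: path to the zoneinfo archive (this guide uses /usr/zoneinfo.zip)
# $2: profile file to edit (this guide uses ~/.bash_profile)
add_zoneinfo_export() {
    zip_path=$1
    profile=$2
    line="export ZONEINFO=$zip_path"
    # Create the profile file if it does not exist yet.
    touch "$profile" || return 1
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF -e "$line" "$profile"; then
        printf '%s\n' "$line" >> "$profile"
    fi
}
```

For example, `add_zoneinfo_export /usr/zoneinfo.zip "$HOME/.bash_profile"` followed by `. "$HOME/.bash_profile"` makes the variable available in the current shell.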

Note:

These steps are mandatory for the connector to refresh data. Complete this setup before running the fresh run connector.

Contact support@searchblox.com to request the download link for the SearchBlox Confluence connector. The steps below include example paths for both Windows and Linux. On Windows, the connector is installed on the C drive. On Linux, the connector must be installed in /opt.

Steps to Configure and Run the Confluence Fresh Run Connector

  • Download the SearchBlox Confluence connector.
  • Extract the downloaded zip archive under C:\ (Windows) or /opt (Linux).

Note:

On Linux, use the chmod command to grant the permissions on the /opt folder that are needed to write log files and execute jar files.
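
The permission step might look like the following sketch. It assumes the connector was extracted into its own folder under /opt and that the Linux executables follow the `confluence_*_linux*` naming used in this guide; the function name `prepare_connector_dir` is hypothetical, and the exact permissions your environment needs may differ.

```shell
# Give the owning user read/write access to the connector folder and
# mark the connector executables as runnable.
# $1: folder the connector archive was extracted into
prepare_connector_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir" >&2
        return 1
    fi
    # u+rwX: read/write for the owner; execute only on directories and
    # on files that already have an execute bit set.
    chmod -R u+rwX "$dir" || return 1
    # Assumed naming pattern for the Linux executables; files that do
    # not exist are skipped.
    for bin in "$dir"/confluence_*_linux*; do
        if [ -f "$bin" ]; then
            chmod u+x "$bin" || return 1
        fi
    done
    return 0
}
```

Run it as the user that will start the connector, passing the folder you extracted the archive into.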

  • Configure the confluence.yml file, which contains the Confluence and SearchBlox properties listed below:

username

User Name for Confluence account

password

Encrypted Password for Confluence account

data-directory

Data Folder where the data needs to be stored. Make sure it has write permission.

api-key

SearchBlox API Key

colname

The name of the custom collection in SearchBlox.

url

SearchBlox URL

confluenceurl

Confluence URL

exclude-formats

File formats to exclude. Give each file extension with its leading dot, separated by commas.
E.g.: .war,.zip

Note: regular expressions are not allowed.

exclude-folders

Folders to exclude in Confluence, given as the subpath folder in the Confluence URL.
E.g.: folder1, folder2

Note: regular expressions are not allowed; the full folder name has to be provided.

max-folder-size

Maximum size of the static folder, in MB, after which it is swept.

log-file-maxSize

Size in megabytes after which a new log file is created.

log-file-maxBackups

Number of backups after which the log file should be deleted.

log-file-maxAge

Number of days after which log files should be deleted.

url, servlet-url and delete-api-url

Make sure that the port number in these URLs is correct. The sample values are correct as given if your SearchBlox instance runs on port 8080.
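
One way to check the port at a glance is to print the port found in each of the three SearchBlox URLs. The sketch below assumes the one-line `key: value` layout used in the sample yml files and URLs that state the port explicitly; `list_searchblox_ports` is a hypothetical helper name.

```shell
# Print "<property> <port>" for each SearchBlox URL property in a yml file.
# $1: path to confluence.yml or confluence_refresh.yml
list_searchblox_ports() {
    # Lines are matched from the start, so confluenceurl is not reported.
    sed -n -E 's#^(url|servlet-url|delete-api-url):[[:space:]]*https?://[^/:]+:([0-9]+).*#\1 \2#p' "$1"
}
```

If the three printed ports differ from each other or from the port SearchBlox listens on, correct the yml file before running the connector.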

  • Sample contents of confluence.yml:
#User credentials
username: test@searchblox.com
password: encrypted password
#Data Folder where the data needs to be stored Make sure it has write permission
data-directory: C:\goconfluence\data
#SearchBlox API Key
api-key: 00A329C64C688AB15EB519E50BDCE318
#The name of the collection
colname: custom
#SearchBlox URL
url: http://localhost:8080/searchblox/rest/v2/api/
#confluence URL
confluenceurl: https://searchblox.atlassian.net/wiki
#The excluded formats won't be indexed
exclude-formats: [.war,.zip,.tar,.gz]
#The excluded folders won't be indexed. Note: no trailing or leading slashes. E.g.: test/searchblox
exclude-folders: [SBTES,SC,sd,SCON]
#Maximum size of the static folder, in MB, after which it is swept
max-folder-size: 2
#Megabytes after which a new log file is created
log-file-maxSize: 10
#Number of backups after which the log file should be deleted
log-file-maxBackups: 10
#Number of days after which log files should be deleted
log-file-maxAge: 30
#searchblox servlet url for auto delete functionality
servlet-url: http://localhost:8080/searchblox/servlet/SearchServlet
#searchblox delete api url for auto delete functionality
delete-api-url: http://localhost:8080/searchblox/api/rest/docdelete
  • Run confluence_fresh.exe on Windows, or ./confluence_fresh_linux32 or ./confluence_fresh_linux on Linux.
  • A file named last_run_date_time.yml is generated in the folder that contains the executables. The refresh connector needs this file in order to run.
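
Before the first run, it can help to confirm that no property was left out of the yml file. The sketch below reports properties that are missing or empty. Which properties the connector treats as mandatory is not stated in this guide, so the list in the loop is an assumption based on the sample file, and `check_connector_yml` is a hypothetical helper name.

```shell
# Report properties that are missing or empty in a connector yml file.
# $1: path to confluence.yml or confluence_refresh.yml
# Returns 0 if all listed properties are set, 1 if any is missing,
# and 2 if the file cannot be read.
check_connector_yml() {
    file=$1
    missing=0
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # Assumed set of required properties, taken from the sample file.
    for key in username password data-directory api-key colname url \
               confluenceurl servlet-url delete-api-url; do
        # Present means: a line starting with "key:" followed by a value.
        if ! grep -Eq "^${key}:[[:space:]]*[^[:space:]]" "$file"; then
            echo "missing or empty: $key" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_connector_yml confluence.yml && ./confluence_fresh_linux` starts the fresh run only when every listed property has a value.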

Steps to Configure and Run the Confluence Refresh Run Connector

  • Configure the confluence_refresh.yml file in the same way as the fresh run connector's yml file. As mentioned above, it contains both Confluence and SearchBlox properties. The only additional field is the time zone:
    customer-confluence-timezone: EST
    This is the time zone in which the Confluence server is running. For the supported values, see https://searchblox.s3.amazonaws.com/Connectors/timezone_confluence.txt

customer-confluence-timezone

The time zone in which the Confluence server is running.
E.g.: UTC, EST, CST
customer-confluence-timezone: UTC
Refer to the file below for the zone values that can be used:
https://searchblox.s3.amazonaws.com/Connectors/timezone_confluence.txt

The contents of the file will look like this:

#User credentials
username: test@searchblox.com
password: encryptedpassword
#Data Folder where the data needs to be stored Make sure it has write permission
data-directory: C:\goconfluence\data
#SearchBlox API Key
api-key: 00A329C64C688AB15EB519E50BDCE318
#The name of the collection
colname: custom
#SearchBlox URL
url: http://localhost:8080/searchblox/rest/v2/api/
#confluence URL
confluenceurl: https://searchblox.atlassian.net/wiki
#The excluded formats won't be indexed
exclude-formats: [.war,.zip,.tar.gz]
#The excluded folders won't be indexed. Note: no trailing or leading slashes. E.g.: test/searchblox
exclude-folders: [SBTES,SC,sd,SCON]
#Maximum size of the static folder, in MB, after which it is swept
max-folder-size: 2
#Megabytes after which a new log file is created
log-file-maxSize: 10
#Number of backups after which the log file should be deleted
log-file-maxBackups: 10
#Number of days after which log files should be deleted
log-file-maxAge: 30
#searchblox servlet url for auto delete functionality
servlet-url: http://localhost:8080/searchblox/servlet/SearchServlet
#searchblox delete api url for auto delete functionality
delete-api-url: http://localhost:8080/searchblox/api/rest/docdelete
#Mention time zone code as per your confluence host
customer-confluence-timezone: EST
  • Run confluence_refresh.exe on Windows, or ./confluence_refresh_linux32 or ./confluence_refresh_linux on Linux.
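
Because the refresh connector depends on the last_run_date_time.yml file written by the fresh run, a small wrapper can guard against starting it too early. This is a sketch only: `run_refresh_if_ready` is a hypothetical helper, and running the executable from inside the connector folder reflects an assumption that the connector looks for its yml files there.

```shell
# Start the refresh connector only if a fresh run has already produced
# last_run_date_time.yml next to the executables.
# $1: connector folder
# $2: name of the refresh executable inside that folder
run_refresh_if_ready() {
    dir=$1
    exe=$2
    if [ ! -f "$dir/last_run_date_time.yml" ]; then
        echo "last_run_date_time.yml not found in $dir; run the fresh connector first" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dir" && "./$exe" )
}
```

Pass the folder you extracted the connector into and the name of the refresh executable for your platform.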