GitHub Connector

Configuring SearchBlox

Before using the GitHub Connector, install and set up SearchBlox, then create a Custom Collection in SearchBlox.


Configuring GitHub Connector

  • Extract all the files related to the connector into the same folder.
  • Create a data folder on your drive where files are stored temporarily, and specify its path in the yml file (the data-directory property).
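
The data folder requirement above can be checked before the connector is started. The following is a minimal Python sketch and is not part of the connector: the helper name `ensure_data_folder` and the example location are assumptions made for illustration. It creates the folder if it is missing and confirms that a file can actually be written there.

```python
import os
import tempfile


def ensure_data_folder(path):
    """Create the folder if needed; return True if a file can be written in it."""
    try:
        os.makedirs(path, exist_ok=True)
        # Writing a throwaway file is a more direct probe than reading permission bits.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Hypothetical location; use the folder you set as data-directory.
    folder = os.path.join(tempfile.gettempdir(), "github-connector-data")
    print(folder, "writable:", ensure_data_folder(folder))
```

The probe file is deleted as soon as it is closed, so the folder is left empty.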

Contact [email protected] to request the download link for the SearchBlox GitHub connector. The following steps include example paths for both Windows and Linux. On Windows, the connector is installed on the C drive. On Linux, the connector has to be installed in /opt.

Steps to Configure and Run the GitHub Connector

  • Download the SearchBlox GitHub connector.
  • Extract the downloaded zip under C:\ (Windows) or /opt (Linux).



On Linux, make sure the necessary permissions have been granted on the /opt folder with the chmod command, so that log files can be written and jar files can be executed.
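
The permission check described above can also be scripted. This is a minimal Python sketch for Linux, under the assumption that the connector runs as the current user; the function names are hypothetical and the connector itself does not provide them. The first helper reports whether the user may create files in a folder, and the second adds the owner execute bit to a file, which is roughly what `chmod u+x` does.

```python
import os
import stat


def can_write_and_enter(folder):
    """Return True if the current user may create files in and traverse the folder."""
    return os.access(folder, os.W_OK | os.X_OK)


def add_owner_execute(path):
    """Add the owner execute bit to a file and return the resulting permission bits."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR)
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    # /opt is the install location named in this guide.
    print("/opt usable:", can_write_and_enter("/opt"))
```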

  • Configure the githubconnector.yml file, which contains the GitHub properties and SearchBlox properties listed below:

Property          Description
username          User name in GitHub.
password          Password in GitHub.
data-directory    Data folder where the data is stored. Make sure it has write permission.
api-key           SearchBlox API key.
colname           Name of the custom collection in SearchBlox.
url               SearchBlox URL.
githuburl         GitHub URL.
public-repos      Set to true if public repos are to be indexed, otherwise false. Note that with the value true, all public repos would start to get indexed.
exclude-repo      Repos to exclude.
include-users     User repos to include.
include-orgs      Organization repos to include.
exclude-files     Files in GitHub that should not be indexed.
exclude-formats   File formats to exclude in GitHub.
exclude-folders   Folders to exclude in GitHub.
max-folder-size   Maximum size, in MB, of the static folder, after which it is swept.
servlet-url and delete-api-url   Make sure the port number is correct. If SearchBlox runs on port 8080, the URLs in the sample file are already correct.

Provide the values that match your requirements. Use a code editor (for example, Notepad++) when editing the yml file.
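
Because a wrong port in one of the URL properties is easy to overlook, a small check can help. The following Python sketch is an illustration only: the function name `mismatched_ports` is hypothetical, and the settings are passed in as a plain dictionary rather than read from the yml file. It reports which of the url, servlet-url and delete-api-url values use a port other than the one SearchBlox is expected to run on.

```python
from urllib.parse import urlsplit


def mismatched_ports(settings, expected_port):
    """Return the names of URL settings whose port differs from expected_port."""
    bad = []
    for name in ("url", "servlet-url", "delete-api-url"):
        value = settings.get(name)
        if value is None:
            continue
        parts = urlsplit(value)
        # Fall back to the scheme default when the URL has no explicit port.
        port = parts.port or {"http": 80, "https": 443}.get(parts.scheme)
        if port != expected_port:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # Values copied from the sample githubconnector.yml in this guide.
    settings = {
        "url": "http://localhost:8080/searchblox/rest/v2/api/",
        "servlet-url": "http://localhost:8080/searchblox/servlet/SearchServlet",
        "delete-api-url": "http://localhost:8080/searchblox/api/rest/docdelete",
    }
    print(mismatched_ports(settings, 8080))
```

An empty list means every URL that is present uses the expected port.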

  • A sample githubconnector.yml is shown here:
#User credentials
username: gauthamiv
password: 1at08cs112
#Data folder where the data needs to be stored. Make sure it has write permission
data-directory: E:\GoWorkspace\searchblox\src\sbgoclient\examples\gitHubConnector
#SearchBlox API Key
api-key: 03E2089E0E3D7580788B6E7DB3404305
#The name of the collection
colname: github
#SearchBlox URL
url: http://localhost:8080/searchblox/rest/v2/api/
#github url
#Public repo
public-repos: false
#Repos to exclude
exclude-repo: ["grit","merb-core","rubinius"]
#User repos to include
include-users: []
#Organization repos to include
include-orgs: []
#The excluded files won't be indexed
#The excluded formats won't be indexed
exclude-formats: [.pem,.key,.gif,.lib,.pdb,.dll,.sh]
#The excluded folders won't be indexed. Note: no trailing or leading slashes, e.g. test/searchblox
#exclude-folders: [Data Dictionary,Guest Home,Imap Attachments,IMAP Home,User Homes,Shared,swsdp]
servlet-url: http://localhost:8080/searchblox/servlet/SearchServlet
#Maximum size of static folder, in MB, after which it should be swept
max-folder-size: 2
delete-api-url: http://localhost:8080/searchblox/api/rest/docdelete
  • Run gitHubConnector.exe on Windows, or ./gitHubConnectorLinux32 or ./gitHubConnectorLinux64 on Linux, to start the connector.
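
Choosing the right binary for the final step can be sketched in Python as follows. This is only an illustration: the function name `connector_executable` is hypothetical, and reading the 32 and 64 suffixes as 32-bit and 64-bit builds is an assumption based on the file names.

```python
import platform
import sys


def connector_executable(system, is_64bit):
    """Map an OS name and word size to the connector binary named in this guide."""
    if system == "Windows":
        return "gitHubConnector.exe"
    if system == "Linux":
        return "./gitHubConnectorLinux64" if is_64bit else "./gitHubConnectorLinux32"
    raise ValueError(f"no connector binary documented for {system}")


if __name__ == "__main__":
    # sys.maxsize exceeds 2**32 only on a 64-bit Python build, which is used
    # here as a stand-in for the word size of the operating system.
    print(connector_executable(platform.system(), sys.maxsize > 2**32))
```

The returned name is meant to be run from the folder where the connector files were extracted.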