# **Configuring SearchBlox**

Before installing the network crawler, install SearchBlox and create a [**Custom Collection**](🔗).


# **Installing the Network Crawler**

Contact [[email protected]](🔗) to get the download link for SearchBlox-network-crawler.

Download the latest version of SearchBlox-network-crawler. Extract the downloaded zip to `/opt/searchblox-network` on Linux or `C:/searchblox-network` on Windows.

# **Configuring SMB**

The extracted folder will contain a folder named /conf, which contains all the configurations needed for the crawler.

**Config.yml** This is the configuration file that is used to map SearchBlox to the network crawler. Edit the file in your favorite editor.

_apikey_: This is the API Key of your SearchBlox instance. You can find it in the Admin tab of the SearchBlox instance.

_colname_: Name of the custom collection which you created.

_colid_: The Collection ID of the collection you created. It can be found in the Collections tab near the collection name from the SearchBlox instance.

_url_: The URL of SearchBlox instance.

_sbpkey_: This is the SB-PKEY of your SearchBlox instance. You can find it in the Users tab of the SearchBlox instance. Only users with the Admin role are given an SB-PKEY, so create an admin user if you have not already done so.
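The five settings above can be checked before the crawler is started. The following is a minimal Python sketch, assuming the file holds flat `key: value` lines; the real layout of the file may differ, and the helper names `parse_flat_yaml` and `missing_settings` are illustrative, not part of SearchBlox.

```python
# Settings described above; the flat "key: value" layout is an assumption.
REQUIRED_KEYS = ("apikey", "colname", "colid", "url", "sbpkey")


def parse_flat_yaml(text):
    """Parse simple top-level `key: value` lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        # Split on the first colon only, so URL values keep their own colons.
        key, _, value = stripped.partition(":")
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def missing_settings(settings):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]
```

Calling `missing_settings(parse_flat_yaml(open(path).read()))` with the path to the file returns an empty list when every setting has a value.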

**searchblox.yml** This is the Elasticsearch configuration file that is used by SearchBlox network crawler. Edit the file in your favorite editor.

_searchblox.elasticsearch.url_: URL of Elasticsearch, including the port. Configure this setting if Elasticsearch is reached through an IP address or a domain name.

_searchblox.elasticsearch.host_: Hostname used for Elasticsearch.

_searchblox.elasticsearch.port_: Port used for Elasticsearch.

_searchblox.basic.username_: Username for Elasticsearch.

_searchblox.basic.password_: Password for Elasticsearch.

_es.home_: Path to the Elasticsearch home directory, which depends on the operating system you use. On Linux the path is `/opt/searchblox/elasticsearch`.
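Before starting the crawler, it can help to confirm that Elasticsearch accepts the URL, username and password entered in searchblox.yml. The following is a minimal Python sketch, assuming Elasticsearch uses HTTP Basic authentication and presents a certificate that Python trusts; the function names are illustrative and not part of SearchBlox.

```python
import base64
import urllib.request


def build_auth_request(url, username, password):
    """Build a GET request carrying an HTTP Basic Authorization header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def elasticsearch_reachable(url, username, password, timeout=10):
    """Return True if Elasticsearch answers the request with HTTP 200.

    urllib raises HTTPError for rejected credentials (for example 401)
    and URLError when the host cannot be reached.
    """
    request = build_auth_request(url, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status == 200
```

Pass the values of _searchblox.elasticsearch.url_, _searchblox.basic.username_ and _searchblox.basic.password_ to `elasticsearch_reachable`.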

**windowsshare.yml** Enter the domain server, authentication domain, username, password, folder path, disallow path, allowed formats and recrawl interval in C:/searchblox-network/conf/windowsshare.yml. The file can also hold details for more than one server, or for more than one path on the same server.

You can find the details in the content of the file as shown here.

# **Starting the Crawler**

The crawler can be started with start.sh in Linux and start.bat in Windows. The crawler starts in the background, but you can see the logs in the logs folder.


  • You can run only one network crawler at a time. If you need to crawl different paths or different servers, enter all of them in the windowsshare.yml file of the same network crawler.

  • To re-run the crawler in another collection, delete the sb_network index using a tool that can communicate with Elasticsearch.

  • The network connector has to be stopped manually.

  • If plain-text passwords are not allowed on your server, enable them by adding the following option in start.bat of the network connector: **-Djcifs.smb.client.disablePlainTextPasswords=false**

# **Searching Securely Using SearchBlox**

Enable Active Directory secure search under Search → Security settings as shown in the following. Secure Search works from your Active Directory configuration once you select the Secured Search checkbox and enter the required settings.

  • Select Enable Secured Search, configure LDAP, then test the connection.


  • Enter the Active Directory details

| Setting | Description |
| --- | --- |
| **LDAP URL** | LDAP URL that specifies the base search for the entries |
| **Search Base** | Search base for the Active Directory |
| **Username** | Admin username |
| **Password** | Password for the username |
| **Filter-Type** | Filter type, either default or document |
| **Enable document filter** | Enable this option to filter search results based on users |

  • Once you have set up security groups, log in with your AD credentials here: [https://localhost:8443/search](🔗)

# **Admin Access to File Share**

If the SMB file share is on another server on the same network and requires permission, run the SearchBlox server service with Admin access and enter your credentials. Files from the share are indexed successfully only when the service runs under an Admin account or an account that has access to the files.

Make sure to run the network crawler as Admin in a similar manner.


## How to increase memory in Network Connector

**For Windows** Open `<network_crawler_installationPath>/start.bat` and find the line `rem set JAVA_OPTS=%JAVA_OPTS% -Xms1G -Xmx1G`. Remove the leading `rem` so the line is no longer a comment, then replace 1G with 2G or 3G to allocate more RAM.

**For Linux** Open `<network_crawler_installationPath>/start.sh`, uncomment the following line, and replace 1G with a larger value to allocate more memory: `JAVA_OPTS="$JAVA_OPTS -Xms1G -Xmx1G"`

## Delete sb_network to rerun the crawler in another collection

To rerun the network crawler in another collection, delete the sb_network index using a tool that can communicate with Elasticsearch. Go to [https://localhost:9200/_cat/indices](🔗) and check whether you can view the sb_network index.
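The same check can be scripted. The following is a minimal Python sketch, assuming the `_cat/indices` listing is plain text with one index per line and whitespace-separated columns, and that the endpoint is reachable without extra authentication or certificate handling. The function names are illustrative.

```python
import urllib.request


def index_listed(cat_output, index_name="sb_network"):
    """Return True if index_name appears as a whole column value in the listing."""
    return any(index_name in line.split() for line in cat_output.splitlines())


def fetch_index_listing(base_url="https://localhost:9200", timeout=10):
    """Download the plain-text index listing from Elasticsearch."""
    url = base_url.rstrip("/") + "/_cat/indices"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

`index_listed(fetch_index_listing())` returns `True` while the sb_network index exists. Matching whole columns avoids a false positive on an index whose name merely contains `sb_network`.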


Postman can be used to access Elasticsearch.

Start Postman and create a request that deletes the index using the DELETE method, as shown here:


Look for `"acknowledged": true` in the response.
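If Postman is not available, the same request can be sent from a script. The following is a minimal Python sketch, assuming Elasticsearch listens on `https://localhost:9200`, that the index is removed with a DELETE request to the index name, and that no extra authentication or certificate handling is needed; adjust these to your installation. The function names are illustrative.

```python
import json
import urllib.request


def build_delete_request(base_url, index_name):
    """Build a DELETE request addressed to <base_url>/<index_name>."""
    url = base_url.rstrip("/") + "/" + index_name
    return urllib.request.Request(url, method="DELETE")


def is_acknowledged(response_body):
    """Return True if the JSON response body reports acknowledged: true."""
    return json.loads(response_body).get("acknowledged") is True


def delete_index(base_url="https://localhost:9200", index_name="sb_network", timeout=10):
    """Send the DELETE request and report whether it was acknowledged."""
    request = build_delete_request(base_url, index_name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_acknowledged(response.read().decode("utf-8"))
```

Deleting an index cannot be undone, so confirm the index name before calling `delete_index()`.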

Check [https://localhost:9200/_cat/indices](🔗); the **sb_network** index should no longer appear among the indices.

Rerun the crawler after making necessary changes to your config.yml.