XML Connector

The XML Connector indexes the content of XML files by parsing the information within the tags of the XML page or file. Header tags are treated as meta tags in SearchBlox, and body information becomes the description/content of the SearchBlox document.

Configuring SearchBlox

Before using the XML Connector, install SearchBlox and create a Custom Collection.


Configuring XML Connector

  • All files related to the connector must be extracted into the same folder.
  • For the connector to work, the XML file must contain record tags. Each record tag specifies the URL to be indexed in its url attribute, and any other metadata is given as meta tags inside a metadata tag, as in the following format:
<?xml version="1.0"?>
<gsafeed>
	<header>
		<datasource>test</datasource>
		<feedtype>metadata-and-url</feedtype>
	</header>
	<group>
		<record url="https://www.example.com/ethical_hacking/ethical_hacking_tutorial.pdf">
			<metadata>
				<meta content="FOU" name="category" />
				<meta content="2018" name="applicable-reports" />
				<meta content="2018" name="display-evaluators" />
				<meta content="1d" name="criteria-2018-1" />
				<meta content="2018" name="show-on-www" />
			</metadata>
		</record>
		<record url="https://www.example.com/cprogramming/cprogramming_tutorial.pdf">
			<metadata>
				<meta content="SSV" name="category" />
				<meta content="2014" name="applicable-reports" />
				<meta content="2014" name="display-evaluators" />
				<meta content="3c" name="criteria-2014-3" />
			</metadata>
		</record>
		<record url="https://www.example.com/android/android_tutorial.pdf">
			<metadata>
				<meta content="SSV" name="category" />
				<meta content="2014" name="applicable-reports" />
				<meta content="3c" name="criteria-2014-3" />
			</metadata>
		</record>
		<record url="https://www.searchblox.com/">
			<metadata>
				<meta content="SSV" name="category" />
				<meta content="2014" name="applicable-reports" />
				<meta content="3c" name="criteria-2014-3" />
			</metadata>
		</record>
	</group>
</gsafeed>
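
To check that a feed file follows this layout before handing it to the connector, a short script can list each record URL with its metadata. The following is a minimal sketch using Python's standard library; the function name read_records is hypothetical and is not part of the connector, and the connector's own parsing logic may differ.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample feed shown above.
SAMPLE_FEED = """<?xml version="1.0"?>
<gsafeed>
  <header>
    <datasource>test</datasource>
    <feedtype>metadata-and-url</feedtype>
  </header>
  <group>
    <record url="https://www.example.com/ethical_hacking/ethical_hacking_tutorial.pdf">
      <metadata>
        <meta content="FOU" name="category" />
        <meta content="2018" name="applicable-reports" />
      </metadata>
    </record>
    <record url="https://www.searchblox.com/">
      <metadata>
        <meta content="SSV" name="category" />
        <meta content="2014" name="applicable-reports" />
      </metadata>
    </record>
  </group>
</gsafeed>"""


def read_records(xml_text):
    """Return one dict per <record>: its url attribute and its meta name/content pairs."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter("record"):
        # Each <meta> carries a name attribute and a content attribute.
        meta = {m.get("name"): m.get("content") for m in record.iter("meta")}
        records.append({"url": record.get("url"), "meta": meta})
    return records


if __name__ == "__main__":
    for rec in read_records(SAMPLE_FEED):
        print(rec["url"], rec["meta"])
```

A record with a missing url attribute shows up here with a url of None, which makes malformed entries easy to spot before indexing.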

Contact [email protected] to request the download link for the SearchBlox XML connector. The following steps include example paths for both Windows and Linux. On Windows, the connector is installed on the C drive. On Linux, the connector must be installed under /opt.

Steps to Configure and Run the XML Connector

  • Download the SearchBlox XML connector. Extract the downloaded zip to a folder.
  • Unzip the archive under C:* or /opt*.

🚧

Note:

On Linux, use the chmod command to give the /opt folder the permissions needed for writing log files and executing jar files.

  • Configure the xmlconfig.yml file, which holds the path of the XML file and the SearchBlox properties listed below:
Setting               Description
api-key               SearchBlox API key
colname               Name of the custom collection in SearchBlox
url                   SearchBlox URL
data-directory        Data folder, including the file name, of the XML file from which data is fetched
log-file-maxSize      Size in megabytes after which a new log file is created
log-file-maxBackups   Number of backups after which log files are deleted
log-file-maxAge       Number of days after which log files are deleted
  • The contents of xmlconfig.yml are shown here:
#SearchBlox API Key
api-key: 4CF0CDE62976F33C4BF290A2182E8377
#The name of the collection
colname: xml
#SearchBlox URL
url: http://localhost:8080/searchblox/rest/v2/api/
#Data Folder of xml from where the data needs to be fetched
data-directory: C:\CONNECTORS\xmlconnector\selfstudy-metadata.xml
#Megabytes after which a new log file is created
log-file-maxSize: 10
#number of backups after which log file should be deleted
log-file-maxBackups: 10
#Number of days after which log files should be deleted
log-file-maxAge: 30
  • Run xmlconnector.exe on Windows or ./xml_connectorLinux64 on Linux.
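
Before starting the connector, it can help to confirm that xmlconfig.yml contains the expected settings. The sketch below reads the flat key: value layout shown above with plain string handling rather than a YAML library. The names parse_flat_config, missing_keys, and REQUIRED_KEYS are hypothetical, and treating these four keys as mandatory is an assumption, not documented connector behavior.

```python
# Assumption: these four settings are the ones the connector cannot run without.
REQUIRED_KEYS = ["api-key", "colname", "url", "data-directory"]

# Copy of the sample configuration; a raw string keeps the Windows backslashes intact.
SAMPLE_CONFIG = r"""#SearchBlox API Key
api-key: 4CF0CDE62976F33C4BF290A2182E8377
#The name of the collection
colname: xml
#SearchBlox URL
url: http://localhost:8080/searchblox/rest/v2/api/
#Data Folder of xml from where the data needs to be fetched
data-directory: C:\CONNECTORS\xmlconnector\selfstudy-metadata.xml
log-file-maxSize: 10
log-file-maxBackups: 10
log-file-maxAge: 30
"""


def parse_flat_config(text):
    """Parse 'key: value' lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so URLs and drive letters stay whole.
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def missing_keys(settings):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not settings.get(key)]


if __name__ == "__main__":
    config = parse_flat_config(SAMPLE_CONFIG)
    print("Missing settings:", missing_keys(config))
```

To check a real file, read its text with open(path).read() and pass it to parse_flat_config in place of SAMPLE_CONFIG.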