XML Data Source
The XML Data Source indexes the content of XML files by parsing the information inside the tags of each XML page or file. Header tags are treated as meta tags in SearchBlox, and body content becomes the description/content of the SearchBlox document.
Configuring SearchBlox
Before using the XML Data Source, install SearchBlox and create a Custom Collection.
Configuring XML Data Source
- Download the SearchBlox Connector UI. Extract the downloaded zip to a folder.
Contact [email protected] to request the download link for the SearchBlox Connector UI. The following steps give example paths for both Windows and Linux. On Windows, the connector is installed on the C drive.
- Unzip the archive under C:* or /opt*.
- Create a data folder on your drive where the files will be stored temporarily.
- After creating a data source in the connector UI, configure the properties listed in the table below.
- For the connector to work, the XML file must use the following format: each URL to be indexed is given in the url attribute of a record tag, and the other metadata for that URL is given as meta tags inside a metadata tag.
<?xml version="1.0"?>
<gsafeed>
  <header>
    <datasource>test</datasource>
    <feedtype>metadata-and-url</feedtype>
  </header>
  <group>
    <record url="https://www.example.com/ethical_hacking/ethical_hacking_tutorial.pdf">
      <metadata>
        <meta content="FOU" name="category" />
        <meta content="2018" name="applicable-reports" />
        <meta content="2018" name="display-evaluators" />
        <meta content="1d" name="criteria-2018-1" />
        <meta content="2018" name="show-on-www" />
      </metadata>
    </record>
    <record url="https://www.example.com/cprogramming/cprogramming_tutorial.pdf">
      <metadata>
        <meta content="SSV" name="category" />
        <meta content="2014" name="applicable-reports" />
        <meta content="2014" name="display-evaluators" />
        <meta content="3c" name="criteria-2014-3" />
      </metadata>
    </record>
    <record url="https://www.example.com/android/android_tutorial.pdf">
      <metadata>
        <meta content="SSV" name="category" />
        <meta content="2014" name="applicable-reports" />
        <meta content="3c" name="criteria-2014-3" />
      </metadata>
    </record>
    <record url="https://www.searchblox.com/">
      <metadata>
        <meta content="SSV" name="category" />
        <meta content="2014" name="applicable-reports" />
        <meta content="3c" name="criteria-2014-3" />
      </metadata>
    </record>
  </group>
</gsafeed>
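Before handing a feed file to the connector, it can help to confirm that every record carries a url attribute and to see which meta names each record defines. The following is a minimal Python sketch of such a check, not part of SearchBlox or the connector; the function name read_records and the shortened sample feed (including its deliberately broken second record) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the feed format shown above; the second record
# deliberately omits the url attribute to demonstrate the check.
SAMPLE_FEED = """<?xml version="1.0"?>
<gsafeed>
  <header>
    <datasource>test</datasource>
    <feedtype>metadata-and-url</feedtype>
  </header>
  <group>
    <record url="https://www.example.com/android/android_tutorial.pdf">
      <metadata>
        <meta content="SSV" name="category" />
        <meta content="2014" name="applicable-reports" />
      </metadata>
    </record>
    <record>
      <metadata>
        <meta content="SSV" name="category" />
      </metadata>
    </record>
  </group>
</gsafeed>
"""


def read_records(xml_text):
    """Return (records, problems) for a feed in the format shown above."""
    root = ET.fromstring(xml_text)
    records, problems = [], []
    for index, record in enumerate(root.iter("record"), start=1):
        url = record.get("url")
        if not url:
            # A record without a URL gives the connector nothing to index.
            problems.append(f"record {index} has no url attribute")
            continue
        # Collect the name/content pairs of the meta tags inside this record.
        meta = {m.get("name"): m.get("content") for m in record.iter("meta")}
        records.append({"url": url, "meta": meta})
    return records, problems


if __name__ == "__main__":
    found, issues = read_records(SAMPLE_FEED)
    for item in found:
        print(item["url"], item["meta"])
    for issue in issues:
        print("problem:", issue)
```

To check a real feed, read the file contents and pass them to read_records in place of SAMPLE_FEED.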
Note:
On Linux, make sure the /opt folder has the permissions needed to write log files and execute jar files; use the chmod command to grant them.
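One way to see whether chmod is needed at all is to ask the operating system which permissions the current user already has on the folder. The sketch below is a hedged illustration using Python's standard library; the helper name missing_permissions is hypothetical, and a temporary folder stands in for the real connector folder.

```python
import os
import tempfile


def missing_permissions(folder):
    """Return the names of the permissions the current user lacks on folder."""
    checks = (("read", os.R_OK), ("write", os.W_OK), ("execute", os.X_OK))
    return [name for name, flag in checks if not os.access(folder, flag)]


if __name__ == "__main__":
    # A temporary folder stands in for the connector folder (for example /opt).
    with tempfile.TemporaryDirectory() as folder:
        print(missing_permissions(folder))
```

An empty list means the user running the check can read, write, and enter the folder; any names returned point to the permissions that still need to be granted.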
Property | Description |
api-key | SearchBlox API key |
colname | Name of the custom collection in SearchBlox |
url | SearchBlox URL |
data-directory | Path to the data folder, including the file name of the XML file from which data is fetched |
log-file-maxSize | Size in megabytes after which a new log file is created |
log-file-maxBackup | Number of backups after which the log file is deleted |
log-file-maxAge | Number of days after which log files are deleted |
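The properties above can be sanity-checked before they are entered in the connector UI. The following Python sketch is only an illustration under stated assumptions: the connector's real storage format and validation rules are not described here, so the dictionary layout, the function name check_properties, the example values, and the treatment of the log settings as optional positive whole numbers are all hypothetical.

```python
# Property names come from the table above; everything else is assumed.
REQUIRED = ("api-key", "colname", "url", "data-directory")
NUMERIC = ("log-file-maxSize", "log-file-maxBackup", "log-file-maxAge")


def check_properties(props):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for key in REQUIRED:
        if not str(props.get(key, "")).strip():
            problems.append(f"{key} is missing or empty")
    for key in NUMERIC:
        value = props.get(key)
        if value is None:
            continue  # assumption: log settings may be left unset
        if not str(value).isdigit() or int(value) <= 0:
            problems.append(f"{key} should be a positive whole number")
    # data-directory must include the XML file name, not only the folder.
    data_path = str(props.get("data-directory", ""))
    if data_path and not data_path.lower().endswith(".xml"):
        problems.append("data-directory should end with the XML file name")
    return problems


if __name__ == "__main__":
    example = {
        "api-key": "YOUR-API-KEY",                 # placeholder value
        "colname": "xml-feed",                     # hypothetical collection name
        "url": "https://searchblox.example.com",   # placeholder URL
        "data-directory": "/opt/data/feed.xml",    # hypothetical path
        "log-file-maxSize": "10",
        "log-file-maxBackup": "5",
        "log-file-maxAge": "30",
    }
    print(check_properties(example))
```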