SearchBlox

SearchBlox Developer Documentation

CSV Collection

A CSV collection is used to index records from a CSV file.

Creating CSV Collection

You can create a CSV collection with the following steps.

  • After logging in to the Admin Console, click the Add Collection button. The Add Collection screen is displayed.
  • Enter a unique name for your collection (for example, CSV).
  • Select the CSV collection radio button.
  • Click Add to create the collection.

CSV Collection Settings

  • The Settings sub-tab holds the CSV settings and the tunable search parameters for the collection.
  • CSV setting values must be set explicitly for each CSV collection.
  • The mandatory settings for a CSV collection are:
    • Folder
    • Unique Field
  • A unique field from the CSV file must be mapped in the CSV collection settings. All records in the CSV file are indexed only if the mapped field has a unique value in every row.
  • When a new collection is created, SearchBlox also pre-configures a few additional parameters, which can be modified as required.
  • The following table lists the settings available for a CSV collection:

| Field | Description |
| --- | --- |
| Folder | The folder path where the CSV file is available. |
| Poll | Polling interval after which the collection is re-indexed. Enter a numerical value; the value is in minutes. |
| Field Separator | The character that separates fields in the CSV file. The default value is a comma (“,”). |
| Escape Character | The escape character. The default value is “;”. |
| Quote Character | The quote character. The default value is a single quote (“’”). |
| Use first record as header | Check this box if the first record in the CSV file should be treated as the header. |
| Unique Field | The name of the CSV column that has a unique value in each row. This value is essential for indexing and searching the records from the CSV file. |
| Keyword-in-Context Display | Returns search results with the description taken from the content areas where the search term occurs. |
| Boosting | Boost search terms for the collection by setting a value greater than 1 (maximum value 9999). |
| Stemming | When stemming is enabled, inflected words are reduced to a root form. For example, "running", "runs", and "ran" are inflected forms of "run". |
| Spelling Suggestions | When enabled, a spelling index is created at the end of the indexing process. |

❗️

Important Note:

If the Unique Field values are not unique, the number of results in the CSV collection will not match the number of records in the CSV file, because all records are indexed only when the mapped field is unique.
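
Because the Unique Field requirement is easy to miss, it can help to check a CSV file for repeated values before indexing it. The following is a minimal sketch using only the Python standard library, not part of SearchBlox. The function name `find_duplicate_keys`, the sample data, and the column name `id` are hypothetical; the `delimiter` and `quotechar` defaults mirror the default Field Separator and Quote Character settings described above.

```python
import csv
import io
from collections import Counter


def find_duplicate_keys(csv_text, unique_field, delimiter=",", quotechar="'"):
    """Return {value: count} for values of unique_field that occur more than once."""
    reader = csv.DictReader(
        io.StringIO(csv_text), delimiter=delimiter, quotechar=quotechar
    )
    # The first record is treated as the header, as with "Use first record as header".
    if reader.fieldnames is None or unique_field not in reader.fieldnames:
        raise KeyError(f"column {unique_field!r} not found in header")
    counts = Counter(row[unique_field] for row in reader)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical sample data: the id value 1 appears twice.
sample = "id,team\n1,SFN\n2,CHN\n1,SEA\n"
print(find_duplicate_keys(sample, "id"))  # {'1': 2}
```

An empty result suggests the column is safe to map as the Unique Field. To check a file on disk, read it with `open(path, newline="").read()` and pass the text to the function.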

Indexing and Other Operations

The following operations can be performed on a CSV collection.

| Operation | Description |
| --- | --- |
| Index | Starts the indexer for the selected collection. |
| Clear | Clears the current index for the selected collection. |
| Scheduled Activity | For each collection, either of the following scheduled indexer activities can be set: Index (set the frequency and the start date/time for indexing a collection) or Clear (set the frequency and the start date/time for clearing a collection). |
  • Indexer activity is controlled from the Index sub-tab of the collection or through the API. The current status of the collection is always shown on the Collection Dashboard and on the Index page.
  • The Index operation starts the indexer for the CSV collection.
  • On reindexing, that is, clicking Index again after the initial index operation, all records are reindexed. Records that have been deleted from the CSV file since the first index operation are removed from the index, and new records are indexed.
  • The Index operation can also be initiated from the Collection Dashboard.
  • Scheduling can be performed only from the Index sub-tab.

Viewing Search Results for CSV Collections

  • Customized facets can be added to the index.html page. A facet can be any field in the indexed CSV data that can be used to filter results.
  • The results can also be viewed in JSON format by clicking a CSV search result in a regular search at http://localhost:8080/searchblox/search.jsp. For example:
[
  {
    "keywords":" 10 11 20 0 0 1 0 61  8   SFN 1 2 0 0 5 5 32 0 1 NL 0 0 8 2004 0.41 6.75 aardsda01",
    "description":" 10 11 20 0 0 1 0 61  8   SFN 1 2 0 0 5 5 32 0 1 NL 0 0 8 2004 0.41 6.75 aardsda01",
    "created_at":"2015-09-14T05:04:03.345Z",
    "_autocomplete":" 10 11 20 0 0 1 0 61  8   SFN 1 2 0 0 5 5 32 0 1 NL 0 0 8 2004 0.41 6.75 aardsda01",
    "source":"{\"BB\":\"10\",\"G\":\"11\",\"H\":\"20\",\"IBB\":\"0\",\"BK\":\"0\",\"HR\":\"1\",\"L\":\"0\",\"BFP\":\"61\",\"GIDP\":\"\",\"R\":\"8\",\"SF\":\"\",\"SH\":\"\",\"teamID\":\"SFN\",\"W\":\"1\",\"HBP\":\"2\",\"WP\":\"0\",\"SHO\":\"0\",\"SO\":\"5\",\"GF\":\"5\",\"IPouts\":\"32\",\"SV\":\"0\",\"stint\":\"1\",\"lgID\":\"NL\",\"CG\":\"0\",\"GS\":\"0\",\"ER\":\"8\",\"yearID\":\"2004\",\"BAOpp\":\"0.41\",\"ERA\":\"6.75\",\"playerID\":\"aardsda01\"}",
    "title":"1",
    "content":" 10 11 20 0 0 1 0 61  8   SFN 1 2 0 0 5 5 32 0 1 NL 0 0 8 2004 0.41 6.75 aardsda01",
    "contenttype":"csv",
    "uid":"1"
  }
]
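
In the sample above, the source field is itself a JSON-encoded string that holds the original CSV columns, so it has to be decoded a second time before individual columns can be read. The following is a minimal sketch in Python, assuming the JSON text has already been retrieved; the function name `extract_csv_records` and the shortened response are hypothetical.

```python
import json


def extract_csv_records(response_text):
    """Decode the nested "source" string of each CSV hit into a dict."""
    # strict=False tolerates raw line breaks inside string values,
    # in case the source value is wrapped across lines.
    hits = json.loads(response_text, strict=False)
    return [
        json.loads(hit["source"])
        for hit in hits
        if hit.get("contenttype") == "csv"
    ]


# Shortened, hypothetical response in the same shape as the sample above.
response_text = r'[{"title":"1","contenttype":"csv","uid":"1","source":"{\"teamID\":\"SFN\",\"yearID\":\"2004\",\"ERA\":\"6.75\"}"}]'

for record in extract_csv_records(response_text):
    # Column values arrive as strings, so convert them before doing arithmetic.
    print(record["teamID"], int(record["yearID"]), float(record["ERA"]))
# prints: SFN 2004 6.75
```

Empty CSV cells appear as empty strings (for example, GIDP in the sample), so numeric conversions should guard against them.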

👍

Best Practices

  • Ensure that the CSV file has a unique field, so that it can be mapped as the Unique Field in the collection settings and all records are indexed successfully.
  • If the CSV file uses quotes, ensure that every opening quote has a matching closing quote and that the quote character is correctly specified in the settings.
  • If there is an opening quote with no closing quote, remove the quote character.
  • Do not schedule two operations (Index, Clear) for the same time.
  • If you have multiple collections, stagger the scheduled activity so that no more than 2-3 collections are indexing at the same time.
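
The quote check in the list above can be approximated with a short script before indexing. This is a rough heuristic sketch, not a SearchBlox feature: it flags any line that contains an odd number of quote characters. The function name `lines_with_unbalanced_quotes` and the sample data are hypothetical, and the default `quotechar` assumes the single-quote default from the collection settings.

```python
def lines_with_unbalanced_quotes(csv_text, quotechar="'"):
    """Return 1-based line numbers whose count of quotechar is odd."""
    return [
        number
        for number, line in enumerate(csv_text.splitlines(), start=1)
        if line.count(quotechar) % 2 == 1
    ]


# Hypothetical sample data: line 3 opens a quote that is never closed.
sample = "id,name\n1,'Alice'\n2,'Bob\n3,Carol\n"
print(lines_with_unbalanced_quotes(sample))  # [3]
```

The check works line by line, so a quoted field that legitimately spans several lines, or an apostrophe inside a value, is also reported. Treat the output as a list of lines to inspect rather than as definite errors.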

📘

Setup ReactJS Search UI for CSV Datasets

Updated 15 days ago


