SearchBlox

SearchBlox Developer Documentation

Amazon S3 Collection

You can create an Amazon S3 collection by following the steps below.

Creating Amazon S3 Collection

  • After logging in to the Admin Console, click the Add Collection button. The Add Collection screen is displayed.
  • Enter a unique name for your collection (for example, AmazonS3).
  • Select the Amazon S3 collection radio button.
  • Click Add to create the collection.

Amazon S3 Collection Settings

  • The Settings sub-tab holds the Amazon S3 settings and the tunable parameters for search.
  • Amazon S3 settings must be set explicitly in each Amazon S3 collection.
  • The mandatory fields for an Amazon S3 collection are:
    • Access key
    • Secret key
    • Bucket name
  • When a new collection is created, SearchBlox also pre-configures a few other Amazon S3 parameters, such as includes and excludes.
  • The following table lists the settings for an Amazon S3 collection:

| Field | Description |
| --- | --- |
| Access Key | Access key from Amazon S3 security credentials. Mandatory field. |
| Secret Key | Secret key from Amazon S3 security credentials. Mandatory field. |
| Name | Optional name. |
| Bucket | Amazon S3 bucket to index. Mandatory field. |
| Path Prefix | Path prefix to index in this bucket, for example Work/. Optional. If specified, it should be an existing path with a trailing /. |
| Includes | File types to be included, for example .pdf, .jpg. |
| Excludes | File types to be excluded, for example *.zip. |
| Keyword-in-Context Display | When enabled, search results show a description taken from the content areas where the search term occurs. |
| Boosting | Boost search terms for the collection by setting a value greater than 1 (maximum value 9999). |
| Stemming | When stemming is enabled, inflected words are reduced to their root form. For example, "running", "runs", and "ran" are inflected forms of "run". |
| Spelling Suggestions | When enabled, a spelling index is created at the end of the indexing process. |
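
The rules in the settings above (three mandatory fields, a path prefix that ends with /, and a boost capped at 9999) can be checked before the values are entered in the Admin Console. The following is a minimal sketch only: the `S3CollectionSettings` class and `validate_settings` function are hypothetical names for illustration and are not part of SearchBlox, and treating a boost of 1 as the neutral lower bound is an assumption.

```python
from dataclasses import dataclass


@dataclass
class S3CollectionSettings:
    """Hypothetical container mirroring a few fields from the settings list."""
    access_key: str = ""
    secret_key: str = ""
    bucket: str = ""
    path_prefix: str = ""  # optional, for example "Work/"
    boost: int = 1         # assumption: 1 means "no boost"


def validate_settings(s: S3CollectionSettings) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    # Access key, secret key, and bucket are the mandatory fields.
    for name in ("access_key", "secret_key", "bucket"):
        if not getattr(s, name).strip():
            problems.append(f"{name} is mandatory")
    # The path prefix is optional, but needs a trailing "/" when given.
    if s.path_prefix and not s.path_prefix.endswith("/"):
        problems.append("path_prefix must end with a trailing /")
    # The documented maximum boost value is 9999.
    if not 1 <= s.boost <= 9999:
        problems.append("boost must be between 1 and 9999")
    return problems
```

This check cannot confirm that the prefix actually exists in the bucket or that the credentials are valid; it only catches formatting mistakes early.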

📘

Additional Note

Do not write transaction logs to an S3 bucket that is being indexed, because those log files will also be indexed, which increases bandwidth usage. If logging is needed, exclude the log files by extension in the collection settings.
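
To preview which objects an exclude list would keep out of the index, the filtering can be approximated locally. This is a rough sketch that assumes glob-style, case-insensitive patterns such as `*.zip`; the exact matching rules SearchBlox applies are not described here, and `should_index` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def should_index(key, includes, excludes):
    """Approximate include/exclude filtering for one S3 object key.

    Assumption: patterns are globs matched case-insensitively against
    the whole key, and excludes take precedence over includes.
    """
    name = key.lower()
    if any(fnmatchcase(name, pattern.lower()) for pattern in excludes):
        return False
    # An empty include list is treated as "include everything".
    return not includes or any(
        fnmatchcase(name, pattern.lower()) for pattern in includes
    )


keys = ["Work/report.pdf", "logs/2024-01-01.log", "Work/archive.zip"]
kept = [k for k in keys if should_index(k, [], ["*.log", "*.zip"])]
print(kept)
```

With `*.log` in the exclude list, the log object is dropped before it would be fetched, which is the bandwidth saving the note describes.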

Indexing and Other Operations

The following operations can be performed on an Amazon S3 collection:

| Activity | Description |
| --- | --- |
| Index | Starts the indexer for the selected collection. |
| Clear | Clears the current index for the selected collection. |
| Scheduled Activity | For each collection, either of the following scheduled indexer activities can be set: Index (set the frequency and the start date/time for indexing a collection) or Clear (set the frequency and the start date/time for clearing a collection). |

  • Indexer activity is controlled from the Index sub-tab of the collection or through the API.
  • The Index operation starts the indexer for the Amazon S3 collection. It can also be initiated from the Collection Dashboard.
  • On reindexing (that is, clicking Index again after the initial index operation), all crawled documents are reindexed. Documents that have been deleted from S3 since the previous index operation are deleted from the index, and new documents are indexed.
  • The current status of a collection is always indicated on the Collection Dashboard and the Index page.
  • Scheduling can be performed only from the Index sub-tab.
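
The reindexing behavior described above amounts to comparing what is already in the index with what is currently in the bucket. The sketch below illustrates that comparison with plain sets; it is not how SearchBlox is implemented, and `plan_reindex` is a hypothetical name.

```python
def plan_reindex(indexed_keys, bucket_keys):
    """Classify object keys for a reindex run.

    indexed_keys: keys present in the index after the previous run.
    bucket_keys:  keys currently present in the S3 bucket.
    """
    indexed, current = set(indexed_keys), set(bucket_keys)
    return {
        # Deleted from S3 since the last run: removed from the index.
        "delete": sorted(indexed - current),
        # New in S3 since the last run: indexed for the first time.
        "add": sorted(current - indexed),
        # Still present: crawled and indexed again.
        "reindex": sorted(indexed & current),
    }


plan = plan_reindex(
    indexed_keys=["a.pdf", "b.pdf", "c.pdf"],
    bucket_keys=["b.pdf", "c.pdf", "d.pdf"],
)
print(plan)
```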

Schedule Frequency

SearchBlox supports the following schedule frequencies:

  • Once
  • Every Minute
  • Hourly
  • Daily
  • Every 48 Hours
  • Every 96 Hours
  • Weekly
  • Monthly
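
When planning schedules, it can help to work out when a given frequency would fire next. The sketch below maps the frequency names above to intervals. How SearchBlox itself computes the next run, particularly for Monthly, is not documented here, so the calendar-month rule is an assumption and `next_run` is a hypothetical helper.

```python
import calendar
from datetime import datetime, timedelta

# Frequencies with a fixed interval.
FIXED_INTERVALS = {
    "Every Minute": timedelta(minutes=1),
    "Hourly": timedelta(hours=1),
    "Daily": timedelta(days=1),
    "Every 48 Hours": timedelta(hours=48),
    "Every 96 Hours": timedelta(hours=96),
    "Weekly": timedelta(weeks=1),
}


def next_run(last_run, frequency):
    """Return the next run time, or None for a one-off schedule."""
    if frequency == "Once":
        return None
    if frequency == "Monthly":
        # Assumption: same day next month, clamped to the month's length.
        year = last_run.year + last_run.month // 12
        month = last_run.month % 12 + 1
        day = min(last_run.day, calendar.monthrange(year, month)[1])
        return last_run.replace(year=year, month=month, day=day)
    return last_run + FIXED_INTERVALS[frequency]


start = datetime(2024, 1, 31, 9, 0)
print(next_run(start, "Every 48 Hours"))
print(next_run(start, "Monthly"))
```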

👍

Best Practices

  • The access key, secret key, bucket name, and update rate must be provided in the S3 collection settings.
  • Use the includes and excludes in the collection settings to avoid indexing unnecessary file types.
  • Do not schedule index and clear operations for the same time.
  • If you have multiple collections, stagger their schedules so that no more than 2-3 collections are indexing at the same time.
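
One simple way to limit concurrent indexing is to assign start times in groups. The sketch below is illustrative only: it assumes every index run finishes within the chosen gap, the collection names are made up, and `stagger_start_times` is a hypothetical helper. The resulting times would still have to be entered by hand in each collection's Index sub-tab.

```python
from datetime import datetime, timedelta


def stagger_start_times(collections, first_start, gap, max_concurrent=2):
    """Assign start times so at most max_concurrent collections share a slot.

    Assumption: each index run completes within `gap`, so runs in
    different slots do not overlap.
    """
    schedule = {}
    for position, name in enumerate(collections):
        slot = position // max_concurrent
        schedule[name] = first_start + gap * slot
    return schedule


schedule = stagger_start_times(
    ["AmazonS3", "Website", "Intranet", "Files", "Archive"],
    first_start=datetime(2024, 1, 1, 1, 0),
    gap=timedelta(hours=1),
)
for name, start_time in schedule.items():
    print(name, start_time)
```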



