Amazon S3 Collection

You can create an Amazon S3 collection by following the steps given below.

Creating Amazon S3 Collection

After logging in to the Admin Console, click the Add Collection button. The Add Collection screen is displayed.
Enter a unique name for your collection (for example, AmazonS3).
Select the Amazon S3 collection radio button.
Click Add to create the collection.

Amazon S3 Collection Settings

The Settings sub-tab holds the Amazon S3 connection settings and the tunable search parameters.
Amazon S3 settings must be set explicitly for each Amazon S3 collection.
The mandatory fields for an Amazon S3 collection are:
- Access key
- Secret key
- Bucket name
When a new collection is created, SearchBlox also pre-configures a few other Amazon S3 parameters, such as includes and excludes.
The following table lists the settings for an Amazon S3 collection:

Field	Description
Access Key	Access key from Amazon S3 security credentials. Mandatory field.
Secret Key	Secret key from Amazon S3 security credentials. Mandatory field.
Name	Optional name.
Bucket	Amazon S3 bucket to index. Mandatory field.
Path Prefix	Path prefix to index in this bucket, for example Work/. Optional. If specified, it must be an existing path and must include the trailing /.
Includes	File types to be included. Example: .pdf, .jpg.
Excludes	File types to be excluded. Example: *.zip.
Keyword-in-Context Display	When enabled, the description shown for each search result is taken from the content areas where the search term occurs.
Boosting	Boost search terms for the collection by setting a value greater than 1 (maximum value 9999).
Stemming	When stemming is enabled, inflected words are reduced to their root form. For example, "running", "runs", and "ran" are inflected forms of "run".
Spelling Suggestions	When enabled, a spelling index is created at the end of the indexing process.
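
The constraints in the table above can be checked before the values are entered in the Admin Console. The following is a minimal sketch only: the dictionary keys (`access_key`, `secret_key`, `bucket`, `path_prefix`, `boosting`) and the function name are hypothetical names chosen for illustration and are not part of any SearchBlox API.

```python
from typing import Any, Dict, List

# Hypothetical key names that mirror the settings table; not SearchBlox identifiers.
MANDATORY_FIELDS = ("access_key", "secret_key", "bucket")
MAX_BOOST = 9999  # maximum boosting value stated in the settings table


def validate_s3_settings(settings: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []

    # Access key, secret key, and bucket are mandatory.
    for field in MANDATORY_FIELDS:
        if not settings.get(field):
            problems.append(f"missing mandatory field: {field}")

    # The path prefix is optional, but must carry a trailing slash when given.
    prefix = settings.get("path_prefix")
    if prefix and not str(prefix).endswith("/"):
        problems.append("path_prefix must end with a trailing /")

    # Boosting is optional, but must not exceed the documented maximum.
    boost = settings.get("boosting")
    if boost is not None and boost > MAX_BOOST:
        problems.append(f"boosting must not exceed {MAX_BOOST}")

    return problems
```

This sketch cannot tell whether the path prefix actually exists in the bucket; that still has to be confirmed against S3 itself.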

📘
Additional Note

Avoid writing transaction logs to an S3 bucket that is being indexed, because those log files will also be indexed, increasing bandwidth usage.

If logging is needed, exclude the log files by extension in Collection Settings.
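
To preview which objects an exclude list would skip, something like the sketch below can be run against a list of object keys. It is an illustration only: the document does not state how SearchBlox matches exclude patterns, so the glob-style, case-insensitive matching used here (Python's `fnmatch`) is an assumption, and `filter_keys` is a hypothetical helper name.

```python
from fnmatch import fnmatch
from typing import Iterable, List


def filter_keys(keys: Iterable[str], excludes: Iterable[str]) -> List[str]:
    """Return the keys that match none of the exclude patterns.

    Assumption: patterns are shell-style globs such as "*.log" and are
    compared case-insensitively. SearchBlox's actual matching may differ.
    """
    patterns = [pattern.lower() for pattern in excludes]
    return [
        key
        for key in keys
        if not any(fnmatch(key.lower(), pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    sample_keys = ["Work/report.pdf", "logs/access.log", "Work/archive.ZIP"]
    for key in filter_keys(sample_keys, ["*.log", "*.zip"]):
        print(key)
```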

Indexing and Other Operations

The following operations can be performed on an Amazon S3 collection:

Activity	Description
Index	Starts the indexer for the selected collection.
Clear	Clears the current index for the selected collection.
Scheduled Activity	For each collection, the following indexer activities can be scheduled: Index (set the frequency and the start date/time for indexing the collection) and Clear (set the frequency and the start date/time for clearing the collection).

Indexer activity is controlled from the Index sub-tab of the collection or through the API. The current status of a collection is always shown on the Collection Dashboard and the Index page.
The Index operation starts the indexer for the Amazon S3 collection. It can also be initiated from the Collection Dashboard.
On reindexing (that is, clicking Index again after the initial index operation), all crawled documents are reindexed. Documents that have been deleted from S3 since the previous index operation are removed from the index, and new documents are indexed.
Scheduling can be performed only from the Index sub-tab.
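
The reindexing behavior described above amounts to reconciling the set of indexed documents with the current contents of the bucket. The sketch below illustrates that logic with plain Python sets; it is not SearchBlox's implementation, and `plan_reindex` and its argument names are hypothetical.

```python
from typing import Dict, Set


def plan_reindex(indexed_keys: Set[str], bucket_keys: Set[str]) -> Dict[str, Set[str]]:
    """Classify object keys by what a reindex would do with them."""
    return {
        # In the index but no longer in S3: removed from the index.
        "delete": indexed_keys - bucket_keys,
        # In S3 but not yet in the index: indexed for the first time.
        "add": bucket_keys - indexed_keys,
        # Present in both: reindexed.
        "reindex": indexed_keys & bucket_keys,
    }


if __name__ == "__main__":
    plan = plan_reindex(
        indexed_keys={"Work/a.pdf", "Work/b.pdf"},
        bucket_keys={"Work/b.pdf", "Work/c.pdf"},
    )
    for action, keys in plan.items():
        print(action, sorted(keys))
```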

Schedule Frequency

SearchBlox supports the following schedule frequencies:

Once
Every Minute
Hourly
Daily
Every 48 Hours
Every 96 Hours
Weekly
Monthly
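
To reason about when a scheduled activity would run next, the frequencies listed above can be mapped to time intervals. The sketch below is an assumption-laden illustration rather than SearchBlox's scheduler: `next_run` is a hypothetical helper, "Once" is treated as having no follow-up run, and "Monthly" is assumed to mean the same day of the following month, clamped to that month's last day.

```python
import calendar
from datetime import datetime, timedelta
from typing import Optional

# Frequencies from the list above that correspond to a fixed interval.
FIXED_INTERVALS = {
    "Every Minute": timedelta(minutes=1),
    "Hourly": timedelta(hours=1),
    "Daily": timedelta(days=1),
    "Every 48 Hours": timedelta(hours=48),
    "Every 96 Hours": timedelta(hours=96),
    "Weekly": timedelta(weeks=1),
}


def next_run(last_run: datetime, frequency: str) -> Optional[datetime]:
    """Return the next run time after last_run, or None when there is none."""
    if frequency == "Once":
        return None  # assumption: a one-off run is never repeated
    if frequency == "Monthly":
        # Assumption: same day next month, clamped to the month's length.
        year = last_run.year + last_run.month // 12
        month = last_run.month % 12 + 1
        day = min(last_run.day, calendar.monthrange(year, month)[1])
        return last_run.replace(year=year, month=month, day=day)
    # Raises KeyError for a frequency that is not in the supported list.
    return last_run + FIXED_INTERVALS[frequency]
```

Keeping the fixed intervals in a lookup table means an unsupported frequency fails loudly instead of being silently scheduled.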

👍
Best Practices

It is mandatory to provide the access key, secret key, bucket name, and update rate in the S3 collection settings.

It is possible to include or exclude file types using collection settings. Please use them to avoid indexing unnecessary file types.

Do not schedule index and clear operations for the same time.

If you have multiple collections, schedule their activities so that no more than 2-3 collections are indexing at the same time.
