Amazon S3 Data Source

The Amazon S3 Data Source indexes documents and objects from an Amazon S3 bucket into a SearchBlox collection, making the S3 content searchable using SearchBlox.

Configuring SearchBlox

Before using the Amazon S3 Data Source, make sure SearchBlox is installed properly and create a Custom Collection.

Configuration details of Amazon S3 Data Source

Accessing Connector UI

encrypted-ak: Encrypted Amazon S3 access key (encrypter available in the downloaded archive).
encrypted-sk: Encrypted Amazon S3 secret key (encrypter available in the downloaded archive).
region: Region of your Amazon S3 instance.
data-directory: Folder where data will be stored. Make sure it has write permission.
api-key: SearchBlox API Key
colname: Name of the custom collection in SearchBlox.
queuename: Amazon S3 SQS parameter, required for updating documents after indexing.
private-buckets: Set to true to index content from private buckets.
public-buckets: Set to true to index content from public buckets.
url: SearchBlox URL
includebucket: List of S3 buckets to include.
include-formats: File formats to include.
expiring-url: Expiring URLs in search results (default is expiring URL).
expire-time: Expire time for URLs (default is 300 minutes).
permanent-url: Permanent URLs in search results (expiring URLs are preferred if both exist).
max-folder-size: Maximum size of a folder (in MB) before it is swept.
servlet url & delete-api-url: Make sure the port number is correct; for example, use 8443 if SearchBlox runs on port 8443.
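
Two of the notes above (a writable data-directory and the correct port in the SearchBlox URLs) can be checked before starting the connector. The following is a minimal sketch, assuming the settings have already been loaded into a Python dict keyed by the parameter names listed above. The function name `check_connector_settings`, the `expected_port` argument, and the sample values are hypothetical and are not part of SearchBlox.

```python
import os
from urllib.parse import urlsplit


def check_connector_settings(settings, expected_port):
    """Return a list of problems found in a dict of connector settings.

    Hypothetical helper: it only covers the two checks the parameter
    notes call out, a writable data-directory and the port in the URLs.
    """
    problems = []

    # data-directory must exist and be writable by the current user.
    data_dir = settings.get("data-directory", "")
    if not (os.path.isdir(data_dir) and os.access(data_dir, os.W_OK)):
        problems.append(f"data-directory is not a writable folder: {data_dir!r}")

    # URLs that are present must point at the port SearchBlox listens on.
    for key in ("url", "delete-api-url"):
        value = settings.get(key)
        if value is None:
            continue
        port = urlsplit(value).port
        if port != expected_port:
            problems.append(f"{key} uses port {port}, expected {expected_port}")

    return problems


if __name__ == "__main__":
    # Sample values are placeholders, not real SearchBlox endpoints.
    sample = {
        "data-directory": os.getcwd(),
        "url": "https://localhost:8443/",
    }
    for problem in check_connector_settings(sample, expected_port=8443):
        print(problem)
```

An empty list means both checks passed; any other setting is left unchecked by this sketch.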

Steps to trigger notification when a document is updated in S3

  1. In the AWS Management Console, go to Services → Simple Queue Service (SQS) under Application Integration.
  2. Click Create New Queue, give the queue a name, and configure it.
  3. Set Message Visibility Timeout, Retention Days, and Receive Message Wait Time.
  4. Set Receive Message Wait Time to a value between 1 and 20 seconds to enable long polling.
  5. In the S3 console, select the bucket for which you want SQS notifications and go to Properties → Advanced Settings → Events.
  6. Click Add Notification, give it a name, and select the events for which you want notifications.
  7. Under Send To, select SQS Queue, enter the name of the queue created earlier, and click Save.
  8. From now on, whenever a document in the bucket changes, a notification is sent to SQS.
  9. Provide the queue names in the S3 connector YAML file to trigger indexing whenever a document is updated.
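
The console steps above can also be scripted. The following is a rough sketch using the boto3 library (a third-party package, assumed to be installed and configured with AWS credentials). The helper names `queue_attributes`, `bucket_notification`, and `create_queue_and_notify`, the default values, and the chosen event types are illustrative assumptions, not settings prescribed by SearchBlox.

```python
def queue_attributes(visibility_timeout_s=30, retention_days=4, wait_time_s=20):
    """Build the SQS attribute map for the three queue settings.

    SQS expects attribute values as strings and the retention period
    in seconds. A wait time of 1 to 20 seconds enables long polling.
    """
    if not 1 <= wait_time_s <= 20:
        raise ValueError("long polling needs a wait time between 1 and 20 seconds")
    return {
        "VisibilityTimeout": str(visibility_timeout_s),
        "MessageRetentionPeriod": str(retention_days * 24 * 60 * 60),
        "ReceiveMessageWaitTimeSeconds": str(wait_time_s),
    }


def bucket_notification(queue_arn, events=("s3:ObjectCreated:*", "s3:ObjectRemoved:*")):
    """Build the S3 notification configuration that sends events to the queue."""
    return {"QueueConfigurations": [{"QueueArn": queue_arn, "Events": list(events)}]}


def create_queue_and_notify(bucket, queue_name, region):
    """Create the queue, then attach the bucket notification (makes AWS calls)."""
    import boto3  # imported here so the helpers above work without boto3

    sqs = boto3.client("sqs", region_name=region)
    queue_url = sqs.create_queue(
        QueueName=queue_name, Attributes=queue_attributes()
    )["QueueUrl"]
    queue_arn = sqs.get_queue_attributes(
        QueueUrl=queue_url, AttributeNames=["QueueArn"]
    )["Attributes"]["QueueArn"]

    # Note: S3 rejects this call unless the queue's access policy allows
    # the bucket to send messages; this sketch does not set that policy.
    s3 = boto3.client("s3", region_name=region)
    s3.put_bucket_notification_configuration(
        Bucket=bucket, NotificationConfiguration=bucket_notification(queue_arn)
    )
    return queue_url
```

The two builder functions are pure and can be inspected without an AWS account. Only `create_queue_and_notify` contacts AWS, and the queue name passed to it is the one to list in the S3 connector YAML file.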