Best Practices

This page covers recommended practices for setting up, configuring, and maintaining Database Collections in SearchBlox for reliable and efficient indexing.

Database Query

  • Provide a valid SQL query that works in your database application. Read: SQL Query for Database Collection
  • Always map a unique or primary key field to uid. This helps prevent the creation of duplicate documents while reindexing.
  • Map a relevant field as title so search results display meaningful headings.
  • Map a field to content to ensure the full text is searchable.
  • If no unique field exists in your table, create a database view that generates a unique ID using a combination of fields and use that view in your SELECT query.
  • Keep your SQL query on a single line with no line breaks — line breaks can cause indexing failures.
  • For external connectors like Salesforce or Gmail, write all SQL keywords (SELECT, FROM, AS) in uppercase.
  • To index more than 100 rows from an external connector, include a LIMIT value in your query. Without it, only the first 100 rows are indexed.

Database Settings

  • Enter valid settings for the database.
  • Provide either the database settings or the full database URL string, depending on the selected database type.
  • Test the SQL query and database settings by clicking Test SQL on the Database Collection Settings page. Ensure it is successful before indexing.

Important: Indexing will not work if the connection test has not passed. Never skip this step.

Indexing

  • Set a schedule for indexing according to how often the collection should be reindexed.
  • The minimum schedule interval is Daily. If indexing all records takes more than 24 hours, use a longer interval (e.g., every 2 or 4 days).
  • After indexing completes, reindexing will follow the specified schedule.
  • To check the status of records indexed please go to index.log in the path<SEARCHBLOX_INSTALLATION_PATH>/webapps/ROOT/logs and look for entries under the database collection name.

Performance

  • Use specific SELECT queries instead of SELECT * where possible — fetching only the fields you need reduces indexing time and resource usage.
  • If you notice missing documents after a scheduled reindex, increase the SQL Settings → Fetch Size value in the collection Settings page. The default is 100 records per request.
  • For large datasets, consider breaking the collection into smaller collections by category or date range to improve indexing speed and manageability.

Security

  • Never store plain text passwords in SQL queries — use database user accounts with the minimum permissions needed for read access.
  • For encrypted collections, ensure the correct users have sensitive access enabled before indexing. See Collection Encryption for details.
  • Regularly review which users have access to each Database Collection, especially for collections containing sensitive or confidential data.