Best Practices
This page covers recommended practices for setting up, configuring, and maintaining Database Collections in SearchBlox for reliable and efficient indexing.
Database Query
- Provide a valid SQL query that works in your database application. Read: SQL Query for Database Collection
- Always map a unique or primary key field to
uid. This helps prevent the creation of duplicate documents while reindexing. - Map a relevant field as
titleso search results display meaningful headings. - Map a field to
contentto ensure the full text is searchable. - If no unique field exists in your table, create a database view that generates a unique ID using a combination of fields and use that view in your
SELECTquery. - Keep your SQL query on a single line with no line breaks — line breaks can cause indexing failures.
- For external connectors like Salesforce or Gmail, write all SQL keywords (
SELECT,FROM,AS) in uppercase. - To index more than 100 rows from an external connector, include a
LIMITvalue in your query. Without it, only the first 100 rows are indexed.
Database Settings
- Enter valid settings for the database.
- Provide either the database settings or the full database URL string, depending on the selected database type.
- Test the SQL query and database settings by clicking Test SQL on the Database Collection Settings page. Ensure it is successful before indexing.
Important: Indexing will not work if the connection test has not passed. Never skip this step.
Indexing
- Set a schedule for indexing according to how often the collection should be reindexed.
- The minimum schedule interval is Daily. If indexing all records takes more than 24 hours, use a longer interval (e.g., every 2 or 4 days).
- After indexing completes, reindexing will follow the specified schedule.
- To check the status of records indexed please go to index.log in the path
<SEARCHBLOX_INSTALLATION_PATH>/webapps/ROOT/logsand look for entries under the database collection name.
Performance
- Use specific
SELECTqueries instead ofSELECT *where possible — fetching only the fields you need reduces indexing time and resource usage. - If you notice missing documents after a scheduled reindex, increase the SQL Settings → Fetch Size value in the collection Settings page. The default is 100 records per request.
- For large datasets, consider breaking the collection into smaller collections by category or date range to improve indexing speed and manageability.
Security
- Never store plain text passwords in SQL queries — use database user accounts with the minimum permissions needed for read access.
- For encrypted collections, ensure the correct users have sensitive access enabled before indexing. See Collection Encryption for details.
- Regularly review which users have access to each Database Collection, especially for collections containing sensitive or confidential data.
