Content Indexed
This page describes the fields that SearchBlox indexes when crawling web content and how to configure custom meta fields for search filters and facets.
Indexed Fields
When SearchBlox crawls a web page, it stores all page content in a searchable content field. It also indexes the following predefined fields:
| Field | Description |
|---|---|
| Title | Page title |
| Description | Meta description of the page |
| Keywords | Meta keywords associated with the page |
| URL | Page URL |
| Last Modified | Date when the page was last modified |
| Size | Size of the document |
File-Specific Fields
The following fields are indexed for files such as PDFs that are discovered and indexed through a WEB Collection:
| Field | Description |
|---|---|
| Author | Author of the document |
| Doc_creation_date | Date when the document was created |
| Doc_modification_date | Date when the document was last modified |
Note: Map any additional fields you require in the mapping.json file.
Custom Meta Fields
Custom meta fields from your web pages are automatically indexed and can be used in search filters.
Example:
query=test&filter=custom:value
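As a minimal sketch of how client code might assemble such a filtered query string, the Python snippet below rebuilds the parameters from the example. The function name `build_filter_query` is hypothetical and not part of SearchBlox. The snippet produces only the query string, so the search endpoint URL it is appended to is left to your installation.

```python
from urllib.parse import urlencode


def build_filter_query(query: str, field: str, value: str) -> str:
    """Return a query string that filters results on one custom meta field."""
    params = {
        "query": query,
        # The filter takes the form <field>:<value>, as in the example above.
        "filter": f"{field}:{value}",
    }
    # safe=":" keeps the colon readable instead of encoding it as %3A.
    return urlencode(params, safe=":")


print(build_filter_query("test", "custom", "value"))
# prints: query=test&filter=custom:value
```

Building the string with `urlencode` rather than by hand means spaces and other reserved characters in the query or the field value are escaped consistently.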
To make a custom meta field searchable, map it to the content field with the copy_to parameter in the mapping.json file located in ../ROOT/WEB-INF:
"custom": {
  "type": "text",
  "store": true,
  "fielddata": true,
  "analyzer": "sb_analyzer",
  "copy_to": "content"
}
To use a custom meta field as a facet field, configure it without the copy_to parameter:
"custom": {
  "type": "text",
  "store": true,
  "fielddata": true,
  "analyzer": "sb_analyzer"
}
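The searchable and facet configurations differ only in the copy_to parameter. The Python sketch below makes that difference explicit by generating both forms of the entry from a single flag. The helper `custom_field_mapping` is hypothetical and not part of SearchBlox, and the sketch only prints the entry for a field named `custom`. Where the entry belongs inside mapping.json is not covered here, so paste the output into the file by hand.

```python
import json


def custom_field_mapping(searchable: bool) -> dict:
    """Build the mapping properties for one custom meta field.

    searchable=True adds copy_to so the value is also searched through the
    content field; searchable=False omits it, matching the facet example.
    """
    mapping = {
        "type": "text",
        "store": True,
        "fielddata": True,
        "analyzer": "sb_analyzer",
    }
    if searchable:
        mapping["copy_to"] = "content"
    return mapping


# Print the entry for a field named "custom" in both forms.
for searchable in (True, False):
    print(json.dumps({"custom": custom_field_mapping(searchable)}, indent=2))
```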
Important: After updating mapping.json for facet fields, you must clear and reindex the collection for the changes to take effect.
For more information, see *Custom Fields in Search*.
