Using Lastmodified Date

Lastmodified Date in Web Collection

Users can specify the source of the "last modified" date for documents in a Web Collection.


There are three options:

  • Default
  • Meta
  • Custom Header

Default

The system reads the last modified date from the HTTP response header of the web page. This option is selected by default when you create a new Web Collection.

Meta

With this option, the system extracts the last modified date from a meta tag in the HTML document. It looks for a meta tag named "lastmodified" or "last-modified", so the web pages must include one of these tags with a valid date.

<meta name="lastmodified" content="Sat, 20 Jan 2018 04:30:15 GMT" />
or
<meta name="last-modified" content="Sat, 20 Jan 2018 04:30:15 GMT" />
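
To check that a page exposes a usable meta tag before selecting the Meta option, a short script can help. The following is a minimal Python sketch, not SearchBlox's own extraction code: the class and function names are hypothetical, and it assumes the date is written in the HTTP date form shown above and that the tag name is matched case-insensitively.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from html.parser import HTMLParser
from typing import Optional

# Meta tag names the Meta option looks for, per the text above.
META_NAMES = {"lastmodified", "last-modified"}


class LastModifiedMetaParser(HTMLParser):
    """Collects the content of the first matching meta tag (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.value: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are routed here as well.
        if tag != "meta" or self.value is not None:
            return
        attr_map = {key: (val or "") for key, val in attrs}
        # Case-insensitive name matching is an assumption of this sketch.
        if attr_map.get("name", "").lower() in META_NAMES:
            self.value = attr_map.get("content")


def meta_last_modified(html: str) -> Optional[datetime]:
    """Return the parsed date from the meta tag, or None if absent or malformed."""
    parser = LastModifiedMetaParser()
    parser.feed(html)
    if not parser.value:
        return None
    try:
        return parsedate_to_datetime(parser.value)
    except (TypeError, ValueError):
        return None
```

Calling meta_last_modified() on the HTML of a page returns a datetime when one of the two tags is present with a parseable date, and None otherwise, which makes missing or malformed tags easy to spot before crawling.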

Custom Header

With this option, the system takes the last modified date from a custom response header set in the web server's .htaccess file. The header and the format of its date value are set on your Apache web server as described in the server documentation.

<IfModule mod_headers.c>
  Header set SearchBlox-Last-modified "Sat, 29 Feb 2020 15:33:18 GMT"
</IfModule>
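
In a date of this form, the weekday has to agree with the calendar date, which is easy to get wrong when typing the value by hand. The following minimal Python sketch generates the string from a timestamp instead. The helper name http_date is hypothetical, and the sketch assumes the HTTP date form used in the example above; the formats SearchBlox accepts are not stated here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def http_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP-style date string in GMT."""
    # usegmt=True requires a UTC datetime, so convert first.
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


print(http_date(datetime(2020, 2, 29, 15, 33, 18, tzinfo=timezone.utc)))
# Sat, 29 Feb 2020 15:33:18 GMT
```

The printed value can be pasted into the Header set directive as the quoted date.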

If you plan to use a different header name, you can configure SearchBlox to look for it.

Set the header name by editing the http.lastmodified.header setting in the file <SEARCHBLOX_INSTALLATION_PATH>/webapps/ROOT/WEB-INF/searchblox.yml:

http.lastmodified.header: SearchBlox-Last-modified

Example:
http.lastmodified.header: Custom-Last-modified
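
After renaming the header, it is worth confirming that the web server sends it under exactly the name configured in searchblox.yml. The following minimal Python sketch performs that check on a set of response headers. It is illustrative only and not part of SearchBlox: the function name custom_last_modified is hypothetical, and the headers are passed in as a plain mapping, however they were obtained.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def custom_last_modified(headers: Mapping[str, str], header_name: str) -> Optional[datetime]:
    """Look up header_name case-insensitively and parse its HTTP-style date."""
    wanted = header_name.lower()
    for name, value in headers.items():
        if name.lower() == wanted:
            try:
                return parsedate_to_datetime(value)
            except (TypeError, ValueError):
                return None  # header present, but the date is malformed
    return None  # header not sent under the configured name


# Headers as they might be returned for a page (sample values).
sample = {"Custom-Last-modified": "Sat, 29 Feb 2020 15:33:18 GMT"}
print(custom_last_modified(sample, "Custom-Last-modified"))
```

A None result means the header is either missing under that name or carries a date that does not parse, so the server configuration and the value of http.lastmodified.header should be compared.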