SearchBlox Developer Documentation


Using Meta Robots

Using noindex, nofollow Meta Robots

The HTTP collection crawler can be controlled using meta robots tags.

  • Crawling rules based on meta robots are considered only after robots.txt and the path settings specified in the HTTP collection.
  • Using meta robots, you can:
    • Avoid indexing a page but still crawl it.
    • Index a page but avoid crawling it.
    • Avoid both indexing and crawling a page.
    • Index and crawl a page.

Meta Robots

These meta tags, which direct the indexing and crawling of a page, must be specified in the head section of the HTML code.
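
As a minimal sketch of the placement described above, the page below puts a meta robots tag inside the head section. The title and body text are placeholders made up for illustration, and the directive shown is just one of the combinations covered in this page.

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- Placeholder title, not taken from the SearchBlox documentation -->
    <title>Example page</title>
    <!-- The meta robots tag sits inside <head>, alongside other metadata -->
    <meta name="robots" content="noindex, follow">
  </head>
  <body>
    <!-- Placeholder body content -->
    <p>Page content goes here.</p>
  </body>
</html>
```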

noindex, follow

  • To avoid indexing a page but allow crawling, specify the following meta tag in the page being crawled:
    <meta name="robots" content="noindex, follow">

index, nofollow

  • To avoid crawling but allow indexing, specify the following meta tag:
    <meta name="robots" content="index, nofollow">

noindex, nofollow

  • To avoid both indexing and crawling, specify the following meta tag:
    <meta name="robots" content="noindex, nofollow">

index, follow

  • To both index and crawl a page, either omit the meta robots tag or specify it as shown:
    <meta name="robots" content="index, follow">

Using Content Exclusion Meta Tags

In HTTP collections, you may need to exclude sections of an HTML page, such as headers, footers, and navigation, from being indexed. The following tags can be used:

  • Noindex tags
  • Stopindex, Startindex tags
  • Googleon, Googleoff tags

Content Exclusion Meta Tags Supported

  • noindex tags
    <noindex>Content to Exclude</noindex>
  • stopindex, startindex tags
    <!--stopindex-->Content to Exclude<!--startindex-->
  • googleon, googleoff tags
    <!--googleoff: all-->Content to Exclude<!--googleon: all-->

Rules for noindex, stopindex, and googleon/googleoff tags

  • Body content enclosed in stopindex/startindex tags, noindex tags, or googleoff/googleon tags is not included in the index.
  • These tags do not apply in the head section or to meta tags.
  • These tags must not be nested.
  • Every stopindex tag must be followed by a startindex tag, every opening noindex tag by a closing noindex tag, and every googleoff tag by a googleon tag.
  • For the standards behind these tags, refer to Wikipedia or Google.
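
The rules above can be illustrated with a sketch of a page body that excludes its banner, navigation, and footer. The page structure, link targets, and text are hypothetical. All three tag styles appear only to show correct pairing without nesting; whether they can be mixed on one page is an assumption not confirmed here, and a real page would more likely stick to one style.

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- Exclusion tags are not used here: they apply to body content only -->
    <title>Example page</title>
  </head>
  <body>
    <!-- stopindex is closed by startindex before any other pair opens -->
    <!--stopindex-->
    <header>Site banner and logo</header>
    <!--startindex-->

    <!-- googleoff is closed by googleon -->
    <!--googleoff: all-->
    <nav><a href="/">Home</a> <a href="/products">Products</a></nav>
    <!--googleon: all-->

    <!-- Outside every exclusion pair, so this content is left for indexing -->
    <main>
      <p>Main article text.</p>
    </main>

    <!-- The opening noindex tag is matched by a closing noindex tag -->
    <noindex>
      <footer>Copyright notice and footer links</footer>
    </noindex>
  </body>
</html>
```

Under the rules listed above, only the content inside the main element would be expected to reach the index.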
