## **Using noindex, nofollow Meta Robots**

The WEB collection crawler can be controlled using meta robots tags.

  • Meta robots rules are applied after robots.txt and the path settings specified in the WEB collection.

  • Using meta robots, a user can:

    • Exclude a page from the index but still crawl it.

    • Index a page but not crawl it.

    • Neither index nor crawl a page.

    • Both index and crawl a page.

### Meta Robots

These meta tags, which direct the indexing and crawling of a page, must be specified in the `<head>` section of the page's HTML.
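As a minimal sketch of the placement, the page below puts a meta robots tag inside `<head>`. The title and body text are placeholders rather than content from any real site, and the `content` value can be replaced with any of the combinations described below.

```html
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Example page</title>
    <!-- Meta robots tag: placed inside <head>, not <body> -->
    <meta name="robots" content="noindex, follow">
  </head>
  <body>
    <p>Placeholder page content.</p>
  </body>
</html>
```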

**noindex, follow**

  • To avoid indexing a page but allow crawling, specify the following meta tag in the crawled page: `<meta name="robots" content="noindex, follow">`

**index, nofollow**

  • To avoid crawling but allow indexing, specify the following meta tag: `<meta name="robots" content="index, nofollow">`

**noindex, nofollow**

  • To avoid both indexing and crawling, specify the following meta tag: `<meta name="robots" content="noindex, nofollow">`

**index, follow**

  • To both index and crawl the page, either omit the meta robots tag or specify it as follows: `<meta name="robots" content="index, follow">`

## **Using Content Exclusion Meta Tags**

In HTTP collections, you may need to exclude sections of an HTML page, such as headers, footers, and navigation, from being indexed. The following tags can be used for this:

  • Noindex tags

  • Stopindex, Startindex tags

  • Googleon, Googleoff tags

### Content Exclusion Meta Tags Supported

  • noindex tags: `<noindex>Content to Exclude</noindex>`

  • stopindex, startindex tags: `<!--stopindex-->Content to Exclude<!--startindex-->`

  • googleon, googleoff tags: `<!--googleoff: all-->Content to Exclude<!--googleon: all-->`

### Rules for noindex, stopindex/startindex, and googleoff/googleon tags

  • Body content enclosed in stopindex/startindex tags, noindex tags, or googleoff/googleon tags is not included in the index.

  • These tags do not apply within the head section or to meta tags.

  • These tags should not be nested.

  • Each opening tag must be followed by its matching closing tag: stopindex by startindex, `<noindex>` by `</noindex>`, and googleoff by googleon.

  • For the conventions behind these tags, refer to the relevant Wikipedia and Google documentation.
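
The rules above can be illustrated with a hypothetical page body. The element ids, link target, and text are placeholders chosen for the example. Each exclusion region is closed with its matching tag and no region is nested inside another, so under the rules listed above only the main paragraph would be expected to reach the index.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Example article</title>
  </head>
  <body>
    <!-- Each region below opens and closes with the same tag family -->

    <!--stopindex-->
    <div id="header">Site header and logo</div>
    <!--startindex-->

    <noindex>
      <ul id="nav"><li><a href="/home">Home</a></li></ul>
    </noindex>

    <p>Main article text that should remain searchable.</p>

    <!--googleoff: all-->
    <div id="footer">Copyright notice and footer links</div>
    <!--googleon: all-->
  </body>
</html>
```

All three tag families are shown together only for illustration. In practice a single family used consistently across a site is usually easier to maintain.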