Introduction

Most website data comes in heterogeneous formats, which makes it hard to pull together into a coherent picture of the underlying content. Different readers may also interpret the same page differently. In some cases the data is so diverse that traditional data ingestion mechanisms struggle to ingest it.

PreText NLP addresses this by enriching the metadata so that searches from different perspectives return more insightful results. The pipeline draws on millions of untapped resources and uses AI models to turn “data” into “knowledge”.

Architecture

Steps to use the PreText NLP Pipeline

  1. Create a web collection for the website content you need to index.

NOTE:

The PreText NLP pipeline works only for web collections.

  2. On the menu bar, click Search AI, which lists options such as PreText and Smart Suggest. Click PreText to use the pipeline.
  3. Provide the following details to apply the pipeline to your collection:
    a. Enter the PreText endpoint: https://pretext.searchblox.com/v1/preprocess/title_summary
    b. Select the collection to which the AI pipeline should be applied, and enable OCR if required.
    c. Click the “Create” button to create your PreText endpoint.
  4. Reindex your collection, then run a search. To view the AI-generated summary and title for the search results, append &debug=True to the end of the search link.
  5. Each search result then shows an AI-generated title as ml_title and an AI-generated description as ml_description.
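
As a rough illustration of the last two steps, the Python sketch below appends the debug flag to a search link and picks the two AI-generated fields out of a single result. The example host, the helper names with_debug and ai_fields, and the idea that a result is a flat mapping are assumptions made for illustration; only the debug=True parameter and the ml_title and ml_description field names come from this page.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_debug(search_url: str) -> str:
    """Return search_url with debug=True appended to its query string."""
    parts = urlsplit(search_url)
    # Keep the existing parameters, dropping any earlier debug flag.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "debug"
    ]
    params.append(("debug", "True"))
    return urlunsplit(parts._replace(query=urlencode(params)))


def ai_fields(result: dict) -> dict:
    """Pick the AI-generated fields from one result (assumed to be a flat mapping)."""
    return {key: result.get(key) for key in ("ml_title", "ml_description")}


if __name__ == "__main__":
    # Hypothetical search link; substitute the one your own installation produces.
    print(with_debug("https://search.example.com/search?query=pricing"))
```

Rebuilding the query string, rather than concatenating text, avoids a doubled flag when the link already carries a debug parameter.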

API Error Codes

  • The API returns status code 400 when the input format is not as expected.
  • The API returns status code 524 when the server is down.

Status Code    Status Message
400            Error parsing the body.
500            Time out error.
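
A client calling the endpoint might map these codes to readable messages along the following lines. This is only a sketch: the names DOCUMENTED_ERRORS and describe_status are hypothetical, the messages are copied from this section, and treating any 2xx code as success is an assumption rather than documented behaviour.

```python
# Codes and messages as listed in this section; nothing here is queried from the service.
DOCUMENTED_ERRORS = {
    400: "Error parsing the body.",
    500: "Time out error.",
    524: "Server is down.",
}


def describe_status(status_code: int) -> str:
    """Return a short description for an HTTP status code from the PreText endpoint."""
    if 200 <= status_code < 300:
        # Assumption: any 2xx response counts as success.
        return "OK"
    return DOCUMENTED_ERRORS.get(status_code, f"Undocumented status {status_code}")


if __name__ == "__main__":
    for code in (200, 400, 500, 524, 404):
        print(code, describe_status(code))
```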

Conclusion

SearchBlox uses AI models to generate a summary and title for your website content, surfacing hidden insights and making your content searchable with minimal effort.
