Using A Large Language Model (LLM)

SearchBlox Enterprise Search integrates Large Language Model (LLM) capabilities to automatically enrich document metadata, including:

  • Titles - Context-aware document naming
  • Descriptions - Concise content summaries
  • Keywords - Optimized search terms

This AI-driven enhancement applies to all documents within a collection, improving both search accuracy and content discoverability.

🚧

Prerequisites:

Note: For production deployments with large document volumes, consider dedicated AI processing servers to ensure optimal performance.

📘

Technical Guide:

SearchBlox leverages advanced LLM technology to automatically enhance document metadata, improving search relevance and content discoverability.

Core Technology Stack

  1. LLM Model: Llama-2-based architecture
  2. Search Engine: OpenSearch backend integration
  3. Data Format: JSON payloads for metadata transmission
  • Retrieving and Processing: The model retrieves all indexed documents directly from OpenSearch, the underlying search engine within SearchBlox.
  • Generating Enhanced Metadata:
    • Content is collected from each document in a collection.
    • The content is passed to the LLM together with a prompt.
    • The LLM processes the collected content and generates a relevant title, a description, and up to 20 keywords.
  • Seamless Integration:
    • The generated metadata is sent back to OpenSearch in JSON format.
    • It is written into the document's metadata, using the document's uid to identify the record.
    • Improvements are immediately reflected in the search results.
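
The flow described above can be sketched roughly as follows. This is a minimal illustration, not the SearchBlox implementation: the function names, the prompt wording, the field names (title, description, keywords), and the idea that the document's uid is used as the OpenSearch document ID are all assumptions made for the example. The LLM is represented by any callable that takes a prompt and returns a JSON string, and the output is formatted as newline-delimited JSON in the shape expected by the OpenSearch bulk API for partial updates.

```python
import json
from typing import Callable, Dict, Iterable

MAX_KEYWORDS = 20  # the guide states "up to 20 keywords"


def build_prompt(content: str) -> str:
    """Hypothetical prompt; the real prompt used by SearchBlox is not documented here."""
    return (
        "Read the document below and reply with a JSON object containing "
        '"title", "description", and "keywords" (a list of at most '
        f"{MAX_KEYWORDS} strings).\n\nDocument:\n{content}"
    )


def parse_metadata(raw_reply: str) -> Dict[str, object]:
    """Parse the LLM's JSON reply and cap the keyword list at MAX_KEYWORDS."""
    data = json.loads(raw_reply)
    keywords = [str(k).strip() for k in data.get("keywords", []) if str(k).strip()]
    return {
        "title": str(data.get("title", "")).strip(),
        "description": str(data.get("description", "")).strip(),
        "keywords": keywords[:MAX_KEYWORDS],
    }


def bulk_update_lines(index: str, uid: str, metadata: Dict[str, object]) -> str:
    """Build one partial-update entry (action line + doc line) for the bulk API.

    Assumption: the document's uid doubles as the OpenSearch _id.
    """
    action = {"update": {"_index": index, "_id": uid}}
    partial = {"doc": metadata}
    return json.dumps(action) + "\n" + json.dumps(partial) + "\n"


def enrich(
    documents: Iterable[Dict[str, str]],
    llm: Callable[[str], str],
    index: str,
) -> str:
    """Run every document through the LLM and return a bulk request body."""
    return "".join(
        bulk_update_lines(
            index, doc["uid"], parse_metadata(llm(build_prompt(doc["content"])))
        )
        for doc in documents
    )


if __name__ == "__main__":
    # Stand-in for a real model call, used only to make the sketch runnable.
    def stub_llm(prompt: str) -> str:
        return json.dumps(
            {"title": "Sample", "description": "A sample.", "keywords": ["demo"]}
        )

    docs = [{"uid": "doc-1", "content": "Example body text."}]
    print(enrich(docs, stub_llm, index="example-index"))
```

Keeping the LLM behind a plain callable means the parsing and payload logic can be exercised without a model or a running OpenSearch cluster; the resulting string would then be posted to the cluster's bulk endpoint by whatever client the deployment uses.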

Generate Titles, Descriptions and Keywords Using an LLM

Using a Large Language Model (LLM), you can now generate titles, descriptions, and keywords for any document.

📘

NOTE:

  • Large Language Models (LLMs) improve the relevance of search results, even when the existing metadata is of poor quality.
  • The new titles, descriptions, and keywords are generated systematically using LLMs, making them more comprehensive.

To generate titles, descriptions, and keywords for text or images using the Llama-2-based model, click the following links: