Using A Large Language Model (LLM)
SearchBlox Enterprise Search integrates Large Language Model (LLM) capabilities to automatically enrich document metadata, including:
- Titles - Context-aware document naming
- Descriptions - Concise content summaries
- Keywords - Optimized search terms
This AI-driven enhancement applies to all documents within a collection, improving both search accuracy and content discoverability.
Prerequisites:
- Download and install SearchBlox Service.
- Set up the configuration by creating a collection and indexing documents.
- Install Python 3.11.0 or Python 3.12.4.
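A quick way to confirm the Python prerequisite is to check the interpreter version before running any LLM tooling. The sketch below is only illustrative: the guide lists the exact releases 3.11.0 and 3.12.4, and matching on the major.minor series (3.11 or 3.12) is an assumption made here, as are the names `SUPPORTED` and `is_supported`.

```python
import sys

# Assumption: any 3.11.x or 3.12.x interpreter is acceptable.
# The guide itself names only 3.11.0 and 3.12.4.
SUPPORTED = {(3, 11), (3, 12)}


def is_supported(version_info=sys.version_info) -> bool:
    """Return True if the major.minor version is in the listed series."""
    return (version_info[0], version_info[1]) in SUPPORTED


status = "supported" if is_supported() else "not listed in the prerequisites"
print(sys.version.split()[0], status)
```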
Note: For production deployments with large document volumes, consider dedicated AI processing servers to ensure optimal performance.
Technical Guide:
SearchBlox leverages advanced LLM technology to automatically enhance document metadata, improving search relevance and content discoverability.
Core Technology Stack
- LLM Model: Llama-2-based architecture
- Search Engine: OpenSearch backend integration
- Data Format: JSON payloads for metadata transmission
- Retrieving and Processing: The model retrieves all indexed documents directly from OpenSearch, the underlying search engine within SearchBlox.
- Generating Enhanced Metadata:
  - Content is collected from each document in a collection.
  - The content is passed to the LLM together with a prompt.
  - The LLM processes the content and generates a relevant title, a description, and up to 20 keywords.
- Seamless Integration:
  - The generated metadata is sent back to OpenSearch in JSON format.
  - Each document's metadata is updated using the document's uid.
  - Improvements are immediately reflected in the search results.
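The flow described above (collect content, prompt the LLM, cap keywords at 20, send JSON back keyed by uid) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the SearchBlox implementation. The prompt text, the field names `content`, `title`, `description`, and `keywords`, the helper names, and the `fake_llm` stand-in are all hypothetical. The `{"doc": ...}` wrapper mirrors the body shape of OpenSearch's partial-update API, and whether SearchBlox uses the uid as the OpenSearch document ID is also an assumption.

```python
import json
from typing import Callable

MAX_KEYWORDS = 20  # the guide states "up to 20 keywords"


def build_prompt(content: str) -> str:
    """Hypothetical prompt; the actual SearchBlox prompt is not shown in the guide."""
    return (
        "Return a JSON object with the keys 'title', 'description' and "
        f"'keywords' (a list of at most {MAX_KEYWORDS} strings) "
        "for this document:\n\n" + content
    )


def parse_llm_reply(reply: str) -> dict:
    """Parse the model's JSON reply, drop empty/duplicate keywords, apply the cap."""
    data = json.loads(reply)
    keywords = []
    for kw in data.get("keywords", []):
        kw = str(kw).strip()
        if kw and kw not in keywords:
            keywords.append(kw)
    return {
        "title": str(data.get("title", "")).strip(),
        "description": str(data.get("description", "")).strip(),
        "keywords": keywords[:MAX_KEYWORDS],
    }


def build_update(uid: str, metadata: dict) -> dict:
    """Pair a document uid with a partial-update body (assumed payload shape)."""
    return {"uid": uid, "body": {"doc": metadata}}


def enrich(documents: list[dict], llm: Callable[[str], str]) -> list[dict]:
    """Run every document through the LLM and collect the update payloads."""
    updates = []
    for doc in documents:
        reply = llm(build_prompt(doc["content"]))
        updates.append(build_update(doc["uid"], parse_llm_reply(reply)))
    return updates


def fake_llm(prompt: str) -> str:
    """Stand-in for the Llama-2-based model; returns a canned JSON reply."""
    return json.dumps({
        "title": "Quarterly Sales Report",
        "description": "Summary of sales figures for the quarter.",
        "keywords": [f"term{i}" for i in range(25)],
    })


docs = [{"uid": "doc-1", "content": "Sales rose in the third quarter."}]
updates = enrich(docs, fake_llm)
print(json.dumps(updates[0], indent=2))
```

In a real deployment, `fake_llm` would be replaced by a call to the model and each payload would be sent to OpenSearch. The parsing step is kept separate so that the keyword cap is enforced even if the model returns more than 20 terms.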
Generate Titles, Descriptions and Keywords using LLM
Using a Large Language Model (LLM), we can now create titles, descriptions, and keywords for any document.
NOTE:
- Large Language Models (LLMs) improve the relevance of search results, even when the existing metadata is of poor quality.
- The new titles, descriptions, and keywords are generated systematically using LLMs, making them more comprehensive.
To generate titles, descriptions, and keywords for text or images using the Llama-2-based model, click on the following links: