Vector Search

How documents are searched and retrieved to power RAG-enabled agent responses.


Overview

Vector search is the retrieval mechanism that finds the most relevant document chunks from your knowledge repositories when a user asks a question. It converts the user's query into a vector embedding and searches the vector store for the closest matching document chunks.

Vector search is used by the Retrieval Tool configured in agent settings, and is powered by the Vector Store and Embedding Model assigned to the knowledge repository.


How Vector Search Works

  1. User sends a query to the agent
  2. The Retrieval Tool is triggered (automatically or based on LLM decision)
  3. The query is converted into a vector embedding using the repository's configured embedding model
  4. The vector store (Azure Cognitive Search) is searched for the document chunks whose embeddings are most similar to the query embedding
  5. The top matching chunks are returned as context to the LLM
  6. The LLM generates a response grounded in the retrieved content
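
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the `embed` function is a toy bag-of-words stand-in for the repository's embedding model, the in-memory list stands in for the vector store, and cosine similarity is assumed as the similarity measure. All names here (`VOCAB`, `embed`, `retrieve`, `CHUNKS`) are hypothetical.

```python
import math
from collections import Counter

# Toy vocabulary standing in for a real embedding model (assumption for illustration).
VOCAB = ["vacation", "policy", "days", "expense", "report", "password", "reset"]


def embed(text: str) -> list[float]:
    """Hypothetical embedding: a word-count vector over VOCAB."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Embed the query, score every chunk, and return the top_k closest chunks."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


# In-memory stand-in for the chunks stored in the vector store.
CHUNKS = [
    "vacation policy grants 20 days per year",
    "expense report must be filed monthly",
    "password reset requires manager approval",
]

# The retrieved chunks become the context the LLM is asked to ground its answer in.
context = retrieve("how many vacation days", CHUNKS, top_k=1)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

A real vector store uses an approximate nearest-neighbor index instead of scoring every chunk, but the flow is the same: embed the query, rank by similarity, pass the top results to the LLM.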

Query Types

The vector store supports different query types, configured when creating a Vector Store:

Query Type       Description
Vector Search    Pure vector similarity search; finds documents based on semantic meaning

The Semantic Configuration setting on the vector store controls how Azure Cognitive Search applies semantic ranking to improve result relevance.


Document Locks and Access Control

During retrieval, document locks are enforced. If a document has lock keywords applied, only users whose roles have matching document keys will receive results from that document. This ensures sensitive content is only accessible to authorized users.
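
As a rough sketch of how lock enforcement might filter retrieved chunks, consider the following. The `Chunk` class, the field names, and the rule that a single matching key is enough to unlock a document are all assumptions made for illustration; the product's actual matching rule may differ.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical retrieved chunk carrying its parent document's lock keywords."""
    text: str
    lock_keywords: frozenset = frozenset()


def is_visible(chunk: Chunk, user_keys: set) -> bool:
    """Return True if the user may see this chunk."""
    # Documents without lock keywords are visible to everyone.
    if not chunk.lock_keywords:
        return True
    # Assumption: one matching document key is sufficient to unlock the document.
    return bool(chunk.lock_keywords & user_keys)


def filter_by_locks(chunks: list, user_keys: set) -> list:
    """Drop chunks from locked documents the user's roles cannot open."""
    return [chunk for chunk in chunks if is_visible(chunk, user_keys)]


results = [
    Chunk("Office hours are 9 to 5"),
    Chunk("Salary bands by level", frozenset({"hr"})),
]

visible_to_engineer = filter_by_locks(results, {"engineering"})
visible_to_hr = filter_by_locks(results, {"hr"})
```

In this sketch a user holding only the `engineering` key sees the unlocked chunk, while a user holding the `hr` key sees both.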


Connecting Vector Search to Agents

Vector search is accessed through the Retrieval Tool in agent configuration:

  1. Navigate to Agent Settings > Tools
  2. Add a Retrieval tool
  3. Select the knowledge repository to search
  4. The tool automatically uses the repository's vector store and embedding model
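
One way to picture step 4 is that the tool stores only a reference to the repository and reads the vector store and embedding model from it. The sketch below models that relationship; the class names, field names, and example values are hypothetical and do not reflect the product's actual configuration schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeRepository:
    """Hypothetical repository record with its assigned store and model."""
    name: str
    vector_store: str
    embedding_model: str


@dataclass(frozen=True)
class RetrievalTool:
    """Hypothetical tool that is configured with a repository only."""
    repository: KnowledgeRepository

    @property
    def vector_store(self) -> str:
        # Resolved from the repository, never set on the tool itself.
        return self.repository.vector_store

    @property
    def embedding_model(self) -> str:
        # Resolved from the repository, so queries use the same model as indexing.
        return self.repository.embedding_model


repo = KnowledgeRepository(
    name="hr-docs",
    vector_store="hr-vector-store",
    embedding_model="example-embedding-model",
)
tool = RetrievalTool(repository=repo)
```

Deriving both settings from the repository keeps query embeddings consistent with the embeddings stored at indexing time, which is why the tool needs no separate model selection.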

When a user asks a question, the LLM decides whether to invoke the Retrieval tool based on the tool's description and the query context.