Vector Search

How documents are searched and retrieved to power RAG-enabled agent responses.

Overview

Vector search is the retrieval mechanism that finds the most relevant document chunks from your knowledge repositories when a user asks a question. It converts the user's query into a vector embedding and searches the vector store for the closest matching document chunks.

Vector search is used by the Retrieval Tool configured in agent settings, and is powered by the Vector Store and Embedding Model assigned to the knowledge repository.

How Vector Search Works

1. The user sends a query to the agent.
2. The Retrieval Tool is triggered, either automatically or because the LLM decides to invoke it.
3. The query is converted into a vector embedding using the repository's configured embedding model.
4. The vector store (Azure Cognitive Search) is searched for the document chunks whose embeddings are most similar to the query embedding.
5. The top matching chunks are returned to the LLM as context.
6. The LLM generates a response grounded in the retrieved content.
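
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: `embed` is a toy bag-of-words stand-in for the configured embedding model, the in-memory list stands in for the vector store, and the names `embed`, `cosine_similarity`, `retrieve`, and `build_prompt` are all hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Embed the query, score every chunk, and return the best matches."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks with no similarity at all so unrelated text is not returned.
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def build_prompt(query: str, context_chunks: list[str]) -> str:
    """Assemble the retrieved chunks into context for the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


chunks = [
    "Refunds are issued within 14 days of purchase",
    "Our office is closed on public holidays",
    "Contact support to request a refund",
]
query = "how do I request a refund"
top_chunks = retrieve(query, chunks, top_k=1)
prompt = build_prompt(query, top_chunks)
```

A real embedding model matches on meaning rather than exact words, so it would likely also rank the first chunk highly; the toy version here only matches on shared tokens.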

Query Types

The vector store supports different query types, configured when creating a Vector Store:

Query Type	Description
Vector Search	Pure vector similarity search — finds documents based on semantic meaning

The Semantic Configuration setting on the vector store controls how Azure Cognitive Search applies semantic ranking to improve result relevance.

Document Locks and Access Control

During retrieval, document locks are enforced. If a document has lock keywords applied, only users whose roles have matching document keys will receive results from that document. This ensures sensitive content is only accessible to authorized users.
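
As a rough illustration of this rule, the sketch below filters retrieved chunks by comparing lock keywords with a user's document keys. The `Chunk` type and the function names are hypothetical, and the matching policy is an assumption: here a single matching key is enough to unlock a document, whereas the actual enforcement may require a key for every lock keyword.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    # Lock keywords inherited from the source document (empty means unlocked).
    lock_keywords: set[str] = field(default_factory=set)


def is_visible(chunk: Chunk, user_keys: set[str]) -> bool:
    """Unlocked chunks are always visible; locked ones need a matching key."""
    if not chunk.lock_keywords:
        return True
    # Assumption: any one matching key grants access.
    return bool(chunk.lock_keywords & user_keys)


def filter_by_locks(chunks: list[Chunk], user_keys: set[str]) -> list[Chunk]:
    """Keep only the chunks this user is allowed to receive."""
    return [c for c in chunks if is_visible(c, user_keys)]


results = [
    Chunk("Public holiday calendar"),
    Chunk("Salary bands for 2024", lock_keywords={"hr"}),
    Chunk("Quarterly forecast", lock_keywords={"finance"}),
]
visible = filter_by_locks(results, user_keys={"hr"})
```

A stricter policy could replace the intersection test with `chunk.lock_keywords <= user_keys`, which requires the user to hold every key.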

Connecting Vector Search to Agents

Vector search is accessed through the Retrieval Tool in agent configuration:

1. Navigate to Agent Settings > Tools.
2. Add a Retrieval tool.
3. Select the knowledge repository to search.
4. The tool automatically uses the repository's vector store and embedding model.

When a user asks a question, the LLM decides whether to invoke the Retrieval tool based on the tool's description and the query context.
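
The decision step can be pictured as follows. This is only a sketch of a common tool-calling pattern, not the platform's actual interface: the `TOOLS` registry, the shape of `model_output`, and `search_repository` are all hypothetical. The point is that the model sees the tool's description and either answers directly or asks for the tool to be run.

```python
from typing import Callable


def search_repository(query: str) -> str:
    """Placeholder for the Retrieval tool; a real one would run vector search."""
    return f"[retrieved chunks for: {query}]"


# Hypothetical registry: tool name -> (description shown to the LLM, callable).
TOOLS: dict[str, tuple[str, Callable[..., str]]] = {
    "retrieval": (
        "Search the product knowledge repository for relevant document chunks.",
        search_repository,
    ),
}


def handle_model_turn(model_output: dict) -> str:
    """Run the requested tool, or pass the model's direct answer through."""
    call = model_output.get("tool_call")
    if call is None:
        # The model judged that no retrieval was needed.
        return model_output["content"]
    _description, tool = TOOLS[call["name"]]
    return tool(**call["arguments"])


direct = handle_model_turn({"content": "Hello! How can I help?"})
retrieved = handle_model_turn(
    {"tool_call": {"name": "retrieval", "arguments": {"query": "refund policy"}}}
)
```

Because the model relies on the description to decide, a specific description (what the repository covers and when to search it) tends to produce more reliable tool use than a generic one.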

RAG Overview — End-to-end RAG pipeline
Document Ingestion — Add documents to repositories
Web Crawling — Crawl external sources
Vector Stores — Configure the underlying vector database
Embeddings — Configure embedding models
Tools — Retrieval tool configuration