Document Ingestion¶

Upload and manage documents in your knowledge repositories to power RAG-enabled AI agents.

Overview¶

The Knowledge Repository system allows you to upload various document types that your AI agents can use to answer questions and provide information. Documents are processed, indexed, and made searchable for intelligent retrieval.

Accessing Knowledge Repositories¶

Step 1: Navigate to Knowledge¶

Click on KNOWLEDGE in the top navigation menu
You will see a list of configured repositories

Figure: Knowledge Navigation. Knowledge section with list of configured repositories.

Viewing Repository Documents¶

Step 1: Access Repository¶

Locate your repository in the list (e.g., "Product Documentation Repository")
Click on the Documents link to view all documents in that repository

Document List View¶

The documents list displays:

Document ID: Unique identifier for the document
Document Type: File type (PDF, DOCX, CSV, TXT, etc.)
Document Name: Name of the uploaded file
Modified Date: Last modification timestamp
Status: Processing status (Ready, Processing, Failed)

Figure: Repository Documents. List of documents within a knowledge repository.

Adding New Documents¶

Step 1: Access Upload Options¶

Click the Add Document(s) button in the top-left corner of the documents page
A modal dialog will appear with different upload options

Figure: Add Documents Button. Add Document(s) button in the documents view.

Step 2: Select Upload Method¶

The platform offers several upload methods:

Upload Options¶

Method	Description	Use When
Folder (Auto OCR)	Automatically detects need for OCR	Uploading scanned PDFs or mixed content
Folder (No OCR)	Upload without OCR processing	All documents are digital/searchable
Folder (Force OCR)	Forces OCR on all documents	All documents need OCR
Plain Text	Upload raw text data	Uploading text-only content

Crawl Options¶

Method	Description	Use When
Confluence	Crawl Confluence data	Syncing from Confluence
Website	Crawl and extract website content	Importing web pages
SharePoint/OneDrive	Crawl Microsoft document libraries	Syncing from Microsoft 365
Human Assisted	Crawl with browser plugin	Sites requiring authentication

API Option¶

Method	Description	Use When
API	Add documents via API	Programmatic integration

Recommended: Folder (Auto OCR)¶

Choose Folder (Auto OCR) from the Upload section. This option:

✅ Automatically detects the need for OCR
✅ Supports multiple file formats (.pdf, .docx, .csv, .txt)
✅ Can handle up to 1000 files per upload
✅ Applies processing suited to each document's type

Figure: Upload Method Selection. Select Folder (Auto OCR) for intelligent document processing.

Step 3: Upload Files¶

You can upload files in two ways:

Option 1: Drag and Drop¶

Drag file(s) directly into the upload area
The upload area shows visual feedback while files are over the drop zone
Supports multiple files at once

Option 2: Browse¶

Click in the upload area to open a file browser
Select one or more files
Click "Open" to add them to the upload queue

Figure: File Upload Interface. Drag and drop or browse to select files.

Supported formats:

.pdf: PDF documents
.docx: Microsoft Word documents
.csv: Comma-separated values
.txt: Plain text files

Step 4: Configure Document Locks (Optional)¶

Document Locks allow you to restrict document visibility based on user roles.

How It Works¶

Enter keywords in the "Document Locks" field
Press Enter to add multiple keywords
Only users with roles matching these keywords can see results from this document in search/retrieval
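
The matching rule described above can be pictured as a set intersection between a document's lock keywords and a user's roles. The following is a minimal sketch, not the platform's implementation: the `Document` class, the `can_see` and `visible_documents` functions, the case-insensitive comparison, and the rule that a document without locks is visible to everyone are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical document record; not the platform's data model."""
    name: str
    locks: set[str] = field(default_factory=set)  # lock keywords entered at upload


def can_see(doc: Document, user_roles: set[str]) -> bool:
    # Assumption: a document without locks is visible to every user.
    if not doc.locks:
        return True
    # Assumption: keywords and roles are compared case-insensitively.
    locks = {keyword.lower() for keyword in doc.locks}
    roles = {role.lower() for role in user_roles}
    # Visible when at least one user role matches a lock keyword.
    return bool(locks & roles)


def visible_documents(docs: list[Document], user_roles: set[str]) -> list[str]:
    """Return the names of the documents a user with these roles could retrieve."""
    return [doc.name for doc in docs if can_see(doc, user_roles)]


docs = [
    Document("handbook.pdf"),
    Document("q3-forecast.docx", {"executive", "finance"}),
    Document("pricing.csv", {"sales", "marketing"}),
]
print(visible_documents(docs, {"Sales"}))
```

Under these assumptions, a user whose only role is "Sales" would retrieve results from the unlocked handbook and the sales-locked pricing file, but not from the forecast.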

Use Cases¶

Confidential Documents: Lock to "executive", "finance"
Department-Specific: Lock to "sales", "marketing"
Role-Based Access: Lock to specific role names

Figure: Document Locks Configuration. Configure document locks for role-based access control.

Step 5: Save Documents¶

Review your file selections
Click the Save button to complete the upload
Documents will be processed and added to the repository
The status will show as "Ready" once processing is complete

Processing Statuses¶

Status Types¶

Status	Description	What It Means
Processing	Document is being ingested	Wait for processing to complete
Ready	Document is indexed and searchable	Available for agent retrieval
Failed	Processing encountered an error	Check document format or size

Processing Time¶

Small files (< 1 MB): Usually under 1 minute
Medium files (1-10 MB): 1-5 minutes
Large files (10-50 MB): 5-15 minutes
Bulk uploads: Processed in parallel; monitor the status of each document

Important Notes¶

OCR Processing¶

Auto OCR automatically detects whether OCR is needed for uploaded files
A single file that mixes scanned (OCR-required) and digital (searchable) content may lose some data during processing
Use "Force OCR" only if all documents require OCR

Upload Limits¶

Maximum upload limit: 1000 files per upload
Supported formats: .pdf, .docx, .csv, .txt files
File size limit: 50 MB per file (varies by plan)
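
These limits can be checked locally before starting an upload. The sketch below is one possible pre-flight check, not a tool the platform provides: the function names `check_batch` and `scan_folder` are hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and the 50 MB figure should be replaced with the limit for your plan.

```python
from pathlib import Path

# Limits taken from this page; the per-file limit may differ on your plan.
MAX_FILES_PER_UPLOAD = 1000
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".csv", ".txt"}


def check_batch(files: list[tuple[str, int]]) -> list[str]:
    """Return a list of problems for a batch of (file name, size in bytes) pairs."""
    problems = []
    if len(files) > MAX_FILES_PER_UPLOAD:
        problems.append(f"batch has {len(files)} files; limit is {MAX_FILES_PER_UPLOAD}")
    for name, size in files:
        ext = Path(name).suffix.lower()
        if ext not in SUPPORTED_EXTENSIONS:
            problems.append(f"{name}: unsupported format {ext or '(none)'}")
        if size > MAX_FILE_SIZE_BYTES:
            problems.append(f"{name}: {size} bytes exceeds the per-file limit")
    return problems


def scan_folder(folder: str) -> list[tuple[str, int]]:
    """Collect (name, size) pairs for the regular files directly inside a folder."""
    return [(p.name, p.stat().st_size) for p in Path(folder).iterdir() if p.is_file()]


batch = [("guide.pdf", 2_000_000), ("notes.md", 1_000), ("scan.pdf", 60 * 1024 * 1024)]
for problem in check_batch(batch):
    print(problem)
```

To check a real folder, pass the result of `scan_folder("path/to/folder")` to `check_batch`; an empty list means the batch fits the limits listed above.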

Best Practices¶

Organize Before Uploading: Group related documents
Use Consistent Naming: Makes documents easier to find
Clean Content: Remove unnecessary pages or sections
Test with Sample: Upload a few files first to verify processing
Monitor Status: Check processing status for errors

Crawl Options¶

Confluence¶

Integrate with Confluence to automatically sync spaces and pages.

Requirements:

Confluence URL
API token or credentials
Appropriate permissions

What Gets Crawled:

Pages and sub-pages
Attachments
Comments (optional)

Website Crawling¶

Crawl public websites to extract content.

Configuration:

Starting URL
Crawl depth
Include/exclude patterns
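
To illustrate how crawl depth and include/exclude patterns could interact, here is a minimal sketch. The exact semantics are not specified on this page, so everything here is an assumption: depth is counted as link hops from the starting URL, patterns are shell-style globs matched against the full URL, exclude patterns win over include patterns, and an empty include list matches everything. The `should_crawl` function and the `config` keys are hypothetical names.

```python
from fnmatch import fnmatchcase


def should_crawl(url, depth, max_depth, include, exclude):
    """Decide whether a discovered URL would be crawled under the assumed rules."""
    # depth = number of link hops from the starting URL (the start is depth 0).
    if depth > max_depth:
        return False
    # Assumption: exclude patterns take precedence over include patterns.
    if any(fnmatchcase(url, pattern) for pattern in exclude):
        return False
    # Assumption: an empty include list means "include everything".
    return not include or any(fnmatchcase(url, pattern) for pattern in include)


config = {
    "max_depth": 2,
    "include": ["https://docs.example.com/*"],
    "exclude": ["*/changelog/*", "*.zip"],
}

candidates = [
    ("https://docs.example.com/guide/intro", 1),     # in scope
    ("https://docs.example.com/changelog/2024", 1),  # matches an exclude pattern
    ("https://docs.example.com/guide/a/b/c", 3),     # deeper than max_depth
    ("https://blog.example.com/post", 1),            # matches no include pattern
]
for url, depth in candidates:
    print(url, should_crawl(url, depth, **config))
```

Testing patterns this way against a handful of known URLs before starting a crawl can help catch a pattern that is broader or narrower than intended.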

Use Cases:

Competitor documentation
Public knowledge bases
Blog content

Learn more about Web Crawling

SharePoint/OneDrive¶

Sync document libraries from Microsoft 365.

Requirements:

Microsoft Graph API credentials
Library/folder URLs
Proper permissions

Supported Content:

Documents
Folders
Metadata

Managing Documents¶

Viewing Document Details¶

Click on a document name in the list
View metadata, processing status, and content preview

Editing Documents¶

Click the Edit icon next to a document
Update document locks or metadata
Save changes

Deleting Documents¶

Click the Delete icon next to a document
Confirm deletion
Document is removed from the repository and search index

Deletion is Permanent

Deleted documents cannot be recovered and will no longer be available for agent retrieval.

Troubleshooting¶

Document Status Stuck on "Processing"¶

Issue: Document remains in "Processing" status for an extended time

Solution:

Wait at least 15 minutes for large files
Refresh the page to check for a status update
If the document is stuck for over 30 minutes, delete it and re-upload
Contact support if the issue persists
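
The guidance above amounts to a small decision rule based on status and elapsed time. The sketch below encodes it for illustration only: `next_step` is a hypothetical helper, the status strings are the ones shown in the documents list, and the thresholds are the 15 and 30 minute marks mentioned on this page.

```python
def next_step(status: str, minutes_elapsed: float) -> str:
    """Suggest an action for a document, following the guidance on this page."""
    if status == "Ready":
        return "none: document is indexed and searchable"
    if status == "Failed":
        return "check file format and size, then re-upload"
    if status != "Processing":
        raise ValueError(f"unknown status: {status!r}")
    if minutes_elapsed < 15:
        return "wait: large files can take up to 15 minutes"
    if minutes_elapsed <= 30:
        return "refresh the page and check the status again"
    # Stuck for over 30 minutes.
    return "delete and re-upload; contact support if the issue persists"


for status, minutes in [("Processing", 5), ("Processing", 20), ("Processing", 45), ("Ready", 2)]:
    print(f"{status} after {minutes} min -> {next_step(status, minutes)}")
```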

Upload Failed Error¶

Issue: File upload returns an error

Solution:

Check the file size (must not exceed the per-file limit: 50 MB, varies by plan)
Verify that the file format is supported (.pdf, .docx, .csv, .txt)
Ensure the file is not corrupted
Try uploading files individually instead of in bulk

Documents Not Appearing in Agent Responses¶

Issue: Uploaded documents are not being used by agents

Solution:

Verify that the document status is "Ready"
Check that the agent has the repository configured in its Tools section
Ensure document locks don't restrict access for the user's roles
Test the agent with specific questions related to the document content

Related Topics¶

RAG Overview - Understanding Retrieval-Augmented Generation
Vector Search - How documents are searched and retrieved
Web Crawling - Crawl websites for content
Agent Builder - Add knowledge tools to your agents
Document API - Programmatic document management
