# Jobs API

Monitor and manage background jobs such as document processing, web crawling, and batch operations.
## Get Job Status

Retrieve the status and details of a background job.

**Endpoint:** `GET /jobs/{id}`
### Parameters

| Parameter | Type | Location | Required | Description |
|---|---|---|---|---|
| `id` | string (UUID) | path | Yes | Job identifier |
### Example

```bash
curl "https://PLATFORM-URL-PLACEHOLDER/v1/api/jobs/{jobId}" \
  -H "Authorization: Bearer YOUR_API_KEY"
```
### Response

```json
{
  "data": {
    "id": "job_abc123",
    "type": "document_processing",
    "status": "completed",
    "progress": 100,
    "result": {
      "documentsProcessed": 50,
      "errors": 0
    },
    "createdAt": "2025-01-20T10:00:00Z",
    "completedAt": "2025-01-20T10:05:00Z"
  },
  "error": null
}
```
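In Python, the response envelope above can be consumed with a small helper. This is a sketch assuming the `requests` library; `unwrap` and `get_job` are illustrative names, not part of the API.

```python
import requests

BASE_URL = "https://PLATFORM-URL-PLACEHOLDER/v1/api"
API_KEY = "your_api_key"

def unwrap(body: dict) -> dict:
    """Return the `data` payload, raising if the envelope carries an error."""
    if body.get("error"):
        raise RuntimeError(f"API error: {body['error']}")
    return body["data"]

def get_job(job_id: str) -> dict:
    """Fetch a single job by ID and return its `data` payload."""
    response = requests.get(
        f"{BASE_URL}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    response.raise_for_status()
    return unwrap(response.json())
```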
## Job Statuses

| Status | Description |
|---|---|
| `pending` | Job is queued and waiting to start |
| `running` | Job is currently being processed |
| `completed` | Job finished successfully |
| `failed` | Job encountered an error |
| `cancelled` | Job was manually cancelled |
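The statuses above fall into two groups: non-terminal (`pending`, `running`) and terminal (`completed`, `failed`, `cancelled`). A small helper, sketched here for illustration, makes polling loops explicit about which is which:

```python
# Terminal statuses: once reached, the job will not change state again.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}

def is_terminal(status: str) -> bool:
    """Return True once a job can no longer change state."""
    return status in TERMINAL_STATUSES
```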
## Rerun Job

Restart a failed or completed job.

**Endpoint:** `POST /jobs/{id}/rerun`
### Parameters

| Parameter | Type | Location | Required | Description |
|---|---|---|---|---|
| `id` | string (UUID) | path | Yes | Job identifier |
### Example

```bash
curl -X POST "https://PLATFORM-URL-PLACEHOLDER/v1/api/jobs/{jobId}/rerun" \
  -H "Authorization: Bearer YOUR_API_KEY"
```
### Response

```json
{
  "data": {
    "id": "job_new123",
    "type": "document_processing",
    "status": "pending",
    "originalJobId": "job_abc123",
    "createdAt": "2025-01-20T12:00:00Z"
  },
  "error": null
}
```
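A common pattern is to rerun a job only when it actually failed. The sketch below assumes the `requests` library and the endpoints documented above; `rerun_if_failed` is an illustrative helper name.

```python
from typing import Optional

import requests

BASE_URL = "https://PLATFORM-URL-PLACEHOLDER/v1/api"
HEADERS = {"Authorization": "Bearer YOUR_API_KEY"}

def rerun_if_failed(job: dict) -> Optional[dict]:
    """Rerun a job when its status is 'failed'; return the new job, else None."""
    if job.get("status") != "failed":
        return None
    response = requests.post(f"{BASE_URL}/jobs/{job['id']}/rerun", headers=HEADERS)
    response.raise_for_status()
    # The response carries a fresh job with a new ID and `originalJobId`
    # pointing back at the one that was rerun.
    return response.json()["data"]
```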
## Job Types

Common job types you may encounter:

| Type | Description |
|---|---|
| `document_processing` | Processing uploaded documents |
| `web_crawling` | Crawling websites for content |
| `batch_embedding` | Generating embeddings for documents |
| `export` | Exporting data or reports |
| `import` | Importing bulk data |
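Since each type returns a differently shaped `result`, one option is to route a finished job by its `type` field. The dispatch table below is a sketch; the `web_crawling` result field (`pagesCrawled`) is a hypothetical example, not a documented shape.

```python
def summarize_result(job: dict) -> str:
    """Produce a one-line summary for a finished job, keyed by its type."""
    handlers = {
        # documentsProcessed/errors match the documented response above.
        "document_processing": lambda r: (
            f"{r.get('documentsProcessed', 0)} documents, {r.get('errors', 0)} errors"
        ),
        # pagesCrawled is a hypothetical field, shown for illustration only.
        "web_crawling": lambda r: f"{r.get('pagesCrawled', '?')} pages crawled",
    }
    handler = handlers.get(job["type"], lambda r: str(r))
    return handler(job.get("result") or {})
```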
## Polling for Job Completion

```python
import requests
import time

API_KEY = "your_api_key"
JOB_ID = "job_abc123"
BASE_URL = "https://PLATFORM-URL-PLACEHOLDER/v1/api"

def poll_job(job_id, interval=5, timeout=300):
    """Poll job status until it reaches a terminal state or the timeout expires."""
    start_time = time.time()
    while time.time() - start_time < timeout:
        response = requests.get(
            f"{BASE_URL}/jobs/{job_id}",
            headers={"Authorization": f"Bearer {API_KEY}"},
        )
        response.raise_for_status()
        job = response.json()["data"]
        if job["status"] in ("completed", "failed", "cancelled"):
            return job
        # Progress may be absent while the job is still pending.
        print(f"Job {job['status']}: {job.get('progress', 0)}%")
        time.sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish within {timeout} seconds")

# Usage
job = poll_job(JOB_ID)
print(f"Job {job['status']}: {job.get('result')}")
```
## Best Practices
- Poll Periodically: Check job status every 5-10 seconds, not continuously
- Set Timeouts: Implement maximum wait times for long-running jobs
- Handle Failures: Check job result for error details
- Retry Failed Jobs: Use the rerun endpoint for transient failures
- Monitor Progress: Display progress percentage for better UX
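The "poll periodically" and "set timeouts" practices above can be combined with exponential backoff, so long-running jobs generate fewer requests over time. The schedule below (double up to a cap) is an illustrative choice, not a platform requirement.

```python
def backoff_intervals(initial=5.0, factor=2.0, cap=60.0, timeout=300.0):
    """Yield sleep intervals that grow geometrically up to `cap`,
    stopping once the next sleep would exceed `timeout`."""
    elapsed, interval = 0.0, initial
    while elapsed + interval <= timeout:
        yield interval
        elapsed += interval
        interval = min(interval * factor, cap)
```

A polling loop would `time.sleep(i)` for each yielded interval `i` and treat loop exhaustion as a timeout.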
## Related
- Document API - Document operations that create jobs
- Knowledge - Web crawling creates jobs
- API Overview - Authentication and setup