
Jobs API

Monitor and manage background jobs such as document processing, web crawling, and batch operations.


Get Job Status

Retrieve status and details of a background job.

Endpoint: GET /jobs/{id}

Parameters

Parameter  Type           Location  Required  Description
id         string (UUID)  path      Yes       Job identifier

Example

curl "https://PLATFORM-URL-PLACEHOLDER/v1/api/jobs/{jobId}" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "data": {
    "id": "job_abc123",
    "type": "document_processing",
    "status": "completed",
    "progress": 100,
    "result": {
      "documentsProcessed": 50,
      "errors": 0
    },
    "createdAt": "2025-01-20T10:00:00Z",
    "completedAt": "2025-01-20T10:05:00Z"
  },
  "error": null
}
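Every response carries the same `{data, error}` envelope shown above, with `error` set to null on success. As an illustrative sketch (the helper name `unwrap` is ours, not part of the API), the envelope can be handled like this:

```python
# Illustrative helper: the name "unwrap" is ours, not part of the API.
# Responses use a {data, error} envelope; error is null on success.

def unwrap(payload):
    """Return payload["data"], raising if the envelope reports an error."""
    if payload.get("error"):
        raise RuntimeError(f"API error: {payload['error']}")
    return payload["data"]
```

For the sample response above, `unwrap` returns the job record with id `job_abc123`.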

Job Statuses

Status     Description
pending    Job is queued and waiting to start
running    Job is currently being processed
completed  Job finished successfully
failed     Job encountered an error
cancelled  Job was manually cancelled
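The statuses above fall into two groups: `pending` and `running` are active, while `completed`, `failed`, and `cancelled` are terminal and will not change again. A minimal sketch of that distinction (the constant and function names are ours):

```python
# Status groups derived from the table above; the names here are ours.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}  # will not change again
ACTIVE_STATUSES = {"pending", "running"}                  # worth polling again

def is_terminal(status):
    """Return True once a job has reached a final state."""
    return status in TERMINAL_STATUSES
```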

Rerun Job

Restart a failed or completed job.

Endpoint: POST /jobs/{id}/rerun

Parameters

Parameter  Type           Location  Required  Description
id         string (UUID)  path      Yes       Job identifier

Example

curl -X POST "https://PLATFORM-URL-PLACEHOLDER/v1/api/jobs/{jobId}/rerun" \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "data": {
    "id": "job_new123",
    "type": "document_processing",
    "status": "pending",
    "originalJobId": "job_abc123",
    "createdAt": "2025-01-20T12:00:00Z"
  },
  "error": null
}
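In Python, the rerun endpoint can be called like the curl example above. This is an illustrative sketch, assuming the `requests` library; the function name `rerun_job` is ours, while the endpoint and response envelope are as documented:

```python
import requests

API_KEY = "your_api_key"
BASE_URL = "https://PLATFORM-URL-PLACEHOLDER/v1/api"

def rerun_job(job_id):
    """POST /jobs/{id}/rerun and return the new job record.

    Illustrative sketch: the name is ours; the endpoint and the
    {data, error} envelope match the documentation above.
    """
    response = requests.post(
        f"{BASE_URL}/jobs/{job_id}/rerun",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    response.raise_for_status()  # surface HTTP-level errors early
    return response.json()["data"]
```

The returned record carries a new job id and an `originalJobId` pointing back at the job that was rerun.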

Job Types

Common job types you may encounter:

Type                 Description
document_processing  Processing uploaded documents
web_crawling         Crawling websites for content
batch_embedding      Generating embeddings for documents
export               Exporting data or reports
import               Importing bulk data

Polling for Job Completion

Poll GET /jobs/{id} at a fixed interval until the job reaches a terminal state:

import requests
import time

API_KEY = "your_api_key"
JOB_ID = "job_abc123"
BASE_URL = "https://PLATFORM-URL-PLACEHOLDER/v1/api"

def poll_job(job_id, interval=5, timeout=300):
    """Poll job status until it reaches a terminal state or the timeout expires."""
    start_time = time.time()

    while time.time() - start_time < timeout:
        response = requests.get(
            f"{BASE_URL}/jobs/{job_id}",
            headers={"Authorization": f"Bearer {API_KEY}"}
        )
        response.raise_for_status()  # fail fast on HTTP-level errors
        job = response.json()["data"]

        if job["status"] in ("completed", "failed", "cancelled"):
            return job

        # "progress" may be absent while the job is still pending
        print(f"Job {job['status']}: {job.get('progress', 0)}%")
        time.sleep(interval)

    raise TimeoutError(f"Job {job_id} did not finish within {timeout} seconds")

# Usage
job = poll_job(JOB_ID)
print(f"Job {job['status']}: {job.get('result')}")

Best Practices

  1. Poll Periodically: Check job status every 5-10 seconds rather than in a tight loop
  2. Set Timeouts: Enforce a maximum wait time for long-running jobs
  3. Handle Failures: Inspect a failed job's result for error details
  4. Retry Failed Jobs: Use the rerun endpoint for transient failures
  5. Monitor Progress: Surface the progress percentage for better UX
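The first two practices can be combined into a polling loop with exponential backoff. This is an illustrative sketch, not part of the API: the names `poll_with_backoff` and `fetch_status` are ours, and `fetch_status` stands in for any callable that returns the job's current status string (for example, a wrapper around GET /jobs/{id}). The `sleep` and `clock` callables are injectable so the loop can be tested without waiting:

```python
import time

def poll_with_backoff(fetch_status, base=2.0, cap=30.0, timeout=300.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until a terminal status, doubling the wait between checks.

    Illustrative sketch: fetch_status is any callable returning the
    job's current status string; sleep/clock default to the real ones.
    """
    deadline = clock() + timeout
    interval = base
    while clock() < deadline:
        status = fetch_status()
        if status in ("completed", "failed", "cancelled"):
            return status
        sleep(interval)
        interval = min(interval * 2, cap)  # 2s, 4s, 8s, ... capped at 30s
    raise TimeoutError("job did not reach a terminal state in time")
```

Backoff keeps early checks responsive while easing load on the API as a long-running job continues.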