List Benchmarks

List benchmark runs that compare extraction results against ground truth datasets. Each run produces per-field accuracy metrics.

GET /v1/quality/benchmarks
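As a rough illustration, the request could be built as follows. The host `https://api.example.com` and the `Authorization: Bearer` scheme are assumptions made for this sketch; the page only states that an API key is required, so check your account's base URL and authentication scheme before using it.

```python
import urllib.request

# Hypothetical host: the page does not state the API's base URL.
BASE_URL = "https://api.example.com"


def build_list_benchmarks_request(
    api_key: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a GET request for the benchmark list."""
    return urllib.request.Request(
        url=f"{base_url}/v1/quality/benchmarks",
        method="GET",
        headers={
            # Assumption: the API key is sent as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the body can then be decoded with `json.load`.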

Response

Response fields

data (array): Array of benchmark run objects.
data[].id (string): Benchmark run UUID.
data[].name (string): Benchmark run name.
data[].dataset_id (string): Ground truth dataset ID used for this run.
data[].user_schema_id (string | null): User schema scoping this benchmark, if any.
data[].status (string): Run status: queued, running, completed, or failed.
data[].accuracy_overall (number | null): Overall accuracy score (0–1). Null while running.
data[].accuracy_by_field (object | null): Per-field accuracy scores. Null while running.
data[].documents_processed (integer): Number of documents evaluated so far.
data[].documents_total (integer): Total documents to evaluate.
data[].duration_ms (integer | null): Total run duration in milliseconds.
data[].created_at (string): ISO 8601 creation timestamp.
data[].completed_at (string | null): ISO 8601 completion timestamp.
data[].links.self (string): URL to this benchmark run.
data[].links.results (string): URL to the per-document results.
pagination.total (integer): Total number of benchmark runs.
pagination.limit (integer): Maximum results per page.
pagination.has_more (boolean): Whether more results exist beyond this page.
pagination.next_cursor (string | null): Cursor to fetch the next page.
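The pagination fields suggest a cursor loop: keep requesting pages while `has_more` is true, passing `next_cursor` back each time. The sketch below assumes a caller-supplied `fetch_page` function (hypothetical, not part of the API) that performs the HTTP call; this page does not name the query parameter that carries the cursor, so that detail is left to the caller.

```python
from typing import Callable, Iterator, Optional


def iter_benchmark_runs(
    fetch_page: Callable[[Optional[str]], dict],
) -> Iterator[dict]:
    """Yield every benchmark run across all pages.

    fetch_page is a hypothetical callable: it receives the cursor
    (None for the first page) and returns the decoded response body.
    """
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["data"]
        pagination = page["pagination"]
        # Stop when the API reports no further pages or gives no cursor.
        if not pagination["has_more"] or pagination["next_cursor"] is None:
            break
        cursor = pagination["next_cursor"]
```

Keeping the HTTP call outside the generator makes the loop easy to test with canned pages and avoids hard-coding the cursor parameter name.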

Response

{
  "data": [
    {
      "id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
      "name": "Benchmark 2024-09-25",
      "dataset_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "user_schema_id": null,
      "status": "completed",
      "accuracy_overall": 0.93,
      "accuracy_by_field": {
        "vendor_name": 0.98,
        "total_amount": 0.90,
        "invoice_number": 0.92
      },
      "documents_processed": 50,
      "documents_total": 50,
      "duration_ms": 4200,
      "accuracy_delta": null,
      "compared_to_run_id": null,
      "created_at": "2024-09-25T12:00:00.000Z",
      "completed_at": "2024-09-25T12:00:04.200Z",
      "links": {
        "self": "/v1/quality/benchmarks/c3d4e5f6-a7b8-9012-cdef-123456789012",
        "results": "/v1/quality/benchmarks/c3d4e5f6-a7b8-9012-cdef-123456789012/results"
      }
    }
  ],
  "pagination": {
    "total": 5,
    "limit": 20,
    "has_more": false,
    "next_cursor": null
  }
}
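A small sketch of how a client might read one run object from a response like the one above: it reports progress from `documents_processed` and `documents_total`, and picks the lowest-scoring field from `accuracy_by_field`. The helper name `summarize_run` and the shape of its return value are illustrative choices, not part of the API.

```python
from typing import Optional


def summarize_run(run: dict) -> dict:
    """Summarize a single benchmark run object from the `data` array."""
    # accuracy_by_field is null until scores are available.
    by_field = run.get("accuracy_by_field") or {}
    weakest: Optional[str] = min(by_field, key=by_field.get) if by_field else None

    total = run["documents_total"]
    progress = run["documents_processed"] / total if total else 0.0

    return {
        "id": run["id"],
        "status": run["status"],
        "progress": progress,          # fraction of documents evaluated
        "weakest_field": weakest,      # field with the lowest accuracy, if any
    }
```

Applied to the sample run above, the weakest field would be `total_amount`, since 0.90 is the lowest of the three per-field scores.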

Errors

Error responses

401 unauthorized: Missing or invalid API key.
429 rate_limited: Too many requests. Retry after the period indicated in the Retry-After header.
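One way a client might handle these two errors is sketched below: fail immediately on 401, and on 429 wait for the number of seconds in `Retry-After` before trying again. The `send` callable is hypothetical and stands in for whatever performs the HTTP request. The sketch assumes `Retry-After` carries a number of seconds (HTTP also allows a date form, which is not handled here) and falls back to one second if the header is missing or unparseable.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, decoded JSON body)
Response = Tuple[int, Mapping[str, str], dict]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() and retry on 429, honoring Retry-After (in seconds)."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status == 200:
            return body
        if status == 401:
            # Retrying will not help: the key is missing or invalid.
            raise PermissionError("unauthorized: missing or invalid API key")
        if status == 429 and attempt < max_attempts:
            try:
                # Header lookup is case-sensitive for a plain dict.
                delay = float(headers.get("Retry-After", "1"))
            except ValueError:
                delay = 1.0  # assumption: fall back to one second
            sleep(delay)
            continue
        raise RuntimeError(f"request failed with status {status}")
    raise RuntimeError("no attempts were made")
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting.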