List Benchmarks
List benchmark runs that compare extraction results against ground truth datasets. Each run produces per-field accuracy metrics.
Each run evaluates every document in the dataset and produces an accuracy_overall score along with a per-field breakdown. Use benchmarks to track extraction quality over time and to measure the impact of schema or pipeline changes.
Use this endpoint to see all benchmark runs and their accuracy scores. A typical workflow is to list benchmarks after making schema or pipeline changes, then compare the latest run against previous ones using GET /v1/quality/benchmarks/compare to measure improvement or detect regressions.
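As a rough illustration of that workflow, the sketch below takes an already-parsed list response and computes the change in accuracy_overall between the two most recent completed runs. The helper names (parse_ts, latest_change) are hypothetical, and the subtraction is done locally; it does not call the compare endpoint or reproduce its output.

```python
from datetime import datetime
from typing import Any, Optional


def parse_ts(value: str) -> datetime:
    # Timestamps in the sample response end in "Z"; normalise for fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def latest_change(response: dict[str, Any]) -> Optional[float]:
    """Difference in accuracy_overall between the two newest completed runs.

    Returns None when fewer than two completed runs are available.
    """
    completed = [
        run for run in response["data"]
        if run["status"] == "completed" and run["accuracy_overall"] is not None
    ]
    if len(completed) < 2:
        return None
    # Newest first, ordered by the created_at field shown in the response.
    completed.sort(key=lambda run: parse_ts(run["created_at"]), reverse=True)
    newest, previous = completed[0], completed[1]
    return newest["accuracy_overall"] - previous["accuracy_overall"]
```

A positive result suggests an improvement over the previous run and a negative one a possible regression.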
Each benchmark includes status (queued, running, completed, or failed), accuracy_overall (0-1 score, null while running), accuracy_by_field (per-field breakdown), and documents_processed/documents_total for progress tracking. The accuracy_delta and compared_to_run_id fields support cross-run comparisons.
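To show how these fields might be read together, here is a minimal sketch that derives a progress fraction and the lowest-scoring field from a single benchmark object. The function names are hypothetical, and the code assumes accuracy_by_field may be missing, null, or empty before a run completes, which the text above does not confirm.

```python
from typing import Any, Optional


def progress(run: dict[str, Any]) -> float:
    """Fraction of documents processed so far, in the range 0 to 1."""
    total = run["documents_total"]
    return run["documents_processed"] / total if total else 0.0


def weakest_field(run: dict[str, Any]) -> Optional[tuple[str, float]]:
    """Name and score of the lowest-accuracy field, or None if unavailable."""
    # Assumption: the breakdown may be absent or null for unfinished runs.
    by_field = run.get("accuracy_by_field") or {}
    if not by_field:
        return None
    name = min(by_field, key=by_field.get)
    return name, by_field[name]
```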
Run benchmarks regularly after extraction pipeline changes. Pair this endpoint with GET /v1/quality/benchmarks/:id/results for a per-document drill-down showing which fields matched and which diverged, and use the compare endpoint to track accuracy trends across multiple runs.
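One way to decide which runs deserve a drill-down is sketched below: it collects the results link of every completed run whose accuracy_overall falls under a threshold. The function name, the base URL, and the threshold are illustrative assumptions, not part of the API; only the links.results path comes from the response itself.

```python
from typing import Any
from urllib.parse import urljoin


def drill_down_urls(
    response: dict[str, Any], base_url: str, threshold: float
) -> list[str]:
    """Absolute results URLs for completed runs scoring below the threshold."""
    urls = []
    for run in response["data"]:
        score = run.get("accuracy_overall")
        if run["status"] == "completed" and score is not None and score < threshold:
            # links.results is a path relative to the API host.
            urls.append(urljoin(base_url, run["links"]["results"]))
    return urls
```

Each returned URL can then be requested to see the per-document results for that run.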
/v1/quality/benchmarks
Response
{
  "data": [
    {
      "id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
      "name": "Benchmark 2024-09-25",
      "dataset_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "user_schema_id": null,
      "status": "completed",
      "accuracy_overall": 0.93,
      "accuracy_by_field": {
        "vendor_name": 0.98,
        "total_amount": 0.90,
        "invoice_number": 0.92
      },
      "documents_processed": 50,
      "documents_total": 50,
      "duration_ms": 4200,
      "accuracy_delta": null,
      "compared_to_run_id": null,
      "created_at": "2024-09-25T12:00:00.000Z",
      "completed_at": "2024-09-25T12:00:04.200Z",
      "links": {
        "self": "/v1/quality/benchmarks/c3d4e5f6-a7b8-9012-cdef-123456789012",
        "results": "/v1/quality/benchmarks/c3d4e5f6-a7b8-9012-cdef-123456789012/results"
      }
    }
  ],
  "pagination": {
    "total": 5,
    "limit": 20,
    "has_more": false,
    "next_cursor": null
  }
}
Errors
Error responses