
Get / Delete Ground-Truth Dataset

Retrieve a ground-truth dataset with its expected values, or delete it. The same path supports GET (read scope) and DELETE (write scope).

Retrieve the full details of a ground-truth dataset including all expected value entries, or permanently delete the dataset. The GET response includes every document-field pair with the expected value, which you can use to audit the benchmark data before running a validation.

Call GET before starting a validation run to verify that expected values are correct and complete. Each entry in the values array carries an expected_value, document_id, and field_name; review them to ensure the benchmark data reflects your current extraction requirements.
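The pre-run review described above can be partly automated. The following is a minimal sketch that works on an already-parsed GET response; the function name audit_values and the required_fields parameter are hypothetical, and treating an empty expected_value as suspicious is an assumption that may not hold for your data.

```python
from collections import Counter


def audit_values(dataset: dict, required_fields: set) -> dict:
    """Flag duplicate pairs, empty expected values, and missing fields.

    `dataset` is the parsed JSON body of the GET response.
    `required_fields` is a hypothetical set of field names you expect
    every document in the dataset to cover.
    """
    values = dataset.get("values", [])

    # A document-field pair that appears more than once is ambiguous.
    pair_counts = Counter((v["document_id"], v["field_name"]) for v in values)
    duplicates = sorted(pair for pair, n in pair_counts.items() if n > 1)

    # Assumption: an empty or null expected value is worth a manual look.
    empty = sorted(
        (v["document_id"], v["field_name"])
        for v in values
        if v.get("expected_value") in (None, "")
    )

    # Collect the fields present for each document, then diff against
    # the required set.
    fields_by_doc = {}
    for v in values:
        fields_by_doc.setdefault(v["document_id"], set()).add(v["field_name"])
    missing = {
        doc: sorted(required_fields - fields)
        for doc, fields in fields_by_doc.items()
        if required_fields - fields
    }

    return {"duplicates": duplicates, "empty": empty, "missing": missing}
```

If all three lists in the result are empty, the dataset passes these particular checks; that says nothing about whether the expected values themselves are correct.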

The response includes entry_count for a quick size check and user_schema_id to confirm schema scope. Each entry in the values array has its own UUID (id) and created_at timestamp. If the dataset is unscoped (user_schema_id: null), it can validate fields across any schema.
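As a rough illustration of the size and scope check, the sketch below summarizes a parsed GET response. The function name summarize is hypothetical. Comparing entry_count with the length of values assumes the response returns every entry in one body, as the description above suggests; entry_count may also be null, in which case no comparison is made.

```python
def summarize(dataset: dict) -> dict:
    """Summarize scope and size of a parsed ground-truth dataset response."""
    values = dataset.get("values", [])
    entry_count = dataset.get("entry_count")  # documented as integer | null

    return {
        # user_schema_id of null means the dataset is unscoped.
        "scoped": dataset.get("user_schema_id") is not None,
        "entry_count": entry_count,
        "values_returned": len(values),
        # Only compare when the API actually reported a count.
        "count_matches": entry_count is None or entry_count == len(values),
    }
```

A count_matches value of False is a prompt to investigate, not proof of a problem.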

Use DELETE only when the dataset is no longer relevant. Existing validation runs that referenced this dataset are retained with their results intact, but you cannot create new runs against a deleted dataset. To update individual entries, delete and recreate the dataset with corrected values.

Deleting a ground-truth dataset is permanent and also removes all of its expected value entries. Validation runs that used the dataset are retained but can no longer be re-run.
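Because deletion removes every entry and updating means deleting and recreating, it may help to save a local copy of the entries from a GET response first. The sketch below is one way to do that, with optional corrections applied. The function name snapshot_entries, the corrections parameter, and the shape of the saved JSON are all hypothetical; in particular, the output is not the request body of the create endpoint, which this page does not document.

```python
import json
from typing import Optional


def snapshot_entries(dataset: dict, corrections: Optional[dict] = None) -> str:
    """Serialize a dataset's entries to JSON before deleting it.

    `corrections` optionally maps (document_id, field_name) to a
    corrected expected value to use in place of the stored one.
    """
    corrections = corrections or {}
    entries = []
    for v in dataset.get("values", []):
        key = (v["document_id"], v["field_name"])
        entries.append({
            "document_id": v["document_id"],
            "field_name": v["field_name"],
            # Fall back to the stored value when no correction is given.
            "expected_value": corrections.get(key, v["expected_value"]),
        })

    # Entry ids and timestamps are dropped: a recreated dataset would
    # presumably be assigned new ones.
    return json.dumps(
        {
            "name": dataset.get("name"),
            "user_schema_id": dataset.get("user_schema_id"),
            "values": entries,
        },
        indent=2,
    )
```

Write the returned string to a file before sending DELETE, so the corrected entries are on hand when you recreate the dataset.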
GET /v1/validation/ground-truth/{id}
DELETE /v1/validation/ground-truth/{id}
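A request against this path might be built as in the sketch below, using only the Python standard library. The host in BASE_URL is a placeholder, and sending the API key as a Bearer token in the Authorization header is an assumption; check your account's authentication documentation for the actual scheme.

```python
import urllib.request
from urllib.parse import quote

# Placeholder host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_request(dataset_id: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build a GET or DELETE request for one ground-truth dataset."""
    if method not in ("GET", "DELETE"):
        raise ValueError("this path supports only GET and DELETE")

    url = f"{BASE_URL}/v1/validation/ground-truth/{quote(dataset_id, safe='')}"
    return urllib.request.Request(
        url,
        method=method,
        headers={
            # Assumed auth scheme; not confirmed by this page.
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )
```

Passing the result to urllib.request.urlopen sends the request. GET needs a key with read scope and DELETE needs one with write scope.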

Response

Response fields (GET)

Field                    Type             Description
id                       string           Dataset UUID.
name                     string           Dataset name.
user_schema_id           string | null    Schema scope for this dataset, if any.
entry_count              integer | null   Number of document-field value pairs in the dataset.
created_at               string           ISO 8601 creation timestamp.
updated_at               string           ISO 8601 last update timestamp.
links                    object           Related resource URLs (self).
values                   array            Array of expected value entries.
values[].id              string           Entry UUID.
values[].document_id     string           Document UUID this expected value applies to.
values[].field_name      string           Field key.
values[].expected_value  string           The expected (ground-truth) value for this field.
values[].created_at      string           ISO 8601 timestamp.

Response (GET)

{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "name": "Invoice Validation Set",
  "user_schema_id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
  "entry_count": 50,
  "created_at": "2024-08-01T00:00:00.000Z",
  "updated_at": "2024-08-01T00:00:00.000Z",
  "links": {
    "self": "/v1/validation/ground-truth/a1b2c3d4-e5f6-7890-abcd-ef1234567890"
  },
  "values": [
    {
      "id": "c3d4e5f6-a7b8-9012-cdef-123456789012",
      "document_id": "d4e5f6a7-b8c9-0123-defa-234567890123",
      "field_name": "invoice_number",
      "expected_value": "INV-2024-0042",
      "created_at": "2024-08-01T00:00:00.000Z"
    }
  ]
}

Response (DELETE)

{
  "deleted": true
}

Errors

Error responses

Status  Code          Description
401     unauthorized  Missing or invalid API key.
404     not_found     Ground-truth dataset not found or does not belong to your organization.
429     rate_limited  Too many requests. Retry after the period indicated in the Retry-After header.
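A client could map these statuses to actions roughly as sketched below. The function names and the action labels are hypothetical. The Retry-After parsing accepts both forms allowed by the HTTP specification (a number of seconds or an HTTP date), since this page does not say which form the API sends.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional, Tuple


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Return the number of seconds to wait, or None if the value is unusable."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():  # delay-seconds form, e.g. "30"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means no wait is needed.
    return max(0.0, (when - now).total_seconds())


def classify_error(
    status: int, headers: Mapping[str, str], now: Optional[datetime] = None
) -> Tuple[str, Optional[float]]:
    """Map a documented error status to an (action, delay_seconds) pair."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    if status == 401:
        return ("fix_credentials", None)
    if status == 404:
        return ("not_found", None)
    if status == 429:
        return ("retry", parse_retry_after(lowered.get("retry-after"), now))
    return ("unexpected", None)
```

For a 429 response without a usable Retry-After header the delay is None, and the caller has to choose its own backoff.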