
Run Reconciliation

Anchors each target document to a row in a reference dataset, runs the validation checklist, and returns per-document results and a summary.

Run reconciliation over a set of documents or a structuring run. Each document is anchored to its reference row using the lookup fields and key columns in the config, and then the validation checklist runs against that row. The endpoint returns one result per document plus a roll-up summary in a single synchronous response.

The config is supplied inline in the request body, separate from any config stored on the dataset. lookup_fields are the extracted fields whose values seed the anchor search; reference_key_columns are the columns to index (leave empty to auto-detect); checks is the validation checklist run after anchoring. As with auto-configure, target_value is polymorphic: { "document_ids": [...] } for documents, { "run_id": "..." } for a run.
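The request shape described above can be sketched in Python as follows. This is a minimal illustration, not an official client: the helper names build_target_value and build_run_request are hypothetical, and target_type is passed through as given because this page only shows the value "documents".

```python
from typing import Any, Optional


def build_target_value(document_ids: Optional[list[str]] = None,
                       run_id: Optional[str] = None) -> dict[str, Any]:
    """Return the polymorphic target_value: exactly one of the two shapes."""
    if (document_ids is None) == (run_id is None):
        raise ValueError("pass exactly one of document_ids or run_id")
    if document_ids is not None:
        return {"document_ids": list(document_ids)}
    return {"run_id": run_id}


def build_run_request(reference_data_id: str,
                      target_type: str,
                      target_value: dict[str, Any],
                      lookup_fields: list[str],
                      checks: list[dict[str, Any]],
                      reference_key_columns: Optional[list[str]] = None
                      ) -> dict[str, Any]:
    """Assemble the request body with the config supplied inline."""
    return {
        "reference_data_id": reference_data_id,
        "target_type": target_type,
        "target_value": target_value,
        "config": {
            "lookup_fields": lookup_fields,
            # An empty list leaves key-column detection to the service.
            "reference_key_columns": reference_key_columns or [],
            "checks": checks,
        },
    }


# Example values copied from the request body shown on this page.
body = build_run_request(
    reference_data_id="rd_a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    target_type="documents",
    target_value=build_target_value(document_ids=["doc_uuid_1", "doc_uuid_2"]),
    lookup_fields=["booking_reference"],
    checks=[{
        "name": "freight check",
        "type": "numeric_tolerance",
        "extracted_fields": ["freight_total"],
        "reference_fields": ["freight_amount"],
        "tolerance": 0.02,
        "required": True,
    }],
)
```

Omitting reference_key_columns yields an empty list, which corresponds to the auto-detect case.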

Each result carries a status: reconciled when an anchor was found and all required checks pass, partial when an anchor was found but one or more required checks fail, unmatched when no anchor was found, and ambiguous when multiple reference rows matched the same key. The summary counts each status so you can gauge overall data quality at a glance.
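The status rules can be restated as a small decision function. This sketch only mirrors the wording above and is not the service's implementation. The function names are hypothetical, and it assumes that ambiguous takes precedence over check outcomes, which this page does not state.

```python
from collections import Counter

STATUSES = ("reconciled", "partial", "unmatched", "ambiguous")


def classify(matched_rows: int, failed_required_checks: int) -> str:
    """Map anchor and check outcomes to one of the four statuses."""
    if matched_rows == 0:
        return "unmatched"
    if matched_rows > 1:
        # Assumption: ambiguity is reported regardless of check results.
        return "ambiguous"
    return "reconciled" if failed_required_checks == 0 else "partial"


def summarize(statuses: list[str]) -> dict[str, int]:
    """Build a roll-up with the same keys as the summary object."""
    counts = Counter(statuses)
    summary = {"total": len(statuses)}
    for status in STATUSES:
        summary[status] = counts.get(status, 0)
    return summary
```

Applied to the two documents in the example response on this page, summarize(["reconciled", "unmatched"]) gives a total of 2 with one reconciled and one unmatched.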

This is the governed, non-streaming run. It returns the full result set in one response. The internal SSE streaming variant is intentionally omitted from the public API.
POST /v1/reconciliation/run

Response

Response fields

results (array): One result per target document.
results[].document_id (string): Source document UUID.
results[].filename (string): Source document filename.
results[].status (string): reconciled, partial, unmatched, or ambiguous.
results[].anchor (object | null): The matched reference row, or null when unmatched.
results[].anchor_candidates (array): All candidate anchors (more than one means ambiguous).
results[].checks (array): Per-check results with pass/fail/skip status.
results[].pass_count (integer): Number of checks that passed.
results[].fail_count (integer): Number of checks that failed.
results[].skip_count (integer): Number of checks that were skipped.
summary.total (integer): Total documents reconciled.
summary.reconciled (integer): Documents fully reconciled.
summary.partial (integer): Documents anchored with some failing checks.
summary.unmatched (integer): Documents with no anchor.
summary.ambiguous (integer): Documents with multiple anchor candidates.
column_classification (object | null): Reference column classification, when key columns were auto-detected.
token_index_size (integer): Number of entries in the anchor token index.

Request body

{
  "reference_data_id": "rd_a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "target_type": "documents",
  "target_value": {
    "document_ids": ["doc_uuid_1", "doc_uuid_2"]
  },
  "config": {
    "lookup_fields": ["booking_reference"],
    "reference_key_columns": ["booking_ref"],
    "checks": [
      {
        "name": "freight check",
        "type": "numeric_tolerance",
        "extracted_fields": ["freight_total"],
        "reference_fields": ["freight_amount"],
        "tolerance": 0.02,
        "required": true
      }
    ]
  }
}

Response

{
  "results": [
    {
      "document_id": "doc_uuid_1",
      "filename": "carrier_invoice_001.pdf",
      "status": "reconciled",
      "anchor": {
        "reference_row_id": "ref_row_uuid_1",
        "reference_row_index": 42,
        "matched_value": "ABC1234567",
        "matched_via_field": "booking_reference",
        "matched_ref_column": "booking_ref",
        "original_extracted_value": "ABC-1234567",
        "original_reference_value": "ABC1234567"
      },
      "anchor_candidates": [],
      "checks": [
        {
          "name": "freight check",
          "type": "numeric_tolerance",
          "status": "pass",
          "extracted_field": "freight_total",
          "extracted_value": 4250.0,
          "reference_field": "freight_amount",
          "reference_value": 4250.0
        }
      ],
      "pass_count": 1,
      "fail_count": 0,
      "skip_count": 0
    },
    {
      "document_id": "doc_uuid_2",
      "filename": "carrier_invoice_002.pdf",
      "status": "unmatched",
      "anchor": null,
      "anchor_candidates": [],
      "checks": [],
      "pass_count": 0,
      "fail_count": 0,
      "skip_count": 0
    }
  ],
  "summary": {
    "total": 2,
    "reconciled": 1,
    "partial": 0,
    "unmatched": 1,
    "ambiguous": 0
  },
  "column_classification": null,
  "token_index_size": 1842
}

Documents with partial status are anchored but failed one or more required checks: inspect the per-check status, extracted_value, and reference_value to see exactly where the document diverged from the reference row. Documents with ambiguous status carry more than one entry in anchor_candidates, which usually means your key column is not unique enough. Add a narrowing column to disambiguate.
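One way to triage a parsed response along these lines is sketched below. The helper names and the sample payload are hypothetical. The payload is illustrative data shaped like the response fields on this page, not real output.

```python
from typing import Any


def failing_checks(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect every failed check on documents whose status is partial."""
    rows = []
    for result in response.get("results", []):
        if result.get("status") != "partial":
            continue
        for check in result.get("checks", []):
            if check.get("status") == "fail":
                rows.append({
                    "document_id": result["document_id"],
                    "check": check.get("name"),
                    "extracted_value": check.get("extracted_value"),
                    "reference_value": check.get("reference_value"),
                })
    return rows


def ambiguous_documents(response: dict[str, Any]) -> dict[str, int]:
    """Map each ambiguous document to its number of anchor candidates."""
    return {
        result["document_id"]: len(result.get("anchor_candidates", []))
        for result in response.get("results", [])
        if result.get("status") == "ambiguous"
    }


# Hypothetical payload for illustration only.
sample = {
    "results": [
        {
            "document_id": "doc_uuid_3",
            "status": "partial",
            "anchor_candidates": [],
            "checks": [{
                "name": "freight check",
                "status": "fail",
                "extracted_value": 4300.0,
                "reference_value": 4250.0,
            }],
        },
        {
            "document_id": "doc_uuid_4",
            "status": "ambiguous",
            "anchor_candidates": [
                {"reference_row_index": 7},
                {"reference_row_index": 19},
            ],
            "checks": [],
        },
    ]
}
```

Here failing_checks(sample) returns one row for doc_uuid_3 with both values side by side, and ambiguous_documents(sample) reports two candidates for doc_uuid_4, which is the signal to add a narrowing key column.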

Errors

Error responses

400 bad_request: Invalid body, the dataset is not ready, no key columns could be detected, or no documents with extracted data were found.
401 unauthorized: Missing or invalid API key.
404 not_found: No reference dataset with this ID exists for your organization.
429 rate_limited: Too many requests. Retry after the period indicated in the Retry-After header.
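A client can honor the 429 guidance with a small retry wrapper. The sketch below is transport-agnostic: send is a hypothetical callable that performs the POST and returns a (status, headers, body) tuple, so no base URL or authentication header is assumed. Only the delay-in-seconds form of Retry-After is parsed. Any other value, such as an HTTP date, falls back to default_wait, which is an assumed default.

```python
import time
from typing import Callable, Mapping

Response = tuple[int, Mapping[str, str], str]


def retry_after_seconds(headers: Mapping[str, str], default_wait: float) -> float:
    """Read Retry-After as a number of seconds, else use the default."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return default_wait
    return default_wait


def call_with_retry(send: Callable[[], Response],
                    max_attempts: int = 3,
                    default_wait: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send(), retrying only on 429 and waiting as the header asks."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for _ in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(retry_after_seconds(response[1], default_wait))
        response = send()
    return response
```

Responses with 400, 401, or 404 are returned immediately, since repeating the same request would not change the outcome.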