
Validation Checkpoints

A validation checkpoint is a gate you place at a position in the rail. It validates the cumulative output of whatever stages precede it, so a checkpoint after Extraction judges raw extraction while a checkpoint after Resolution judges normalized values. You can place several checkpoints in one rail, each scoped to the work before it. The gate itself carries no position setting: where it runs is decided entirely by where you drop it in the rail.
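To make the "cumulative output" idea concrete, here is a minimal Python sketch of a rail in which each checkpoint sees whatever the stages before it have produced. Everything in it is an illustrative assumption rather than the product's implementation: the `run_rail` helper, the tuple layout of the rail, the stage functions, and the `DD.MM.YYYY` date format are all hypothetical.

```python
from datetime import datetime

def extract(cells, document):
    # Hypothetical extraction stage: copies the raw string out of the document.
    return {**cells, "delivery_date": document["delivery_date_text"]}

def resolve(cells, document):
    # Hypothetical resolution stage: normalizes DD.MM.YYYY to ISO 8601.
    parsed = datetime.strptime(cells["delivery_date"], "%d.%m.%Y")
    return {**cells, "delivery_date": parsed.date().isoformat()}

def run_rail(rail, document):
    """Walk the rail in order; each checkpoint sees the cells built so far."""
    cells, seen = {}, {}
    for kind, name, step in rail:
        if kind == "stage":
            cells = step(cells, document)
        else:
            # A checkpoint has no logic about position: it simply snapshots
            # the cumulative output of everything that ran before it.
            seen[name] = dict(cells)
    return seen

rail = [
    ("stage", "Extraction", extract),
    ("checkpoint", "after_extraction", None),
    ("stage", "Resolution", resolve),
    ("checkpoint", "after_resolution", None),
]
seen = run_rail(rail, {"delivery_date_text": "03.02.2024"})
print(seen["after_extraction"])  # raw value as extracted
print(seen["after_resolution"])  # normalized value
```

Moving a checkpoint tuple to a different index in `rail` changes what it validates without touching the checkpoint itself, which is the property the paragraph above describes.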

Each checkpoint runs up to four validation mechanisms over the fields in scope. Evidence applies deterministic rules that check a value against the source text. Business Rules uses a fast model to judge field-level and cross-field plausibility. N-shot re-runs extraction several times and compares the results to measure reproducibility. LLM Judge asks a model to assess correctness. The verdicts are stored per field, and they drive both the review queue and the data product holdback.
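As a rough illustration of a deterministic, per-field check in the spirit of the Evidence mechanism, the sketch below passes a field only when its value appears verbatim in the source text. The actual Evidence rules are not specified here, so the substring rule, the function names, and the `"pass"`/`"fail"` verdict strings are assumptions made for the example.

```python
def evidence_verdict(value, source_text):
    """Pass only if the extracted value appears verbatim in the source text."""
    text = "" if value is None else str(value).strip()
    if not text:
        # An empty value has no evidence to point at.
        return "fail"
    return "pass" if text in source_text else "fail"

def run_evidence(cells, source_text):
    """Return one verdict per field key, mirroring per-field verdict storage."""
    return {key: evidence_verdict(value, source_text) for key, value in cells.items()}

source = "Delivery note 4711\nDelivered on 03.02.2024\nTotal: 1.000,50 EUR"
cells = {"delivery_date": "03.02.2024", "total_amount": "1.500,00"}
verdicts = run_evidence(cells, source)
print(verdicts)
```

Here `delivery_date` is found in the source and passes, while `total_amount` does not match anything in the text and fails.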

A checkpoint can be a warning (record the verdict, ship the value) or blocking (hold failing fields for review). A blocking gate writes a pending-approval cell for each failing field, flips the field state to blocked, and either routes the field to the central review queue or holds it on the pipeline page, depending on the gate config. Dependent fields wait until the blocked field is resolved. Because the verdicts are stored, a gate flipped from warning to blocking after the fact can be replayed over existing results without re-validating anything.
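The difference between warning and blocking, and why a replay needs no re-validation, can be sketched in a few lines of Python. This is a simplified model, not the product's block-selection code: the `select_blocked` function is hypothetical, the `severity` and `fieldKeys` keys are borrowed from the stage payload shown on this page, and the literal `"warning"` value is an assumption.

```python
def select_blocked(stored_verdicts, gate):
    """Return the fields a gate would hold, reading stored verdicts only."""
    if gate["severity"] != "blocking":
        # A warning gate records verdicts but never holds a value.
        return []
    return sorted(
        key
        for key, verdict in stored_verdicts.items()
        if verdict == "fail" and key in gate["fieldKeys"]
    )

# Verdicts written by an earlier run; no validator is called below.
stored = {"delivery_date": "pass", "total_amount": "fail", "carrier": "fail"}

gate = {"severity": "warning", "fieldKeys": ["delivery_date", "total_amount"]}
held_as_warning = select_blocked(stored, gate)

# Flip the same gate to blocking and replay it over the stored verdicts.
flipped = {**gate, "severity": "blocking"}
held_as_blocking = select_blocked(stored, flipped)
print(held_as_warning, held_as_blocking)
```

Only `total_amount` is held after the flip: `carrier` failed but is outside the gate's field scope, and `delivery_date` passed.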

N-shot is a reproducibility check, not a second opinion. It re-runs the real extraction on the same document N times and compares each run to the value first committed. A field that comes back identical across runs is stable; a field that varies is flagged. When the checkpoint covers resolution, n-shot compares the resolved values, so a difference that normalization erases (1.000,50 versus 1000.5) reads as agreement.
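The comparison can be sketched as follows. This is a minimal illustration under stated assumptions: `normalize_amount` is a hypothetical resolver that only understands the two number formats mentioned above, and `n_shot_stable` is an invented helper, not a function from the product.

```python
from decimal import Decimal

def normalize_amount(text):
    """Hypothetical resolver: parse '1.000,50' or '1000.5' into a Decimal."""
    s = text.strip()
    if "," in s:
        # Assumption: a comma is the decimal separator and dots group thousands.
        s = s.replace(".", "").replace(",", ".")
    return Decimal(s)

def n_shot_stable(committed, reruns, resolve=None):
    """True if every re-run agrees with the value first committed."""
    resolve = resolve or (lambda value: value)
    baseline = resolve(committed)
    return all(resolve(run) == baseline for run in reruns)

reruns = ["1000.5", "1.000,50"]

# Checkpoint before resolution: raw strings differ, so the field is flagged.
raw_stable = n_shot_stable("1.000,50", reruns)

# Checkpoint covering resolution: normalization erases the difference.
resolved_stable = n_shot_stable("1.000,50", reruns, resolve=normalize_amount)
print(raw_stable, resolved_stable)
```

A genuine disagreement, such as a re-run returning `1.000,05`, still fails the comparison after normalization, so only formatting noise is absorbed.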

Checkpoints are built from reusable validation stages (gates) configured through the validation-stages endpoints on the schema. A gate carries its severity, field scope, and blocking behavior; the rail decides where it runs. The same gate can appear at more than one checkpoint. An optional "covers phases" control scopes a checkpoint to fields produced by specific preceding phases, so a late checkpoint can validate only the resolution output rather than re-judging everything.
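One way to picture the "covers phases" control is as a filter over the fields in scope, keyed by the phase that produced each value. The sketch below assumes a hypothetical `produced_by` mapping and lowercase phase names; the real data model and the name of the setting in the API are not shown on this page.

```python
def fields_for_checkpoint(produced_by, covers_phases=None):
    """Return the field keys a checkpoint validates.

    produced_by maps each field key to the phase that last wrote its value.
    With no covers_phases, the checkpoint sees every field before it.
    """
    if not covers_phases:
        return sorted(produced_by)
    return sorted(key for key, phase in produced_by.items() if phase in covers_phases)

produced_by = {
    "delivery_date": "resolution",
    "total_amount": "resolution",
    "carrier_notes": "extraction",
}

everything = fields_for_checkpoint(produced_by)
resolution_only = fields_for_checkpoint(produced_by, covers_phases={"resolution"})
print(everything)
print(resolution_only)
```

With the filter set, a late checkpoint skips `carrier_notes`, which an earlier checkpoint after extraction would already have judged.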

Create a validation stage on a schema
curl -X POST https://api.talonic.com/v1/schemas/sch_delivery_notes/validation-stages \
  -H "Authorization: Bearer $TALONIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Required fields present",
    "severity": "blocking",
    "fieldScope": "subset",
    "fieldKeys": ["delivery_date", "total_amount"]
  }'

Reference the stage from a valid rail stage by its ID, and it becomes a checkpoint at that position. Because verdicts are stored, you can tune a gate over time. Run it as a warning first to see what it would catch, then flip it to blocking and replay it over the documents you already processed to push the failures into review, with no re-run and no model calls.

Validation mechanisms fail closed. A validator that cannot run (a model outage, missing reference data, an n-shot re-extraction that returns nothing) emits fail verdicts rather than passing silently. A blocking gate will hold those fields for review, so an infrastructure problem never lets an unchecked value through.
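The fail-closed behavior can be illustrated with a small wrapper around a validator. This is a sketch of the general pattern, not the product's code: `run_validator`, the validator call signature, and the verdict strings are assumptions for the example.

```python
def run_validator(validator, cells):
    """Run a validator and fail closed if it cannot produce verdicts."""
    try:
        verdicts = validator(cells)
    except Exception:
        # Model outage, missing reference data, and similar failures land here.
        verdicts = None
    if not verdicts:
        # Nothing came back: every field in scope is marked as failed.
        return {key: "fail" for key in cells}
    # A field the validator silently skipped is also treated as failed.
    return {key: verdicts.get(key, "fail") for key in cells}

def broken_validator(cells):
    raise ConnectionError("model outage")

def partial_validator(cells):
    return {"delivery_date": "pass"}

cells = {"delivery_date": "2024-02-03", "total_amount": "1000.5"}
outage_result = run_validator(broken_validator, cells)
partial_result = run_validator(partial_validator, cells)
print(outage_result)
print(partial_result)
```

The design choice is that the absence of a verdict is never read as a pass, so a blocking gate downstream holds those fields instead of shipping them unchecked.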

Frequently asked questions

Where in the pipeline does a validation checkpoint run?
Wherever you place it in the rail. A checkpoint validates the cumulative cell values produced by the stages before it, so a checkpoint after Extraction judges raw extraction and one after Resolution judges normalized values. You can place several checkpoints in a single rail, each scoped to the work before it.
What checks does a checkpoint run?
Up to four mechanisms: Evidence (deterministic rules against source text), Business Rules (model-judged field and cross-field plausibility), N-shot (re-runs extraction to measure reproducibility), and LLM Judge (model assessment of correctness). Verdicts are stored per field and drive the review queue and data-product holdback.
What does a blocking gate do to a failing field?
It writes a pending-approval cell copying the latest value, flips the field state to blocked, and routes it to the central review queue or holds it on the pipeline page per the gate config. Dependent fields wait. The value never ships until a reviewer approves, corrects, or overrides it.
Can I make a gate blocking after a run already happened?
Yes. Because verdicts are stored, flipping a gate from warning to blocking lets you replay it over existing results: the stored verdicts are pushed through the same block-selection the live path uses, with no re-validation and no model calls. A dry-run previews the blast radius before you apply it.