Run Pipeline

Run a configured Spec through the One Engine pipeline. The server compiles the schema's saved rail into phase config, attaches documents, and starts processing.

A pipeline is one run of a configured Spec over a set of documents. A Spec is a user_schema with a composed rail: Source → Field Registry → Extraction → Resolution → Validation → Data Product. You name the Spec by its schema_id and pass the documents to run. The server does the rest: it compiles the Spec's saved rail into the pipeline's phase config, so you never hand-build phases.

Compilation is server-side and governed. The rail expands into one resolution phase per active policy, one validation phase per member gate at its checkpoint position, an enforced extraction phase, and a pipeline-scoped assembly step appended after all documents finish. The run is created, the documents are attached, and processing starts immediately. The response status is active.
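To make the expansion rule concrete, here is a toy model of it in Python. It is an illustration only, not the server's compiler: the function name, the phase labels, and the idea of expressing a gate's checkpoint position as "the phase it follows" are all assumptions made for this sketch.

```python
def expand_rail(active_policies, gates):
    """Toy expansion of a rail into an ordered phase plan.

    active_policies: policy names, in the order they should run.
    gates: (gate_name, after_phase) pairs. `after_phase` is a hypothetical
           way to express the gate's checkpoint position.
    """
    # The enforced extraction phase, then one resolution phase per active policy.
    phases = ["extraction"]
    phases += [f"resolution:{policy}" for policy in active_policies]

    # One validation phase per member gate, placed at its checkpoint.
    for gate_name, after_phase in gates:
        if after_phase not in phases:
            raise ValueError(f"unknown checkpoint: {after_phase}")
        index = phases.index(after_phase) + 1
        # Keep gates that share a checkpoint in the order they were listed.
        while index < len(phases) and phases[index].startswith("validation:"):
            index += 1
        phases.insert(index, f"validation:{gate_name}")

    # The assembly step is pipeline-scoped: it runs once, after all documents.
    return {"per_document": phases, "pipeline_scoped": ["assembly"]}


plan = expand_rail(
    ["dedupe", "normalize"],
    [("required_fields", "extraction"), ("totals_match", "resolution:normalize")],
)
print(plan["per_document"])
```

The point of the sketch is the shape of the result: the phase count grows with the number of active policies and member gates, which is why the phase config is compiled from the saved rail rather than written by hand.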

This is the governed tier. For a quick one-off structuring run that does not justify configuring a Spec, use POST /v1/jobs instead: jobs run the standard 4-phase pipeline against a schema with no saved rail, no policy-driven resolution phases, and no validation checkpoints. Pipelines are the right choice when you have curated a Spec with resolution policies and positional validation gates and want review holdback enforced.

A pipeline's row results are read through the data product it produces. After the run finishes, call POST /v1/pipelines/{id}/data-product and then read rows through the data-products endpoints. The data product is also where per-cell review holdback is enforced: a field that a blocking gate parked for review surfaces with its status only until a reviewer resolves it.

This endpoint requires an API key with the write scope.
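The read path described above can be sketched as a small polling helper. The two paths mirror the links returned in the creation response. The shape of the progress payload is not described in this section, so the caller supplies is_finished; http_get and http_post are hypothetical transport callables that take a path and return decoded JSON.

```python
import time


def produce_data_product_when_done(
    pipeline_id,
    http_get,
    http_post,
    is_finished,
    poll_seconds=5.0,
    max_polls=120,
    sleep=time.sleep,
):
    """Poll progress until the run is finished, then request the data product."""
    progress_path = f"/v1/pipelines/{pipeline_id}/progress"
    data_product_path = f"/v1/pipelines/{pipeline_id}/data-product"

    for _ in range(max_polls):
        progress = http_get(progress_path)
        # Assumption: the caller knows how to read "finished" from the payload.
        if is_finished(progress):
            return http_post(data_product_path)
        sleep(poll_seconds)

    raise TimeoutError(f"pipeline {pipeline_id} did not finish in {max_polls} polls")
```

Injecting the transport and the sleep function keeps the helper independent of any particular HTTP client and makes it easy to exercise with fakes.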

If the named Spec has no composed rail, the request fails with 400 bad_request. Configure the Spec in the app first, or use POST /v1/jobs for an ad-hoc run that needs no saved rail.
POST /v1/pipelines
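A minimal sketch of the request in Python, using only the standard library. The body field names schema_id and document_ids come from the description on this page. The base URL, the Bearer authorization scheme, and the helper names are assumptions, so check them against your account before relying on this.

```python
import json
import urllib.request

# Assumption: placeholder host; substitute your real API base URL.
BASE_URL = "https://api.example.com"


def build_run_pipeline_request(api_key, schema_id, document_ids):
    """Build (but do not send) a POST /v1/pipelines request."""
    body = {"schema_id": schema_id, "document_ids": list(document_ids)}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/pipelines",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            # Assumption: the API key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def run_pipeline(api_key, schema_id, document_ids):
    """Send the request and return the decoded response body."""
    request = build_run_pipeline_request(api_key, schema_id, document_ids)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Note that the request carries no phase configuration: only the Spec's schema_id and the documents to run.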

Response

Response fields (201 Created)

id (string): Pipeline run UUID.
status (string): Always "active" immediately after creation.
schema (object): The Spec used for this run: { id, name }.
document_count (integer): Number of documents requested in document_ids.
enqueued_documents (integer): Number of documents actually enqueued for processing.
message (string): Human-readable confirmation message.
links.self (string): URL to fetch the pipeline run.
links.progress (string): URL to poll phase-by-phase progress.
links.data_product (string): URL to produce a data product once the run finishes.

Response (201 Created)

{
  "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "status": "active",
  "schema": { "id": "sch_uuid_1", "name": "Lease Agreement" },
  "document_count": 24,
  "enqueued_documents": 24,
  "message": "Pipeline created and queued for processing.",
  "links": {
    "self": "/v1/pipelines/a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "progress": "/v1/pipelines/a1b2c3d4-e5f6-7890-abcd-ef1234567890/progress",
    "data_product": "/v1/pipelines/a1b2c3d4-e5f6-7890-abcd-ef1234567890/data-product"
  }
}
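As a sketch of how a client might consume this response, the snippet below parses it into a small record and compares document_count with enqueued_documents. The two fields are documented separately, which suggests they can differ; why a requested document might not be enqueued is not stated here. The PipelineRun class and its property are hypothetical names for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineRun:
    id: str
    status: str
    schema_id: str
    schema_name: str
    document_count: int
    enqueued_documents: int
    progress_url: str
    data_product_url: str

    @property
    def not_enqueued(self) -> int:
        # Requested documents that were not enqueued for processing.
        return self.document_count - self.enqueued_documents


def parse_pipeline_run(payload: dict) -> PipelineRun:
    """Map the 201 response body onto a PipelineRun record."""
    links = payload["links"]
    return PipelineRun(
        id=payload["id"],
        status=payload["status"],
        schema_id=payload["schema"]["id"],
        schema_name=payload["schema"]["name"],
        document_count=payload["document_count"],
        enqueued_documents=payload["enqueued_documents"],
        progress_url=links["progress"],
        data_product_url=links["data_product"],
    )
```

A caller could log a warning when not_enqueued is greater than zero before it starts polling the progress URL.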

Errors

Error responses

400 bad_request: The Spec has no composed pipeline rail, the rail compiled to zero executable phases, or the request body failed validation.
401 unauthorized: Missing or invalid API key.
404 not_found: No Spec (schema) with this schema_id exists for your organization.
429 rate_limited: Too many requests. Retry after the period indicated in the Retry-After header.
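One way a client might act on these statuses is sketched below. The action labels and function names are hypothetical. The Retry-After parsing accepts both forms that HTTP allows (a number of seconds or an HTTP date), since this page does not say which form the API sends.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None):
    """Convert a Retry-After header value to a delay in seconds, or None."""
    if header_value is None:
        return None
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # Unparseable header: let the caller pick a default.
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def classify_error(status, retry_after=None):
    """Map a documented error status to a suggested client action."""
    if status == 429:
        return ("retry", retry_after_seconds(retry_after))
    actions = {
        400: "fix_request_or_configure_spec",
        401: "check_api_key",
        404: "check_schema_id",
    }
    return (actions.get(status, "unknown"), None)
```

Only 429 is treated as retryable here. A 400 caused by a missing rail will not succeed on retry until the Spec is configured, or the run is sent to POST /v1/jobs instead.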