Monitoring Batches

The Batches page at /sources/batches shows the status of all batch jobs with real-time updates. Each batch progresses through three states: accumulating (items collecting in the queue), submitted (sent to the provider's batch API), and completed (results received and applied to the corresponding documents). The page live-syncs with the provider so you can monitor progress without manual refreshing. Click any batch to see the detail view with individual items, their processing state, and any errors.
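The three-state progression described above can be pictured as a small state machine. The following is a minimal sketch for illustration only: the `BatchStatus` enum, the `NEXT_STATUS` mapping, and the `advance` helper are hypothetical names, not part of the platform's API.

```python
from enum import Enum


class BatchStatus(Enum):
    ACCUMULATING = "accumulating"  # items collecting in the queue
    SUBMITTED = "submitted"        # sent to the provider's batch API
    COMPLETED = "completed"        # results received and applied


# Forward transitions as described in the text. This mapping is an
# illustrative assumption, not the platform's actual implementation.
NEXT_STATUS = {
    BatchStatus.ACCUMULATING: BatchStatus.SUBMITTED,
    BatchStatus.SUBMITTED: BatchStatus.COMPLETED,
    BatchStatus.COMPLETED: None,  # terminal state
}


def advance(status: BatchStatus) -> BatchStatus:
    """Return the next status, or raise if the batch is already completed."""
    next_status = NEXT_STATUS[status]
    if next_status is None:
        raise ValueError(f"{status.value} is a terminal status")
    return next_status
```

Modelling the lifecycle this way makes it easy to reject out-of-order status values when consuming batch data in your own tooling.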

Batches are submitted automatically when the accumulation timer fires (every 15 minutes by default) or when the item count threshold is reached, whichever comes first. Both the timer interval and the item count threshold are configurable in the pipeline settings. Once a batch is submitted, the platform polls the provider hourly to check for completion. When results arrive, they are applied to the corresponding documents (including field resolution, linking, triage, and delivery events), and the batch transitions to completed status.
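The "whichever comes first" rule can be sketched as a single predicate. This is a minimal illustration, not the platform's code: `should_submit` is a hypothetical helper, and the item threshold of 100 is an assumed placeholder, since the real value comes from your pipeline settings.

```python
DEFAULT_INTERVAL_SECONDS = 15 * 60  # 15-minute default stated in the text
HYPOTHETICAL_ITEM_THRESHOLD = 100   # assumed value; configured in pipeline settings


def should_submit(seconds_accumulating: float,
                  item_count: int,
                  interval_seconds: float = DEFAULT_INTERVAL_SECONDS,
                  item_threshold: int = HYPOTHETICAL_ITEM_THRESHOLD) -> bool:
    """Return True when either trigger has fired, whichever comes first."""
    timer_fired = seconds_accumulating >= interval_seconds
    threshold_reached = item_count >= item_threshold
    return timer_fired or threshold_reached
```

For instance, `should_submit(60, 100)` is true because the assumed threshold is reached after one minute, while `should_submit(60, 5)` is false because neither trigger has fired yet.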

The batch detail view shows individual items within a batch, including which documents are included, their current processing state, and any errors that occurred. Use this view to verify that a specific document was included in the expected batch and to troubleshoot items that failed to parse.

For example, after uploading 500 invoices in batch mode, navigate to /sources/batches to check progress. You will see a batch in accumulating status collecting items until the 15-minute timer fires or the item count threshold is reached. Once submitted, the status changes to submitted and the platform polls the provider hourly. Click the batch row to see each document's individual state. If 3 items show parse errors, those documents were automatically retried via the real-time path while the remaining 497 completed normally. When the batch transitions to completed, all results have been applied and the documents are ready for review.

The platform includes built-in crash recovery for batch processing. If the application restarts while a batch is in a transient processing state, the recovery logic automatically reverts it to submitted so the next polling cycle can retry. This means batch jobs are resilient to infrastructure disruptions without requiring manual intervention.

Batch statuses

| Status | Type | Description |
| --- | --- | --- |
| Accumulating | status | Items are being collected. The batch has not yet been submitted to the provider. |
| Submitted | status | The batch has been sent to the provider. Polled hourly for completion. |
| Completed | status | All results have been received and applied to the corresponding documents. |

Monitor batch progress via API
# List all batches with their statuses:
curl -s https://api.talonic.com/v1/batches \
  -H "Authorization: Bearer $TALONIC_API_KEY"

# Get detail for a specific batch including item states:
curl -s https://api.talonic.com/v1/batches/batch_abc \
  -H "Authorization: Bearer $TALONIC_API_KEY"

# Example response (items list abbreviated):
# {
#   "id": "batch_abc",
#   "status": "submitted",
#   "item_count": 150,
#   "submitted_at": "2025-04-22T10:15:00Z",
#   "provider": "anthropic",
#   "items": [
#     { "document_id": "doc_001", "status": "pending" },
#     { "document_id": "doc_002", "status": "completed" },
#     { "document_id": "doc_003", "status": "parse_error", "retried_realtime": true }
#   ]
# }

The batch detail view is your primary tool for diagnosing issues with batch processing. Each item shows its individual status: pending, completed, or parse_error. Items with parse_error also carry a retried_realtime flag indicating whether the system automatically retried them through the real-time path. Such items are retried exactly once through the real-time extraction path, so that the 48-hour SLA is maintained. If a batch has an unusually high parse error rate, this may indicate a problem with the documents themselves (corrupt files, unusual formatting) rather than a system issue. The crash recovery mechanism ensures that infrastructure disruptions such as application restarts, memory pressure, or network interruptions do not leave batches in a permanently stuck state.
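To spot a high parse error rate quickly, the item list from the batch detail response can be summarized in a few lines. The sketch below works on a payload shaped like the example response shown earlier; `summarize_items` is a hypothetical helper, and the payload is hard-coded here rather than fetched from the API.

```python
from collections import Counter

# Sample payload copied from the example response (items abbreviated).
batch = {
    "id": "batch_abc",
    "status": "submitted",
    "items": [
        {"document_id": "doc_001", "status": "pending"},
        {"document_id": "doc_002", "status": "completed"},
        {"document_id": "doc_003", "status": "parse_error", "retried_realtime": True},
    ],
}


def summarize_items(batch: dict) -> dict:
    """Count items per status and compute the share that hit parse errors."""
    items = batch.get("items", [])
    counts = Counter(item["status"] for item in items)
    errors = [item for item in items if item["status"] == "parse_error"]
    return {
        "counts": dict(counts),
        "parse_error_rate": len(errors) / len(items) if items else 0.0,
        # Documents the system already retried through the real-time path.
        "retried_realtime": [
            item["document_id"] for item in errors if item.get("retried_realtime")
        ],
    }


summary = summarize_items(batch)
print(summary)
```

What counts as an "unusually high" rate depends on your documents, so any alerting cutoff applied to `parse_error_rate` would be your own choice rather than a platform default.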

If a batch gets stuck in "processing" due to an unexpected interruption, the platform automatically recovers it on startup. Batches stuck for more than 15 minutes are reverted to "submitted" so the next poll cycle retries them.
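The recovery rule can be illustrated with a short sketch. It is an assumption about how such logic might look, not the platform's implementation: `recover_stuck_batches` and the `processing_started_at` field are hypothetical names, while the 15-minute cutoff and the revert to "submitted" come from the text.

```python
from datetime import datetime, timedelta, timezone

STUCK_AFTER = timedelta(minutes=15)  # cutoff stated in the text


def recover_stuck_batches(batches: list, now: datetime) -> list:
    """Revert batches stuck in "processing" for too long back to "submitted".

    Each batch is a dict with the hypothetical keys "id", "status", and
    "processing_started_at" (a timezone-aware datetime). Returns the ids of
    the batches that were reverted.
    """
    reverted = []
    for batch in batches:
        if batch["status"] != "processing":
            continue  # only the transient processing state is recovered
        if now - batch["processing_started_at"] > STUCK_AFTER:
            batch["status"] = "submitted"  # the next poll cycle retries it
            reverted.append(batch["id"])
    return reverted
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the rule easy to test with fixed timestamps.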

Frequently asked questions

Where can I monitor batch jobs?
Navigate to /sources/batches to see the status of all batch jobs. The page live-syncs with the provider for real-time status updates.

What are the batch statuses?
Three statuses: Accumulating (items collecting), Submitted (sent to provider, polled hourly), and Completed (results received and applied).

How often are batches submitted to the provider?
Batches are submitted on a 15-minute timer or when the item count threshold is reached, whichever comes first. These intervals are configurable in the pipeline settings.

What happens if a batch gets stuck?
The platform includes crash recovery logic. Batches stuck in "processing" for more than 15 minutes are automatically reverted to "submitted" so the next poll cycle retries them. No manual intervention is needed.

How do I check the status of a specific document in a batch?
Use GET /v1/batches/{id} to see the batch detail view, which lists every item with its individual status (pending, completed, or parse_error). You can also check the document directly via GET /v1/documents/{id}. Batch-queued documents show status batch_queued until results are applied, then transition to their final status.