Batch Processing Mode
Set processing_mode=batch on upload (API) or toggle the "Batch" switch in the upload UI. Stage 1 (OCR + classification) runs immediately, so documents appear in your library right away with their type classification and triage metadata. Stage 2 (Claude extraction) is deferred to the provider's batch API for asynchronous processing. While waiting for batch results, documents show a status of batch_queued in your library. The system requires a minimum of 100 items per batch; if fewer documents are uploaded in batch mode, it falls back to real-time processing with a warning.
The two-stage architecture means you get immediate feedback on what was uploaded. Documents are OCR'd, classified by type, and triaged within seconds. Only the AI extraction step, where Claude reads the document and fills structured fields, is deferred to the batch queue for cost savings.
Batch stages
| Stage | Timing | Description |
|---|---|---|
| Stage 1 | immediate | OCR, classification, and triage run in real time. Documents are visible in your library immediately. |
| Stage 2 | deferred | Claude extraction is queued for batch processing. Items accumulate and are then submitted to the batch API on a timer or when a threshold is reached. |
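The "timer or threshold" submission rule for Stage 2 can be pictured with a small accumulator. This is only an illustrative sketch, not the platform's implementation: the class name `BatchAccumulator`, the default threshold, and the default wait time are all hypothetical, and how the timer interacts with the 100-item minimum is not specified here and is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BatchAccumulator:
    """Collects document IDs until a size threshold or a timer says to submit.

    All names and default values are hypothetical, chosen for illustration.
    """
    threshold: int = 100             # assumed flush size
    max_wait_seconds: float = 900.0  # assumed timer length
    items: List[str] = field(default_factory=list)
    first_added_at: Optional[float] = None

    def add(self, doc_id: str, now: float) -> None:
        # The timer starts when the first item enters an empty queue.
        if not self.items:
            self.first_added_at = now
        self.items.append(doc_id)

    def should_flush(self, now: float) -> bool:
        if not self.items:
            return False
        if len(self.items) >= self.threshold:
            return True  # threshold reached
        return now - self.first_added_at >= self.max_wait_seconds  # timer expired

    def flush(self) -> List[str]:
        # Hand back the queued items and reset the queue and the timer.
        batch, self.items = self.items, []
        self.first_added_at = None
        return batch
```

Passing `now` explicitly, rather than reading the clock inside the class, keeps the sketch deterministic and easy to test.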
Once the provider returns results, the platform applies them through the same post-processing pipeline as real-time extraction, including markdown pre-processing, field parsing, quality metrics, and extraction metadata computation. If a batch extraction fails to parse, the affected document is retried through the real-time extraction path rather than as a new batch, so the original 48-hour SLA is maintained.
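The parse-failure fallback described above could look roughly like the following. It is a minimal sketch under stated assumptions: the function name `apply_batch_result`, the idea that a batch result arrives as a JSON object of fields, and the `realtime_extract` callback are all hypothetical stand-ins, not the platform's actual interfaces.

```python
import json
from typing import Callable, Dict, Tuple


def apply_batch_result(
    doc_id: str,
    raw_result: str,
    realtime_extract: Callable[[str], Dict],
) -> Tuple[Dict, str]:
    """Parse one batch result; on failure, retry via the real-time path.

    Returns the extracted fields and the path that produced them.
    Assumes (hypothetically) that a result is a JSON object of fields.
    """
    try:
        fields = json.loads(raw_result)
        if not isinstance(fields, dict):
            raise ValueError("expected a JSON object of fields")
        return fields, "batch"
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError, so both malformed
        # JSON and a non-object payload end up here. The document is retried
        # in real time, never re-queued as a new batch.
        return realtime_extract(doc_id), "realtime_retry"
```

Keeping the retry on the real-time path means a failed document does not wait through a second batch cycle.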
You can also enable batch mode on a per-source basis. When a source connection has the batch processing toggle enabled, all documents ingested through that source are automatically routed to the batch queue. This is ideal for source connections that handle non-urgent, high-volume ingestion, such as a shared drive that collects documents overnight.
- Included in batch: Stage 2 Claude extraction, markdown pre-processing, field parsing, quality metrics computation, extraction metadata, and all post-processing that does not require LLM calls.
- Excluded from batch: LLM-based quality passes (field estimation, verification, cross-reference enrichment) are skipped to preserve cost savings.
- Excluded from batch: Image-only documents (PNG, JPG) are automatically routed to real-time processing because the batch payload is text-only.
- Fallback behavior: Parse failures in batch mode are retried through the real-time extraction path, never as a new batch, to maintain the 48-hour SLA.
- Minimum threshold: Batches require at least 100 items (a provider requirement). Uploads below this threshold fall back to real-time processing with a warning.
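The routing rules in the list above can be summarized as one decision function. This is a sketch only: `route_upload` and its constants are hypothetical names, and it assumes the 100-item minimum is counted after image-only files are set aside, which is not stated here.

```python
from pathlib import PurePath
from typing import Dict, List, Tuple

MIN_BATCH_ITEMS = 100                     # provider minimum, from the text
IMAGE_ONLY_EXTENSIONS = {".png", ".jpg"}  # formats named in the text


def route_upload(
    filenames: List[str], processing_mode: str = "batch"
) -> Tuple[Dict[str, str], List[str]]:
    """Map each filename to 'batch' or 'realtime' and collect warnings."""
    warnings: List[str] = []
    if processing_mode != "batch":
        return {name: "realtime" for name in filenames}, warnings

    # Image-only documents always go to real time: the batch payload is text-only.
    batchable = {
        name for name in filenames
        if PurePath(name).suffix.lower() not in IMAGE_ONLY_EXTENSIONS
    }

    # Assumption: the minimum applies to what remains after images are excluded.
    below_minimum = len(batchable) < MIN_BATCH_ITEMS
    if below_minimum and batchable:
        warnings.append(
            f"{len(batchable)} items is below the batch minimum of "
            f"{MIN_BATCH_ITEMS}; falling back to real-time processing"
        )

    routes = {
        name: "batch" if (name in batchable and not below_minimum) else "realtime"
        for name in filenames
    }
    return routes, warnings
```

For example, an upload of 100 PDFs plus one PNG would send the PDFs to the batch queue and the PNG to real-time processing, while an upload of two files would fall back to real time entirely with a single warning.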
```bash
# Toggle batch mode for all documents from a Google Drive source:
curl -X PATCH https://api.talonic.com/v1/sources/src_gdrive_001 \
  -H "Authorization: Bearer $TALONIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "batch_processing": true }'
# All future documents ingested from this source will use
# batch processing mode automatically.
# Stage 1 (OCR + classify) still runs immediately.
```

The two-stage architecture balances immediate feedback against cost. Stage 1 completes within seconds, so you know the document's type, classification, and triage metadata right away, and the document appears in your library. Only Stage 2, the expensive LLM extraction step, is deferred. You can therefore build workflows that react to document arrival (routing rules, notifications, triage) without waiting for batch results, while still saving 50% on the extraction cost. Documents in the library show a batch_queued status, so you always know which ones are waiting for extraction results.
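To make the 50% figure concrete, here is a small back-of-the-envelope calculation. The function name and the per-document `unit_cost` are hypothetical; only the 50% saving on extraction comes from the text above.

```python
BATCH_DISCOUNT = 0.5  # 50% saving on extraction cost, as stated in the text


def estimate_extraction_cost(n_batch: int, n_realtime: int, unit_cost: float) -> float:
    """Estimate Stage 2 cost for a mix of batch and real-time documents.

    unit_cost is a hypothetical real-time extraction price per document.
    """
    realtime_cost = n_realtime * unit_cost
    batch_cost = n_batch * unit_cost * (1 - BATCH_DISCOUNT)
    return realtime_cost + batch_cost
```

With a hypothetical unit cost of 0.25, 1,000 batched documents plus 10 real-time ones (for example, image-only files) would come to 127.5, compared with 252.5 if everything ran in real time.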