Jobs
The jobs resource handles asynchronous batch extraction across multiple documents. Create a job with a schema_id and an array of document_ids, then poll for completion or fetch results when the job finishes.
// Create a batch extraction job
const job = await talonic.jobs.create({
  schema_id: 'sch_abc123',
  document_ids: ['doc_001', 'doc_002', 'doc_003', 'doc_004', 'doc_005'],
  name: 'Q4 Invoice Batch',
})
console.log(job.id) // 'job_xyz789'
console.log(job.status) // 'queued'
// Poll for completion
let current = await talonic.jobs.get(job.id)
while (current.status === 'queued' || current.status === 'processing') {
  console.log(`Progress: ${current.completed_documents}/${current.total_documents} (${current.current_phase})`)
  await new Promise(r => setTimeout(r, 5000))
  current = await talonic.jobs.get(job.id)
}
console.log(`Job ${current.status}: ${current.completed_documents} completed, ${current.failed_documents} failed`)
The create() method accepts CreateJobParams with schema_id (required), optional document_ids (array of document UUIDs to process), and optional name (human-readable label). When document_ids is omitted, the job processes all unprocessed documents in the workspace. The returned Job object includes id, status, progress, total_documents, completed_documents, failed_documents, current_phase, estimated_completion, and links with URLs for the job and its results.
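As a minimal sketch of how these parameters fit together, the helper below assembles a CreateJobParams-shaped object and leaves document_ids out when no IDs are supplied, so the job falls back to all unprocessed documents in the workspace. The name buildCreateJobParams, its camelCase arguments, and the choice to treat an empty array like an omitted one are assumptions made for illustration; none of them come from the SDK.

```javascript
// Hypothetical helper (not part of the talonic SDK): builds the object
// passed to talonic.jobs.create() from the three documented fields.
function buildCreateJobParams({ schemaId, documentIds, name } = {}) {
  // schema_id is the only required field.
  if (typeof schemaId !== 'string' || schemaId.length === 0) {
    throw new TypeError('schema_id is required')
  }
  const params = { schema_id: schemaId }
  // Assumption: an empty array is treated like an omitted one, so the job
  // covers all unprocessed documents in the workspace.
  if (Array.isArray(documentIds) && documentIds.length > 0) {
    params.document_ids = [...documentIds]
  }
  // name is an optional human-readable label.
  if (typeof name === 'string' && name.length > 0) {
    params.name = name
  }
  return params
}
```

The result can be passed straight through, for example talonic.jobs.create(buildCreateJobParams({ schemaId: 'sch_abc123', name: 'Q4 Invoice Batch' })).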
Jobs run server-side and process documents in parallel. Use get() to poll the job status ('queued', 'processing', 'completed', 'failed', 'cancelled') and getResults() to retrieve the extraction output once complete. For long-running batches, poll on a reasonable interval such as every 5 seconds. The grid_stats block on the job object provides total_cells, filled, empty, and fill_rate for monitoring extraction quality across the batch.
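The polling advice above can be wrapped in a reusable helper that stops on any terminal status and gives up after a deadline. This is a sketch under stated assumptions: waitForJob, filledRatio, and the intervalMs and timeoutMs options are hypothetical names, and the helper only relies on get() and the status values listed above. Because the scale of fill_rate is not stated here, the ratio is derived from the filled and total_cells counts instead.

```javascript
// Statuses after which a job no longer changes, per the list above.
const TERMINAL_STATUSES = new Set(['completed', 'failed', 'cancelled'])

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))

// Hypothetical helper: polls get() until the job reaches a terminal status
// or the timeout would be exceeded. `jobsApi` is any object with a
// get(jobId) method, for example talonic.jobs.
async function waitForJob(jobsApi, jobId, { intervalMs = 5000, timeoutMs = 600000 } = {}) {
  const deadline = Date.now() + timeoutMs
  for (;;) {
    const job = await jobsApi.get(jobId)
    if (TERMINAL_STATUSES.has(job.status)) return job
    // Stop before sleeping past the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Job ${jobId} still ${job.status} after ${timeoutMs} ms`)
    }
    await sleep(intervalMs)
  }
}

// Derives the share of filled cells from the grid_stats counts.
// Returns 0 when grid_stats is missing or reports no cells.
function filledRatio(gridStats) {
  if (!gridStats || !gridStats.total_cells) return 0
  return gridStats.filled / gridStats.total_cells
}
```

A caller might await waitForJob(talonic.jobs, job.id) and then log filledRatio(finished.grid_stats) to flag batches whose extraction quality looks low.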
// Get structured results from a completed job
const results = await talonic.jobs.getResults('job_xyz789')
for (const row of results.data) {
  console.log(`${row.document_filename}: ${JSON.stringify(row.values)}`)
}
// invoice_001.pdf: { vendor_name: 'Acme', total: 1500, ... }
// invoice_002.pdf: { vendor_name: 'Globex', total: 3200, ... }
// Export to CSV or send to your database
const rows = results.data.map(r => ({
  filename: r.document_filename,
  document_id: r.document_id,
  ...r.values,
}))
Use cancel() to abort a job that is still in progress. Cancelled jobs stop processing remaining documents but any extractions already completed are preserved and accessible via getResults(). The cancelled_at timestamp is set on the job object when cancellation takes effect.
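One way to apply the cancellation behaviour described above is to cancel only while the job is still queued or processing, then collect whatever results exist. The sketch below assumes a hypothetical helper name, cancelAndCollect, and uses only get(), cancel(), and getResults(). How cancel() behaves if the job finishes between the two calls, and whether getResults() succeeds for a failed job, is not covered here, so callers may want to handle errors from both.

```javascript
// Hypothetical helper: cancels a job only while it is still in progress,
// then returns the results that were produced. `jobsApi` is any object
// exposing get(), cancel(), and getResults(), for example talonic.jobs.
async function cancelAndCollect(jobsApi, jobId) {
  let job = await jobsApi.get(jobId)
  const inProgress = job.status === 'queued' || job.status === 'processing'
  if (inProgress) {
    // cancel() returns the updated job, including cancelled_at.
    job = await jobsApi.cancel(jobId)
  }
  // Extractions completed before cancellation remain available.
  const results = await jobsApi.getResults(jobId)
  return { job, cancelRequested: inProgress, rows: results.data }
}
```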
// List jobs filtered by status
const runningJobs = await talonic.jobs.list({ status: 'processing', limit: 10 })
for (const j of runningJobs.data) {
  console.log(`${j.id}: ${j.completed_documents}/${j.total_documents} (est. ${j.estimated_completion})`)
}
// Cancel a job — partial results are retained
const cancelled = await talonic.jobs.cancel('job_xyz789')
console.log(cancelled.status) // 'cancelled'
console.log(cancelled.cancelled_at) // '2025-06-15T14:30:00.000Z'
// Retrieve whatever completed before cancellation
const partial = await talonic.jobs.getResults('job_xyz789')
console.log(`Retrieved ${partial.data.length} results before cancellation`)
The list() method accepts ListJobsParams with optional status filter, cursor-based pagination (cursor and limit), and order for sorting. The JobResults response contains a data array where each entry has document_id, document_filename, and values (the extracted field data). This flat structure makes it straightforward to aggregate results across documents for reporting or database insertion.
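To illustrate how the flat entry shape lends itself to reporting, the sketch below turns a JobResults-style data array into CSV text. The helper names csvField and resultsToCsv are hypothetical. The column set is the union of keys found across all values objects, missing fields become empty cells, and quoting follows RFC 4180. It assumes values does not itself contain keys named document_id or document_filename.

```javascript
// Quotes one CSV field per RFC 4180: wrap in double quotes when it
// contains a comma, quote, or line break, and double any embedded quotes.
function csvField(value) {
  if (value === null || value === undefined) return ''
  const text = typeof value === 'object' ? JSON.stringify(value) : String(value)
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text
}

// Flattens entries shaped like { document_id, document_filename, values }
// into CSV text with one row per document.
function resultsToCsv(entries) {
  // Collect field names in first-seen order across all documents.
  const fieldNames = []
  for (const entry of entries) {
    for (const key of Object.keys(entry.values || {})) {
      if (!fieldNames.includes(key)) fieldNames.push(key)
    }
  }
  const header = ['document_id', 'document_filename', ...fieldNames]
  const lines = entries.map((entry) => {
    const values = entry.values || {}
    return [entry.document_id, entry.document_filename, ...fieldNames.map((k) => values[k])]
      .map(csvField)
      .join(',')
  })
  return [header.map(csvField).join(','), ...lines].join('\r\n')
}
```

Calling resultsToCsv(results.data) on the response from getResults() yields a string that can be written to a file or handed to a bulk loader.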
Jobs take a schema_id rather than an inline schema. Create your schema with schemas.create() first, then pass the returned ID to jobs.create().