Quick Start
Extract structured data from a document in five lines. Create a client, call extract() with a file and schema, and read the typed result.
import { Talonic } from '@talonic/node'

const talonic = new Talonic({ apiKey: process.env.TALONIC_API_KEY! })

const result = await talonic.extract({
  file_path: './invoice.pdf',
  schema: {
    vendor_name: 'string',
    invoice_number: 'string',
    total_amount: 'number',
    due_date: 'date',
  },
})

console.log(result.data)
// { vendor_name: 'Acme Corp', invoice_number: 'INV-2024-0847', total_amount: 14250, due_date: '2024-03-15' }

The result object contains the extracted data matching your schema, plus rateLimit and cost metadata. The data fields are typed according to your schema definition, so total_amount comes back as a number and due_date as a date string.
You can also pass file_url for remote files or file with filename for in-memory bytes (Blob, Buffer, or Uint8Array). For documents already uploaded to your workspace, pass document_id to skip the upload step entirely. The SDK accepts exactly one file source per call and validates this at runtime, throwing a TalonicError with code missing_file_source or multiple_file_sources if the constraint is violated.
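The one-file-source rule can be pictured with a small stand-alone check. This is only a sketch of the constraint described above, not the SDK's own validation code: `checkFileSource`, `FileSourceParams`, and `FileSourceCheck` are hypothetical names, and only the four parameter names and the two error codes come from the text.

```typescript
// Hypothetical helper that mirrors the "exactly one file source" rule.
// It is NOT part of @talonic/node; the SDK runs its own validation.
type FileSourceParams = {
  file_path?: string
  file_url?: string
  file?: Uint8Array // Buffer is a Uint8Array subclass; Blob omitted to keep the sketch small
  document_id?: string
}

type FileSourceCheck =
  | { ok: true; source: keyof FileSourceParams }
  | { ok: false; code: 'missing_file_source' | 'multiple_file_sources' }

const FILE_SOURCE_KEYS = ['file_path', 'file_url', 'file', 'document_id'] as const

function checkFileSource(params: FileSourceParams): FileSourceCheck {
  // Collect every file-source key that was actually provided.
  const provided = FILE_SOURCE_KEYS.filter((key) => params[key] !== undefined)
  if (provided.length === 0) return { ok: false, code: 'missing_file_source' }
  if (provided.length > 1) return { ok: false, code: 'multiple_file_sources' }
  return { ok: true, source: provided[0] }
}
```

Under these assumptions, `checkFileSource({ file_url: 'https://example.com/a.pdf', document_id: 'doc_abc123' })` reports `multiple_file_sources`, while an empty object reports `missing_file_source`.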
// Extract from a remote file, no local download needed
const result = await talonic.extract({
  file_url: 'https://example.com/reports/q4-2025.pdf',
  schema: {
    report_title: 'string',
    period: 'string',
    revenue: 'number',
    net_income: 'number',
    highlights: ['string'],
  },
})

console.log(result.data.revenue) // 4250000
console.log(result.confidence?.overall) // 0.94
console.log(result.document.pages) // 12

All extract() calls are async and return a Promise. The SDK handles retries, timeouts, and error mapping automatically, so you only need a single try/catch around your call for error handling. Retryable failures (429, 5xx, network errors, timeouts) are retried up to maxRetries times with exponential backoff, so transient hiccups do not require manual retry logic.
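To make the retry behavior concrete, here is a rough sketch of retrying with exponential backoff. It illustrates the general technique only and is not the SDK's internal code: `withRetries`, `isRetryableStatus`, `backoffDelayMs`, the 500 ms base delay, and the 8 s cap are all assumptions, and treating any error without an HTTP status as a network failure is a simplification.

```typescript
// Assumed values for illustration; the SDK's real base delay and cap are not documented here.
const BASE_DELAY_MS = 500
const MAX_DELAY_MS = 8_000

// Retryable per the text: 429 and 5xx. Errors without an HTTP status are
// treated here as network errors or timeouts (a simplification).
function isRetryableStatus(status: number | undefined): boolean {
  if (status === undefined) return true
  return status === 429 || (status >= 500 && status <= 599)
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS)
}

// Hypothetical wrapper: calls fn once, then retries up to maxRetries times.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      const status = (err as { status?: number } | null)?.status
      // Give up when retries are exhausted or the failure is not retryable.
      if (attempt >= maxRetries || !isRetryableStatus(status)) throw err
      await sleep(backoffDelayMs(attempt))
    }
  }
}
```

With these assumed constants the waits grow as 500 ms, 1 s, 2 s, 4 s, and then stay at 8 s. In normal use none of this is needed, since the SDK retries for you and a single try/catch around extract() is enough.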
import { Talonic, TalonicAuthError, TalonicValidationError, TalonicError } from '@talonic/node'

const talonic = new Talonic({ apiKey: process.env.TALONIC_API_KEY! })

try {
  const result = await talonic.extract({
    file_path: './receipt.png',
    schema: {
      merchant: 'string',
      date: 'date',
      total: 'number',
      items: [{ name: 'string', price: 'number' }],
    },
  })
  console.log(`Extracted ${result.data.items?.length ?? 0} line items from ${result.document.filename}`)
  console.log(`Cost: ${result.cost?.costCredits} credits, balance: ${result.cost?.balanceCredits}`)
} catch (err) {
  // Check the specific error classes first, then fall back to the general TalonicError.
  if (err instanceof TalonicAuthError) {
    console.error('Invalid API key: check TALONIC_API_KEY')
  } else if (err instanceof TalonicValidationError) {
    console.error(`Bad request: ${err.message} (code: ${err.code})`)
  } else if (err instanceof TalonicError) {
    console.error(`Talonic error: ${err.code} (status ${err.status}, request ${err.requestId})`)
  } else {
    // Not a Talonic error: rethrow so it is not silently swallowed.
    throw err
  }
}

The extract() response includes rich metadata beyond the extracted data. The document block contains the filename, page count, file size, detected MIME type, and detected language. The optional confidence block provides an overall confidence score and per-field scores. The processing block reports duration, pages processed, and the region that handled the request. Use these fields to build quality gates and observability into your extraction pipeline.
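A quality gate built on the confidence block might look like the sketch below. It works on a locally declared shape rather than the SDK's real types: only `confidence.overall` appears in the examples on this page, so the `fields` name for per-field scores, the `qualityGate` helper, and both thresholds are assumptions to adapt.

```typescript
// Minimal local shape. Only `confidence.overall` is shown in the examples;
// the `fields` name for per-field scores is an assumption.
type ConfidenceBlock = { overall: number; fields?: Record<string, number> }
type ExtractionLike = { data: Record<string, unknown>; confidence?: ConfidenceBlock }
type GateResult = { pass: boolean; reasons: string[] }

// Hypothetical gate: thresholds are illustrative defaults, not recommended values.
function qualityGate(result: ExtractionLike, minOverall = 0.9, minField = 0.8): GateResult {
  if (!result.confidence) {
    // The confidence block is optional, so its absence is routed to review.
    return { pass: false, reasons: ['no confidence block returned'] }
  }
  const reasons: string[] = []
  if (result.confidence.overall < minOverall) {
    reasons.push(`overall confidence ${result.confidence.overall} < ${minOverall}`)
  }
  // Flag each individual field whose score falls below the per-field threshold.
  for (const [field, score] of Object.entries(result.confidence.fields ?? {})) {
    if (score < minField) reasons.push(`${field} confidence ${score} < ${minField}`)
  }
  return { pass: reasons.length === 0, reasons }
}
```

A result that fails the gate could be sent to manual review, with the `reasons` array logged alongside the request for observability.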
// Use document_id + schema_id to re-extract without re-uploading
const result = await talonic.extract({
  document_id: 'doc_abc123',
  schema_id: 'sch_def456',
  instructions: 'Focus on the indemnification and liability sections',
  include_markdown: true,
})

console.log(result.data) // structured extraction
console.log(result.markdown) // raw OCR markdown (when include_markdown is true)
console.log(result.schema) // { source: 'saved', id: 'sch_def456', definition: { ... } }

These examples use top-level await. If your environment does not support it, wrap the code in an async function.
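Wrapping the calls in an async function could look like this. The sketch uses a stand-in function in place of a real SDK call so that it runs on its own; `extractStandIn` and `main` are hypothetical names, and in real code the body of `main` would hold the `talonic.extract(...)` calls.

```typescript
// Stand-in for any awaited SDK call such as talonic.extract(...).
async function extractStandIn(): Promise<{ data: { total: number } }> {
  return { data: { total: 42 } }
}

// All awaited work lives inside an async function instead of at the top level.
async function main(): Promise<number> {
  const result = await extractStandIn()
  console.log(result.data)
  return result.data.total
}

// Start the program and surface any unhandled failure.
main().catch((err) => {
  console.error(err)
})
```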