Retries & Backoff

The SDK retries automatically on 429 (respecting X-RateLimit-Reset), 500, 502, 503, 504, network errors, and timeouts. Backoff is exponential with jitter, capped at 16s.

The API may set retryable: false on a specific error; the SDK respects that and does not retry.

The backoff schedule starts at 1 second for the first retry and doubles on each subsequent retry (1s, 2s, 4s, ...), up to a maximum of 16 seconds. Random jitter of up to 250ms is added to each wait to prevent thundering-herd problems when many clients retry at the same time. The formula is min(1000 * 2^attempt, 16000) + random(0, 250) milliseconds, where attempt counts retries from 0 (the first retry uses attempt = 0, which gives the 1-second starting delay).
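To make the formula concrete, here is a minimal sketch of the delay calculation. It is an illustration of the documented formula, not the SDK's actual source: the function name backoffDelayMs, the constant names, and the injectable random parameter are all hypothetical, and the zero-based attempt index is an assumption chosen so the output matches the documented 1s, 2s, 4s schedule.

```typescript
// Sketch of the documented backoff formula (not the SDK's real implementation).
// Assumption: `attempt` is the zero-based retry index, so 0 = first retry.
const BASE_DELAY_MS = 1000
const MAX_DELAY_MS = 16_000
const MAX_JITTER_MS = 250

// `random` is injectable so the jitter can be pinned in tests (hypothetical parameter).
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  // Exponential growth, capped at 16 seconds
  const exponential = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS)
  // Uniform jitter in [0, 250) ms
  const jitter = random() * MAX_JITTER_MS
  return exponential + jitter
}

// Base delays with jitter pinned to 0, for the first six retries
const schedule = [0, 1, 2, 3, 4, 5].map(n => backoffDelayMs(n, () => 0))
// schedule is [1000, 2000, 4000, 8000, 16000, 16000]
```

With the default maxRetries of 3, only the first three entries of this schedule are ever used, so the 16-second cap matters only when maxRetries is raised to 5 or more.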

Retry behavior in practice
import { Talonic, TalonicServerError, TalonicRateLimitError } from '@talonic/node'

// Default: 3 retries with exponential backoff
// Initial request: sent immediately
// Retry 1: wait ~1.0-1.25s, then retry
// Retry 2: wait ~2.0-2.25s, then retry
// Retry 3: wait ~4.0-4.25s, then retry
// If all fail, throws the last error

const talonic = new Talonic({
  apiKey: process.env.TALONIC_API_KEY!,
  maxRetries: 3, // default
})

try {
  await talonic.extract({ file_path: './doc.pdf', schema_id: 'sch_abc123' })
} catch (err) {
  if (err instanceof TalonicServerError) {
    // Thrown only after all 3 retries exhausted
    console.error(`Server error after retries: ${err.code} (request ${err.requestId})`)
  }
  if (err instanceof TalonicRateLimitError) {
    // Thrown after retries exhausted; check when the rate limit resets
    console.error(`Rate limited. Resets at: ${err.rateLimit.resetAt}`)
  }
}

For 429 rate-limit responses, the SDK reads the X-RateLimit-Reset header and waits until the reset timestamp (plus a 100ms buffer) instead of using exponential backoff. If the reset is more than 60 seconds away, the SDK falls back to standard backoff rather than blocking for the whole window. Because the wait is based on the server's stated reset time rather than a guess, a rate-limit retry usually succeeds sooner than it would under exponential backoff.
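The 429 rule described above can be sketched as a small pure function. This is a hedged illustration only: rateLimitDelayMs and its parameters are hypothetical names, the reset time is assumed to be already parsed into epoch milliseconds (the header's wire format is not specified here), and clamping a reset time that has already passed to zero is an assumption rather than documented behavior.

```typescript
// Sketch of the 429 wait rule (not the SDK's real implementation).
const RESET_BUFFER_MS = 100      // documented 100ms buffer
const MAX_RESET_WAIT_MS = 60_000 // documented 60-second ceiling

// resetAtMs:  reset timestamp, assumed already parsed to epoch milliseconds
// nowMs:      current time in epoch milliseconds
// fallbackMs: the standard exponential backoff delay for this retry
function rateLimitDelayMs(resetAtMs: number, nowMs: number, fallbackMs: number): number {
  const untilReset = resetAtMs - nowMs
  // Reset is too far away: use standard backoff instead of blocking
  if (untilReset > MAX_RESET_WAIT_MS) return fallbackMs
  // Assumption: a reset time in the past is treated as "wait only the buffer"
  return Math.max(untilReset, 0) + RESET_BUFFER_MS
}

// Reset is 3 seconds away: wait 3000ms plus the 100ms buffer
const shortWait = rateLimitDelayMs(5_000, 2_000, 1_000) // 3100
// Reset is 100 seconds away: fall back to the 2000ms backoff delay
const longWait = rateLimitDelayMs(100_000, 0, 2_000)    // 2000
```

Passing the fallback delay in as an argument keeps the rate-limit rule independent of how the exponential backoff itself is computed.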

Disable retries for development
import { Talonic, TalonicError } from '@talonic/node'

// In development, disable retries for immediate feedback
const talonic = new Talonic({
  apiKey: process.env.TALONIC_API_KEY!,
  maxRetries: 0, // no retries — throw on first failure
})

// Errors are thrown immediately without any wait
try {
  await talonic.documents.list()
} catch (err) {
  // err.retryable tells you if this WOULD have been retried
  if (err instanceof TalonicError && err.retryable) {
    console.log('This error would be retried in production (maxRetries > 0)')
  }
}

Non-retryable errors are never retried regardless of maxRetries. Authentication errors (401, 403), validation errors (400, 409, 413, 422), and not-found errors (404) are thrown immediately on the first attempt. The API can also mark specific errors as non-retryable by setting retryable: false in the error response body, which the SDK respects even for status codes that would normally be retried.
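Taken together, the rules above amount to a short decision procedure. The sketch below is one way to express it under stated assumptions: FailureInfo and shouldRetry are hypothetical names that do not come from the SDK, and the order of the checks is a guess at a reasonable arrangement, not the SDK's confirmed logic.

```typescript
// Sketch of the retry decision (not the SDK's real implementation).
const RETRYABLE_STATUSES: readonly number[] = [429, 500, 502, 503, 504]

// Hypothetical description of a failed request
interface FailureInfo {
  status: number      // HTTP status, or 0 when no response was received
  retryable?: boolean // explicit flag from the API error body, if present
}

// retriesSoFar: number of retries already performed for this request
function shouldRetry(failure: FailureInfo, retriesSoFar: number, maxRetries: number): boolean {
  // Retry budget exhausted (also covers maxRetries: 0)
  if (retriesSoFar >= maxRetries) return false
  // The API's explicit opt-out wins over the status code
  if (failure.retryable === false) return false
  // No HTTP response: network error or timeout
  if (failure.status === 0) return true
  // Otherwise only the documented retryable statuses qualify
  return RETRYABLE_STATUSES.indexOf(failure.status) !== -1
}

const retry503 = shouldRetry({ status: 503 }, 0, 3)                       // true
const optedOut = shouldRetry({ status: 503, retryable: false }, 0, 3)     // false
const authError = shouldRetry({ status: 401 }, 0, 3)                      // false
```

Checking the explicit retryable: false flag before the status code is what lets the API veto a retry even for a 503 or 429.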

Custom retry logic on top of SDK retries
import { Talonic, TalonicRateLimitError } from '@talonic/node'

// For critical pipelines, add your own retry layer on top
async function extractWithFallback(talonic: Talonic, params: Parameters<Talonic['extract']>[0]) {
  try {
    return await talonic.extract(params)
  } catch (err) {
    if (err instanceof TalonicRateLimitError) {
      // SDK retries are exhausted — wait for the rate limit reset and try once more
      const waitMs = err.rateLimit.resetAt.getTime() - Date.now()
      if (waitMs > 0 && waitMs < 120_000) {
        console.log(`Rate limited. Waiting ${Math.ceil(waitMs / 1000)}s for reset...`)
        await new Promise(r => setTimeout(r, waitMs + 100))
        return await talonic.extract(params)
      }
    }
    throw err
  }
}

The transport layer treats TalonicTimeoutError and TalonicNetworkError as retryable by default. A timeout error is raised when a request exceeds the configured timeout, which is specified in milliseconds (default 60s). Network errors cover DNS failures, TCP resets, and unreachable hosts. Both carry status: 0 because no HTTP response was received. After all retries are exhausted, the last error is thrown with retryable: true still set, so your application code can decide whether to queue the request for later processing.
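As an illustration of that last point, the sketch below shows how application code might decide what to do with an error once SDK retries are exhausted. It deliberately avoids importing the SDK and checks the error's shape structurally instead; RetryableLike, isRetryableLike, classifyExhausted, and the Disposition labels are all hypothetical names, and the only properties relied on are the status and retryable fields described above.

```typescript
// Sketch: deciding what to do after SDK retries are exhausted.
// Structural stand-in for the SDK's error objects (assumption).
interface RetryableLike {
  status: number     // 0 when no HTTP response was received
  retryable: boolean // still true on the final thrown error
}

function isRetryableLike(err: unknown): err is RetryableLike {
  if (typeof err !== 'object' || err === null) return false
  const fields = err as Record<string, unknown>
  return typeof fields['status'] === 'number' && typeof fields['retryable'] === 'boolean'
}

type Disposition = 'queue-for-later' | 'fail'

function classifyExhausted(err: unknown): Disposition {
  // A transient failure (timeout, network error, 5xx, 429) may succeed later
  if (isRetryableLike(err) && err.retryable) return 'queue-for-later'
  // Anything else (auth, validation, unknown errors) will not fix itself
  return 'fail'
}

const timeoutLike = classifyExhausted({ status: 0, retryable: true })      // 'queue-for-later'
const authLike = classifyExhausted({ status: 401, retryable: false })      // 'fail'
const unknownError = classifyExhausted(new Error('unexpected'))            // 'fail'
```

In a real pipeline the 'queue-for-later' branch would hand the request to whatever job queue or dead-letter store the application already uses; that part is outside the scope of this sketch.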

Rate-limit retries respect the X-RateLimit-Reset header from the API, so the SDK waits until the server-reported reset time before retrying a 429 instead of guessing.

Frequently asked questions

Does the Talonic SDK retry failed requests?
Yes. It retries on 429, 500, 502, 503, 504, network errors, and timeouts with exponential backoff and jitter, up to maxRetries attempts (default 3). Non-retryable errors like 401, 403, 400, and 404 are thrown immediately.
What is the maximum backoff time between retries?
The exponential backoff caps at 16 seconds. For 429 rate-limit errors, the SDK instead waits until the server-provided reset timestamp (plus 100ms buffer), up to a maximum of 60 seconds. If the reset window exceeds 60 seconds, standard backoff is used instead.
Can the API override the SDK's retry behavior?
Yes. If the API returns retryable: false on an error response, the SDK will not retry that request regardless of the status code or maxRetries setting.
What is the exact backoff formula?
The wait time is min(1000 * 2^attempt, 16000) + random(0, 250) milliseconds, where attempt is 0 for the first retry. That gives ~1s for the first retry, ~2s for the second, and ~4s for the third, capping at ~16s. The random jitter of up to 250ms prevents thundering-herd problems when many clients retry at the same time.
Are timeout and network errors retried?
Yes. Both TalonicTimeoutError and TalonicNetworkError are marked retryable: true and are retried up to maxRetries times. They carry status: 0 because no HTTP response was received. They are only thrown to your code after all retry attempts are exhausted.