Usage Overview
The Usage API provides visibility into AI token consumption and cost estimates across your workspace. Every AI operation — extraction, classification, matching, resolution — is metered and logged with input/output token counts, the model used, and an estimated USD cost. Use these endpoints to monitor spending, optimize pipeline configuration, and audit AI operations.
Two endpoints cover aggregate and document-level views. The aggregate endpoint returns workspace-wide totals with a breakdown by operation type and model over a configurable date range. The per-document endpoint drills into a specific document to show every AI call made during its processing lifecycle, including cache token utilization and per-operation cost estimates.
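As an illustration of how the aggregate breakdown might be consumed, the Python sketch below groups breakdown rows by operation type and reports each type's share of total tokens. The field names follow the sample aggregate response shown later on this page; the helper name token_share_by_operation is hypothetical and not part of the API.

```python
from collections import defaultdict


def token_share_by_operation(usage: dict) -> dict[str, float]:
    """Return each operation type's share of total (input + output) tokens.

    `usage` is a parsed aggregate response containing a `breakdown` list.
    This helper is illustrative only; it is not part of the Usage API.
    """
    tokens = defaultdict(int)
    for row in usage["breakdown"]:
        # One operation type may appear once per model, so accumulate.
        tokens[row["operation_type"]] += row["input_tokens"] + row["output_tokens"]
    total = sum(tokens.values())
    if total == 0:
        return {op: 0.0 for op in tokens}
    return {op: count / total for op, count in tokens.items()}


# Breakdown rows copied from the sample aggregate response.
sample = {
    "breakdown": [
        {"operation_type": "extraction", "model": "claude-sonnet-4-20250514",
         "input_tokens": 7896420, "output_tokens": 1127040, "calls": 1842},
        {"operation_type": "classification", "model": "claude-haiku-3-5",
         "input_tokens": 1043700, "output_tokens": 147540, "calls": 922},
    ]
}
shares = token_share_by_operation(sample)
```

Grouping by operation type rather than by model is a choice made for this sketch; keying on the model field instead would show which model dominates consumption.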
Cost estimates are computed from the model-specific token pricing at the time of the API call. Input tokens, output tokens, and cache read tokens are each priced at their respective rates. Cache read tokens represent prompt cache hits where previously cached input was reused at a significantly lower rate than fresh input tokens — high cache utilization indicates efficient prompt reuse across similar documents.
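The pricing rule described above can be sketched in Python as follows. The rates are placeholder values chosen to keep the arithmetic simple, not actual model pricing, and the names Rates, estimate_cost_usd, and cache_read_ratio are hypothetical. The sketch also assumes that input_tokens counts only fresh input and excludes cache reads, which this page does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical USD prices per one million tokens (placeholders, not real pricing)."""
    input_per_mtok: float
    output_per_mtok: float
    cache_read_per_mtok: float


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cache_read_tokens: int, rates: Rates) -> float:
    """Price each token class at its own rate and sum the results.

    Assumption: `input_tokens` excludes cache-read tokens.
    """
    return (
        input_tokens * rates.input_per_mtok
        + output_tokens * rates.output_per_mtok
        + cache_read_tokens * rates.cache_read_per_mtok
    ) / 1_000_000


def cache_read_ratio(input_tokens: int, cache_read_tokens: int) -> float:
    """Fraction of all input-side tokens that were served from the prompt cache."""
    total = input_tokens + cache_read_tokens
    return cache_read_tokens / total if total else 0.0


# Placeholder rates: cache reads priced at a tenth of fresh input.
example_rates = Rates(input_per_mtok=3.0, output_per_mtok=15.0, cache_read_per_mtok=0.3)
cost = estimate_cost_usd(1_000_000, 100_000, 500_000, example_rates)
ratio = cache_read_ratio(1_000_000, 500_000)
```

A higher cache_read_ratio means more of the prompt was billed at the cheaper cache-read rate, which is the efficiency signal the paragraph above refers to.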
Usage data is available immediately after each AI operation completes. There is no delay or batching — the token counts and cost estimates are recorded synchronously as part of the operation lifecycle. Historical data is retained indefinitely and can be queried over any date range.
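Since any date range can be queried, a small helper can compute the window and build the request URL. The Python sketch below uses the host, path, and the from and to query parameters from the cURL example on this page; the function name usage_url_for_last_days is hypothetical, and whether the to bound is inclusive or exclusive is not specified here.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Host and path taken from the cURL example on this page.
BASE_URL = "https://api.talonic.ai/v1/usage"


def usage_url_for_last_days(days: int, now: datetime) -> str:
    """Build an aggregate-usage URL for the `days` days ending at `now`'s UTC midnight.

    `now` should be a timezone-aware datetime.
    """
    end = now.astimezone(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    # safe=":" keeps the colons in the timestamps unescaped, matching the cURL example.
    query = urlencode({"from": start.strftime(fmt), "to": end.strftime(fmt)}, safe=":")
    return f"{BASE_URL}?{query}"


url = usage_url_for_last_days(7, datetime(2026, 5, 14, 10, 30, tzinfo=timezone.utc))
```

Snapping the window to UTC midnight is a design choice for this sketch so that repeated calls on the same day produce the same range.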
/v1/usage
Aggregate Response
Response fields
Response — Aggregate usage
{
  "period": { "from": "2026-04-14T00:00:00.000Z", "to": "2026-05-14T00:00:00.000Z" },
  "totals": { "input_tokens": 8940120, "output_tokens": 1274580, "calls": 2764 },
  "breakdown": [
    {
      "operation_type": "extraction",
      "model": "claude-sonnet-4-20250514",
      "input_tokens": 7896420,
      "output_tokens": 1127040,
      "calls": 1842
    },
    {
      "operation_type": "classification",
      "model": "claude-haiku-3-5",
      "input_tokens": 1043700,
      "output_tokens": 147540,
      "calls": 922
    }
  ],
  "links": { "self": "/v1/usage" }
}
Per-Document Usage
The per-document endpoint returns every AI operation performed on a specific document along with token counts, cache utilization, and cost estimates. This is useful for understanding why a particular document was expensive to process — for example, a large multi-page PDF that required multiple extraction chunks, or a document that triggered classification retries.
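To show how a per-document response might be used to explain an expensive document, the Python sketch below finds the costliest entry, totals cost per operation type, and counts calls per operation type (a count above one may point to chunking or retries). The field names follow the sample per-document response on this page; the helper names are hypothetical.

```python
from collections import Counter, defaultdict


def most_expensive_entry(doc_usage: dict) -> dict:
    """Return the entry with the highest estimated cost."""
    return max(doc_usage["entries"], key=lambda e: e["cost_estimate_usd"])


def cost_by_operation(doc_usage: dict) -> dict[str, float]:
    """Sum estimated USD cost per operation type."""
    costs = defaultdict(float)
    for entry in doc_usage["entries"]:
        costs[entry["operation_type"]] += entry["cost_estimate_usd"]
    return dict(costs)


def calls_by_operation(doc_usage: dict) -> Counter:
    """Count AI calls per operation type."""
    return Counter(entry["operation_type"] for entry in doc_usage["entries"])


# Entries abbreviated from the sample per-document response.
sample = {
    "entries": [
        {"operation_type": "extraction", "input_tokens": 8640, "output_tokens": 1245,
         "cache_read_tokens": 2048, "cost_estimate_usd": 0.041},
        {"operation_type": "classification", "input_tokens": 3840, "output_tokens": 645,
         "cache_read_tokens": 0, "cost_estimate_usd": 0.017},
    ]
}
top = most_expensive_entry(sample)
costs = cost_by_operation(sample)
calls = calls_by_operation(sample)
```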
/v1/usage/documents/:id
Document Response
Response fields
Response — Per-document usage
{
  "document_id": "d4e5f6a7-b8c9-0123-d456-e7f8a9b0c1d2",
  "totals": {
    "input_tokens": 12480,
    "output_tokens": 1890,
    "cost_estimate_usd": 0.058,
    "calls": 2
  },
  "entries": [
    {
      "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "operation_type": "extraction",
      "model": "claude-sonnet-4-20250514",
      "input_tokens": 8640,
      "output_tokens": 1245,
      "cache_read_tokens": 2048,
      "cost_estimate_usd": 0.041,
      "created_at": "2026-05-13T14:22:10.000Z"
    },
    {
      "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
      "operation_type": "classification",
      "model": "claude-haiku-3-5",
      "input_tokens": 3840,
      "output_tokens": 645,
      "cache_read_tokens": 0,
      "cost_estimate_usd": 0.017,
      "created_at": "2026-05-13T14:21:45.000Z"
    }
  ],
  "links": {
    "self": "/v1/usage/documents/d4e5f6a7-b8c9-0123-d456-e7f8a9b0c1d2",
    "document": "/v1/documents/d4e5f6a7-b8c9-0123-d456-e7f8a9b0c1d2"
  }
}
Example Request
cURL — Aggregate usage for the last 7 days
curl -X GET "https://api.talonic.ai/v1/usage?from=2026-05-07T00:00:00Z&to=2026-05-14T00:00:00Z" \
  -H "Authorization: Bearer tlnc_your_api_key"
Errors
Error responses