How It Works

Architecture
Agent (Claude Desktop / Cursor / Cline / etc.)
  ↓ MCP protocol over stdio
Talonic MCP server (this package)
  ↓ HTTPS, Bearer auth
api.talonic.com

Each tool call maps to a single HTTPS request to the Talonic API, authenticated with your API key and repeated only if a retry is needed. The server handles auth, retries on transient failures (429, 5xx), MIME-type detection for file uploads, multipart serialisation, and structured error formatting.

The MCP server acts as a thin translation layer between the MCP protocol and the Talonic REST API. It receives tool calls from the agent over stdio (local mode) or streamable HTTP (hosted mode), validates parameters, constructs the appropriate API request, and returns the response in the MCP-standard content format the agent expects.

For file uploads, the server handles the complexity of multipart form encoding. It reads file bytes from file_data (base64), file_path (local disk), or file_url (remote URL), detects the MIME type from the filename extension, and streams the upload to api.talonic.com. This means the agent never needs to deal with multipart boundaries or content-type headers.
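
The input handling described above can be illustrated with a minimal Python sketch. It is not the server's actual code: the function names and the `filename` argument are hypothetical, and `file_url` is left out because fetching it needs network access. MIME detection uses Python's standard `mimetypes` module, which guesses from the filename extension.

```python
import base64
import mimetypes
from pathlib import Path


def guess_mime_type(filename: str) -> str:
    """Guess a MIME type from the filename extension, with a generic fallback."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime or "application/octet-stream"


def load_file_bytes(arguments: dict) -> tuple[bytes, str]:
    """Resolve file_data or file_path into (bytes, filename).

    The "filename" key is an assumption made for this sketch only.
    """
    if "file_data" in arguments:
        raw = base64.b64decode(arguments["file_data"])
        return raw, arguments.get("filename", "upload.bin")
    if "file_path" in arguments:
        path = Path(arguments["file_path"])
        return path.read_bytes(), path.name
    raise ValueError("provide file_data, file_path, or file_url")
```

A caller would pass the resolved bytes and the guessed MIME type to whatever multipart encoder its HTTP client provides.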

Error handling is designed for agent consumption. API errors are reformatted into structured messages that include the error type, a human-readable description, and actionable next steps. For example, a missing schema error tells the agent to provide a schema or schema_id, rather than returning a raw 400 status code.
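
As a rough illustration of this kind of reformatting, the sketch below turns an API error into a structured tool result. The error type `missing_schema`, the field names, and the hint wording are all assumptions, since the real payload is not shown here. The `isError` flag is the field the MCP specification defines for failed tool results; whether the Talonic server sets it is also an assumption.

```python
import json

# Hypothetical mapping from error types to next-step hints.
NEXT_STEPS = {
    "missing_schema": "Provide a schema or schema_id argument and retry.",
}
DEFAULT_HINT = "Check the request arguments and try again."


def format_error(status: int, error_type: str, description: str) -> dict:
    """Wrap an API error as an MCP-style tool result carrying actionable guidance."""
    structured = {
        "error_type": error_type,
        "status": status,
        "description": description,
        "next_steps": NEXT_STEPS.get(error_type, DEFAULT_HINT),
    }
    return {
        "isError": True,
        "content": [{"type": "text", "text": json.dumps(structured)}],
    }
```

For example, `format_error(400, "missing_schema", "No schema was supplied.")` yields a result whose text tells the agent which argument to add, rather than only reporting the 400.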

Request lifecycle

What happens during a talonic_extract call
// 1. Agent sends MCP tool call over stdio/HTTP:
{
  "tool": "talonic_extract",
  "arguments": {
    "file_url": "https://example.com/invoice.pdf",
    "schema": { "type": "object", "properties": { "total": { "type": "number" } } }
  }
}

// 2. MCP server validates parameters, constructs API request:
// POST https://api.talonic.com/v1/extract
// Authorization: Bearer tlnc_...
// Content-Type: multipart/form-data (for file uploads)

// 3. API processes: download file → OCR → extract fields → validate schema

// 4. MCP server formats response as MCP content:
{
  "content": [
    {
      "type": "text",
      "text": "{ \"status\": \"complete\", \"data\": { \"total\": 1500.00 }, ... }"
    }
  ]
}
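
Step 4 can be sketched in a few lines of Python. This assumes the API response is plain JSON and is serialised into a single text content item, as in the example above; the function names are hypothetical.

```python
import json


def to_mcp_content(api_response: dict) -> dict:
    """Serialise an API response into one MCP text content item."""
    return {"content": [{"type": "text", "text": json.dumps(api_response)}]}


def from_mcp_content(result: dict) -> dict:
    """Agent-side inverse: parse the JSON text back into a dict."""
    return json.loads(result["content"][0]["text"])
```

Because the payload travels as text, a client that wants typed values has to parse the JSON string itself, as `from_mcp_content` does.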

The MCP server is stateless between tool calls. It does not cache documents, schemas, or extraction results locally. Every tool call is an independent HTTP request to the Talonic API, which means the server can be restarted at any time without losing state. All persistence — documents, schemas, extraction history — lives in the Talonic cloud workspace, accessible via your API key.

Authentication is handled transparently by the MCP server. For the local npx option, the server reads the TALONIC_API_KEY environment variable at startup and attaches it as a Bearer token to every API request. For the hosted option at mcp.talonic.com, the client passes the API key in the Authorization header, and the hosted server forwards it to the Talonic API. In neither case does the API key reach the AI agent — it stays within the MCP server process boundary.
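
The two authentication paths might look roughly like the following Python sketch. Only the `TALONIC_API_KEY` variable and the Bearer scheme come from the description above; the function names and error handling are illustrative assumptions.

```python
import os


def local_auth_headers(env=None) -> dict:
    """Local mode: read the key from the environment once, at startup."""
    env = os.environ if env is None else env
    api_key = env.get("TALONIC_API_KEY")
    if not api_key:
        raise RuntimeError("TALONIC_API_KEY is not set")
    return {"Authorization": f"Bearer {api_key}"}


def hosted_auth_headers(incoming_headers: dict) -> dict:
    """Hosted mode: forward the client's Authorization header unchanged."""
    auth = incoming_headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        raise PermissionError("missing Bearer token")
    return {"Authorization": auth}
```

In both cases the header is attached to the outgoing API request only. It never appears in the tool result, so it does not reach the agent.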

Transport modes differ between local and hosted deployments. The local npx server communicates with the MCP client over stdio (standard input/output), which is the default transport for locally-spawned MCP servers. The hosted server at mcp.talonic.com uses streamable HTTP, where the client sends HTTP requests and receives streamed responses. Both transports implement the same MCP protocol, so tool behaviour is identical regardless of transport mode.

The MCP server automatically retries on 429 (rate limit) and 5xx (server error) responses with exponential backoff. Agents do not need to implement retry logic themselves.
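
A minimal Python sketch of retrying with exponential backoff is shown below. The base delay, cap, and attempt limit are assumptions, as the server's actual values are not documented here. `send` stands in for whatever performs the HTTP request and returns its status code.

```python
import time
from typing import Callable


def is_retryable(status: int) -> bool:
    """Retry only on rate limits (429) and server errors (5xx)."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Delay doubles with each attempt, up to a fixed cap (values are illustrative)."""
    return min(cap, base * (2 ** attempt))


def call_with_retries(
    send: Callable[[], int],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-retryable status or attempts run out."""
    status = send()
    for attempt in range(max_attempts - 1):
        if not is_retryable(status):
            break
        sleep(backoff_delay(attempt))
        status = send()
    return status
```

Production implementations often add random jitter to each delay, and honour a `Retry-After` header on 429 responses, so that many clients do not retry in lockstep. Both are omitted here to keep the sketch deterministic.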

Frequently asked questions

How does the Talonic MCP server work?
The MCP server communicates with AI agents using MCP, over stdio when run locally or streamable HTTP when hosted. Each tool call translates to one HTTPS request to api.talonic.com with automatic auth, retries, and error formatting.
Does the MCP server handle retries automatically?
Yes. The server retries on 429 (rate limit) and 5xx (server error) responses with exponential backoff. Agents do not need to implement retry logic.
How are errors returned to the agent?
API errors are reformatted into structured messages with error type, description, and actionable next steps. The agent receives clear guidance instead of raw HTTP status codes.
Does the MCP server store any data locally?
No. The server is stateless between tool calls. It does not cache documents, schemas, or results locally. All persistence lives in the Talonic cloud workspace. The server can be restarted at any time without losing state.
Is my API key exposed to the AI agent?
No. The API key stays within the MCP server process boundary. For local setups, it is read from the TALONIC_API_KEY environment variable. For hosted setups, it is passed in the Authorization header. In neither case does the key reach the AI agent's context window.