What fields are extracted from bank statements?

Talonic returns bank statements as schema-validated, typed fields. Common fields include Bank Name, Account Holder, Account Number, Statement Period, and more, each normalized (dates to ISO 8601, amounts as numbers) and mapped to a stable key so the output shape stays the same across layouts.

How accurate is extraction from bank statements, and how is confidence reported?

Every extracted cell carries a confidence score from 0.0 to 1.0 and a provenance pointer back to the source page and region, so low-confidence values can be reviewed against the original before the data is trusted downstream. There is no single accuracy number: confidence is reported per field so you can gate on it.
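
As a rough illustration of gating on per-field confidence, the sketch below splits fields into accepted and needs-review groups. The record shape (`key`, `value`, `confidence`), the confidence values, and the 0.85 threshold are all assumptions made for this example, not the documented response format.

```python
from typing import Any

# Hypothetical per-field records; the real response shape may differ,
# and the confidence values are made up for illustration.
fields = [
    {"key": "bank_name", "value": "JPMorgan Chase Bank, N.A.", "confidence": 0.99},
    {"key": "account_number", "value": "****6431", "confidence": 0.97},
    {"key": "closing_balance", "value": 18902.11, "confidence": 0.62},
]

def split_by_confidence(records: list[dict[str, Any]], threshold: float = 0.85):
    """Return (accepted, needs_review) using an assumed confidence threshold."""
    accepted = [r for r in records if r["confidence"] >= threshold]
    needs_review = [r for r in records if r["confidence"] < threshold]
    return accepted, needs_review

accepted, needs_review = split_by_confidence(fields)
```

The right threshold depends on how costly a wrong value is downstream, so it is worth tuning per field rather than globally.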

Can I use bank statement extraction in production?

Yes. The same engine behind this guide is available as a production REST API and Node SDK with sync, async, and streaming modes, schema versioning, signed webhooks, and EU-resident processing. Start free with an API key, then scale on usage-based pricing.

What does it cost to extract data from bank statements?

There is a free tier for prototyping and agent evaluation with no credit card. Paid usage is credit-based at 1,000 credits per euro: page ingestion is 100 credits per page and registry-resolved queries are free. See talonic.com/pricing for current tiers.
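
At the stated rates, one page costs 100 / 1,000 = 0.10 euros to ingest. A small sketch of that arithmetic follows; the function name is hypothetical and the calculation ignores any free-tier allowance or tier discounts.

```python
from decimal import Decimal

CREDITS_PER_EURO = Decimal(1000)  # from the pricing text above
CREDITS_PER_PAGE = Decimal(100)   # page ingestion rate from the pricing text

def ingestion_cost_eur(pages: int) -> Decimal:
    """Euro cost of ingesting `pages` pages at the quoted credit rates."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return (CREDITS_PER_PAGE * pages) / CREDITS_PER_EURO

cost = ingestion_cost_eur(3)  # a 3-page statement
```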

Extract data from bank statements

Bank statements are the lingua franca of financial reconciliation. Every bank ships them differently: multi-page PDFs from Chase, two-column scans from a community credit union, e-statements from JPMorgan with header noise on every page, statements with redacted account numbers, statements that mix USD, EUR, and GBP transactions across one accounting period. The shape changes constantly while the underlying data does not.

Accounts payable teams reconciling vendor payments, lenders running cash-flow analysis on small-business applicants, accountants closing the books on March 31, mortgage underwriters verifying income against W-2 wages, and bookkeepers categorizing ACH credits, wire transfers, and merchant settlements all need the same thing: every line of the statement as a structured row with a date, a description, a signed amount, and a running balance, plus the account metadata captured exactly once at the statement level.

The hard parts are usually invisible until you try to extract at scale. Banks change their layouts without notice. Statement periods cross months, so the opening and closing balances anchor a window that has to tie out. Debit and credit conventions differ. Some statements present withdrawals as negative numbers in a signed column, others as a separate Debits column with the sign implied. Running balances may or may not appear per row. Scanned statements lose alignment between the Date, Description, and Amount columns. Multi-currency accounts mix three or four ISO 4217 currencies on the same page. Page headers and footers, including the bank logo, the statement period, the cycle date, and the disclaimer, repeat on every page and have to be filtered without losing the actual data.

Talonic processes any bank statement against a schema designed for these realities. Every transaction becomes a row with a normalized date in ISO 8601 form, a description, an amount in the canonical sign convention (debits negative, credits positive), and a running balance where present. Account metadata, including bank name, account holder, account number, statement start and end dates, opening balance, and closing balance, is captured once at the statement level, not duplicated per row. Every extracted cell carries a confidence score and a pixel-region reference back to the source PDF so any number can be audited in seconds.

What gets extracted from bank statements

Field | Example | Notes
Bank Name | Wells Fargo | Issuer of the statement
Account Holder | Acme Corporation | Name on the account
Account Number | ****1234 | Often partially masked
Statement Period | 2026-04-01 to 2026-04-30 |
Opening Balance | $12,480.55 |
Closing Balance | $18,902.11 |
Transaction Date | 2026-04-12 | Per row
Transaction Description | ACH CREDIT, STRIPE PAYOUT | Per row
Transaction Amount | +$2,450.00 | Sign-corrected per row
Running Balance | $14,930.55 | Per row, where present

How extraction works for bank statements

Bank statements arrive in dozens of layouts even within a single bank, so templates fail almost immediately at scale. Talonic classifies each statement and runs it through the Bank Statement schema in the Field Registry without per-bank configuration. Page headers and footers are filtered so they do not appear as transaction rows. The sign convention is normalized: withdrawals are negative, deposits are positive, regardless of whether the source statement uses parentheses, separate columns, or color coding. Multi-page statements are stitched into a single transaction stream with the opening and closing balances tying out against the per-row running balance. For scanned or low-resolution statements, every extracted cell is returned with a confidence score and a pixel-region pointer, in conformity with DIN SPEC 91491, so any value below the confidence threshold can be reviewed against the source image in the dashboard.
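
To make the sign convention concrete, here is a minimal sketch of turning printed amount strings into signed numbers. It is an illustration of the idea only, not Talonic's implementation; the function name is hypothetical, and it assumes US-style formatting (comma thousands separators, a dollar sign, parentheses or a minus sign for withdrawals).

```python
from decimal import Decimal

def parse_signed_amount(text: str) -> Decimal:
    """Parse a printed amount; parentheses or a leading/trailing minus mean negative."""
    s = text.strip()
    negative = False
    if s.startswith("(") and s.endswith(")"):
        negative = True
        s = s[1:-1]
    if s.startswith("-") or s.endswith("-"):
        negative = True
        s = s.strip("-")
    # Drop an explicit plus sign, the currency symbol, and thousands separators.
    s = s.lstrip("+").replace("$", "").replace(",", "").strip()
    value = Decimal(s)
    return -value if negative else value
```

For example, `parse_signed_amount("(842.16)")` and `parse_signed_amount("-$842.16")` both yield the same negative value, while `"+$2,450.00"` stays positive. Color-coded amounts cannot be handled this way because the sign is not in the text at all.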

Sample extraction

A 3-page Chase business checking statement (April 2026)

{
  "bank_name": "JPMorgan Chase Bank, N.A.",
  "account_holder": "Acme Corporation",
  "account_number": "****6431",
  "statement_start_date": "2026-04-01",
  "statement_end_date": "2026-04-30",
  "opening_balance": 12480.55,
  "closing_balance": 18902.11,
  "transactions": [
    {
      "transaction_date": "2026-04-03",
      "description": "ACH CREDIT, STRIPE PAYOUT",
      "amount": 2450,
      "running_balance": 14930.55
    },
    {
      "transaction_date": "2026-04-05",
      "description": "CHECK #2174, ABC SUPPLIES",
      "amount": -842.16,
      "running_balance": 14088.39
    },
    {
      "transaction_date": "2026-04-15",
      "description": "WIRE OUT, VENDOR PAYMENT",
      "amount": -5000,
      "running_balance": 9088.39
    }
  ]
}
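
Because dates arrive as ISO 8601 strings and amounts as JSON numbers, output like the sample above can be loaded into typed records with the standard library alone. The sketch below uses an abbreviated copy of the sample; the `Transaction` class and `load_transactions` helper are hypothetical names, and parsing numbers as `Decimal` is a choice made here to avoid float rounding in later arithmetic.

```python
import json
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional

# Abbreviated copy of the sample output above.
SAMPLE = """
{
  "statement_start_date": "2026-04-01",
  "statement_end_date": "2026-04-30",
  "transactions": [
    {"transaction_date": "2026-04-03", "description": "ACH CREDIT, STRIPE PAYOUT",
     "amount": 2450, "running_balance": 14930.55},
    {"transaction_date": "2026-04-05", "description": "CHECK #2174, ABC SUPPLIES",
     "amount": -842.16, "running_balance": 14088.39}
  ]
}
"""

@dataclass(frozen=True)
class Transaction:
    transaction_date: date
    description: str
    amount: Decimal
    running_balance: Optional[Decimal]  # absent when the statement prints none

def load_transactions(payload: str) -> list[Transaction]:
    # Parse every JSON number as Decimal so cents are exact.
    doc = json.loads(payload, parse_float=Decimal, parse_int=Decimal)
    return [
        Transaction(
            transaction_date=date.fromisoformat(t["transaction_date"]),
            description=t["description"],
            amount=t["amount"],
            running_balance=t.get("running_balance"),
        )
        for t in doc["transactions"]
    ]

rows = load_transactions(SAMPLE)
```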

Frequently asked

Does it work on scanned bank statements, or only digital PDFs?

Both. Scanned statements are OCRed and run through the same schema, with confidence scores per cell so low-confidence rows can be reviewed against the source image. Digital PDFs extract at higher confidence because the text layer is already present.

How are debits and credits handled if the statement uses two columns instead of signed amounts?

The output is always sign-normalized: withdrawals are negative, deposits are positive, in a single amount column. Two-column source layouts are merged at extraction time so downstream reconciliation does not have to handle two formats.
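
The merge can be pictured as follows. This is a sketch of the concept rather than the actual extraction logic: the function name is hypothetical, and rejecting rows that have both or neither cell filled is a design choice made for this example.

```python
from decimal import Decimal
from typing import Optional

def merge_debit_credit(debit: Optional[Decimal], credit: Optional[Decimal]) -> Decimal:
    """Collapse separate Debits/Credits cells into one signed amount."""
    if debit is not None and credit is not None:
        raise ValueError("row has both a debit and a credit")
    if debit is not None:
        return -abs(debit)   # withdrawals become negative
    if credit is not None:
        return abs(credit)   # deposits stay positive
    raise ValueError("row has neither a debit nor a credit")
```

Using `abs` guards against sources that already print the debit with a minus sign inside the Debits column.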

What happens with multi-currency accounts?

Each transaction carries its own currency code. The account-level metadata records the statement currency. Mixed-currency statements are extracted as-is with the transaction-level currency preserved; no automatic conversion is performed.
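
Since no conversion is performed, any totals have to be kept per currency downstream. A minimal sketch follows; the `currency` key name and the example rows are assumptions for illustration, as the sample output in this guide does not show a multi-currency statement.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical rows; the "currency" key name is an assumption.
transactions = [
    {"amount": Decimal("2450.00"), "currency": "USD"},
    {"amount": Decimal("-300.00"), "currency": "EUR"},
    {"amount": Decimal("-842.16"), "currency": "USD"},
    {"amount": Decimal("125.50"), "currency": "GBP"},
]

def net_by_currency(rows):
    """Sum signed amounts separately for each ISO 4217 currency code."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for row in rows:
        totals[row["currency"]] += row["amount"]
    return dict(totals)

totals = net_by_currency(transactions)
```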

Can it stitch a multi-page statement into one transaction stream?

Yes. Multi-page statements return a single ordered transaction array. The opening balance from page 1 and the closing balance from the last page are tied out against the per-row running balance; any discrepancy raises a validation flag.
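
A tie-out check of this kind can be sketched as below. This is an illustration under assumptions, not the validation Talonic runs: the function name and message format are hypothetical. The usage ties the three rows from the sample extraction out against their own final running balance rather than the sample's closing balance, since those three rows alone do not sum to it.

```python
from decimal import Decimal

def tie_out(opening: Decimal, closing: Decimal, rows: list[dict]) -> list[str]:
    """Return discrepancy messages; an empty list means the statement ties out."""
    problems = []
    balance = opening
    for i, row in enumerate(rows):
        balance += row["amount"]
        printed = row.get("running_balance")
        if printed is not None and printed != balance:
            problems.append(f"row {i}: expected {balance}, statement shows {printed}")
            balance = printed  # resync so one bad row does not cascade
    if balance != closing:
        problems.append(f"closing: expected {balance}, statement shows {closing}")
    return problems

rows = [
    {"amount": Decimal("2450"), "running_balance": Decimal("14930.55")},
    {"amount": Decimal("-842.16"), "running_balance": Decimal("14088.39")},
    {"amount": Decimal("-5000"), "running_balance": Decimal("9088.39")},
]
problems = tie_out(Decimal("12480.55"), Decimal("9088.39"), rows)
```

Working in `Decimal` rather than float matters here, because exact equality on cents is the whole point of the check.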

Is the output ready to import into Excel or an accounting system?

The structured output exports cleanly to CSV for spreadsheets and to JSON for ERPs, accounting platforms, and lending engines. Account metadata appears once at the top level; transactions are a flat array of rows.
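
For readers who want to do the CSV step themselves, a minimal sketch using Python's standard `csv` module is shown here. The key names come from the sample extraction in this guide; the helper name and column order are choices made for the example.

```python
import csv
import io

# Abbreviated statement using the key names from the sample extraction.
statement = {
    "account_number": "****6431",
    "transactions": [
        {"transaction_date": "2026-04-03", "description": "ACH CREDIT, STRIPE PAYOUT",
         "amount": 2450, "running_balance": 14930.55},
        {"transaction_date": "2026-04-05", "description": "CHECK #2174, ABC SUPPLIES",
         "amount": -842.16, "running_balance": 14088.39},
    ],
}

COLUMNS = ["transaction_date", "description", "amount", "running_balance"]

def transactions_to_csv(doc: dict) -> str:
    """Write the flat transaction array as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(doc["transactions"])
    return buffer.getvalue()

csv_text = transactions_to_csv(statement)
```

Descriptions that contain commas are quoted automatically by the `csv` module, so they survive a round trip through a spreadsheet.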

Author note

Reviewed by Talonic engineering, schema review · last reviewed 2026-05-12

Source: Talonic Bank Statement schema (SCH-64B8AE0A)
