What fields are extracted from purchase orders?

Talonic returns purchase orders as schema-validated, typed fields. Common fields include PO Number, PO Date, Buyer, Supplier, and more, each normalized (dates to ISO 8601, amounts as numbers) and mapped to a stable key so the output shape stays the same across layouts.

How accurate is extraction from purchase orders, and how is confidence reported?

Every extracted cell carries a confidence score from 0.0 to 1.0 and a provenance pointer back to the source page and region, so low-confidence values can be reviewed against the original before the data is trusted downstream. There is no single accuracy number: confidence is reported per field so you can gate on it.
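Gating on per-field confidence can be sketched as follows. This is a minimal illustration, not the documented API: the response shape (a `value` and `confidence` key per field), the 0.9 threshold, and the function name `split_by_confidence` are all assumptions made for the example.

```python
# Hypothetical response shape: each field maps to a value plus a confidence
# score between 0.0 and 1.0. The key names are assumptions, not the real API.
extracted = {
    "po_number": {"value": "PO-2026-01102", "confidence": 0.99},
    "po_date": {"value": "2026-04-05", "confidence": 0.97},
    "payment_terms": {"value": "Net 45, FOB Origin", "confidence": 0.62},
}


def split_by_confidence(fields, threshold=0.9):
    """Partition fields into auto-accepted values and values needing review."""
    accepted = {}
    needs_review = {}
    for name, cell in fields.items():
        target = accepted if cell["confidence"] >= threshold else needs_review
        target[name] = cell["value"]
    return accepted, needs_review


accepted, needs_review = split_by_confidence(extracted)
print(sorted(accepted))      # fields at or above the threshold
print(sorted(needs_review))  # fields to check against the source page
```

The threshold is a business decision: an order desk might accept a lower bar for free-text descriptions than for quantities or prices.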

Can I use purchase order extraction in production?

Yes. The same engine behind this guide is available as a production REST API and Node SDK with sync, async, and streaming modes, schema versioning, signed webhooks, and EU-resident processing. Start free with an API key, then scale on usage-based pricing.

What does it cost to extract data from purchase orders?

There is a free tier for prototyping and agent evaluation with no credit card. Paid usage is credit-based at 1,000 credits per euro: page ingestion is 100 credits per page and registry-resolved queries are free. See talonic.com/pricing for current tiers.
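At the rates quoted above, page ingestion works out to 0.10 euro per page. The short sketch below does that arithmetic; the function name `ingestion_cost` is made up for the example, and the constants should be checked against the current pricing page before relying on them.

```python
# Rates as stated in the text above; check the pricing page for current values.
CREDITS_PER_EURO = 1_000
CREDITS_PER_PAGE = 100


def ingestion_cost(pages):
    """Return (credits, euros) for ingesting the given number of pages."""
    credits = pages * CREDITS_PER_PAGE
    return credits, credits / CREDITS_PER_EURO


credits, euros = ingestion_cost(3)  # a three-page purchase order
print(credits, euros)  # 300 0.3
```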

Extract data from purchase orders

Purchase orders sit at the start of every B2B procurement workflow. The PO is the formal commitment: buyer X requests these items, at these unit prices, from supplier Y, for delivery to this location, by this date, under these terms. Sourcing teams generate POs out of an ERP. Vendors receive them as PDFs in email, in EDI feeds (X12 850 in the US, EDIFACT ORDERS in Europe), or through buyer-side punchout integrations. Order desks at supplier companies then re-key the same data into their own systems to acknowledge the PO and confirm fulfillment. That re-keying is the bottleneck: a typical mid-market supplier processes 200 to 2,000 POs a month, mostly from buyers using different ERP templates, and even small mistakes (wrong SKU, transposed quantity, missed delivery date) cascade into ship-block exceptions and missed cut-offs.

The hard parts live in the table. POs almost always carry line items: SKU or part number, description, ordered quantity, unit of measure (each, case, pallet, kilogram), unit price, line total, requested delivery date per line, and sometimes a ship-to address that differs from the header. Some POs include service lines (consulting hours, milestone deliverables) alongside material lines. Long-running blanket POs have release schedules that look like child line items. Currency is mostly USD for US buyers but can mix EUR or GBP for European multinationals.

Header fields are usually clean: PO number, PO date, buyer name and address, supplier name and address, payment terms, Incoterms, and the buyer's authorized signature. The full structure has to land in the supplier's ERP exactly the way the buyer's ERP rendered it, or the order acknowledgement bounces.

Talonic extracts the full PO structure from any source format. Line items are returned as a structured array with all the fields above, regardless of whether the source PDF uses one table per page or stitches the table across continuation pages. Multi-line addresses are parsed into structured components. Every extracted cell carries a confidence score and a pixel-region reference so the supplier's order desk can verify any field before acknowledging the PO in their ERP.

What gets extracted from purchase orders

PO Number: PO-2026-01102
PO Date: 2026-04-05
Buyer: Globex Logistics LLC
Supplier: Acme Software, Inc.
Ship To: Globex Warehouse 3, 4421 Logistics Pkwy, Memphis, TN 38116
Currency: USD
Line Items: Array of items: SKU, description, quantity, UOM, unit price, line total, delivery date
Payment Terms: Net 45, FOB Origin

How extraction works for purchase orders

POs originate in buyer-side ERPs in dozens of formats: SAP Ariba, Coupa, NetSuite, Microsoft Dynamics, Oracle, and custom systems. Talonic classifies each PO and matches it against the procurement schema in the Field Registry, which maps every header and line-item field regardless of the source ERP layout. Multi-page tables are stitched. Unit-of-measure variations (each, case, pallet, KG, LB) are normalized to a canonical UOM string. Currency follows ISO 4217. Delivery dates that vary per line are preserved per line rather than collapsed to a single header date. The output is structured so it can be routed into a supplier-side order management system without re-keying, and the per-cell confidence with pixel-region provenance keeps the extraction auditable under DIN SPEC 91491 conformity.
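To make "normalized to a canonical UOM string" concrete, here is a toy sketch of alias-based normalization. It is not Talonic's implementation: the alias table, the canonical codes, and the function name `normalize_uom` are assumptions chosen to match the unit names mentioned in this guide.

```python
# Illustrative alias table only; the real mapping is not documented here.
UOM_ALIASES = {
    "each": "EACH", "ea": "EACH",
    "case": "CASE", "cs": "CASE",
    "pallet": "PALLET", "plt": "PALLET",
    "kg": "KG", "kilogram": "KG", "kilograms": "KG",
    "lb": "LB", "lbs": "LB", "pound": "LB", "pounds": "LB",
    "hr": "HR", "hour": "HR", "hours": "HR",
}


def normalize_uom(raw):
    """Map a raw unit-of-measure string to a canonical code, or None if unknown."""
    key = raw.strip().lower().rstrip(".")
    return UOM_ALIASES.get(key)


print(normalize_uom(" Each "))   # EACH
print(normalize_uom("LBS."))     # LB
print(normalize_uom("furlong"))  # None
```

Returning `None` for unknown units, rather than guessing, lets the caller route the line to review.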

Sample extraction

A typical B2B purchase order in USD with two line items

{
  "po_number": "PO-2026-01102",
  "po_date": "2026-04-05",
  "buyer": "Globex Logistics LLC",
  "supplier": "Acme Software, Inc.",
  "ship_to": "Globex Warehouse 3, 4421 Logistics Pkwy, Memphis, TN 38116",
  "currency": "USD",
  "line_items": [
    {
      "sku": "ASW-PRO-AN",
      "description": "Annual subscription, Pro plan",
      "quantity": 5,
      "uom": "EACH",
      "unit_price": 1200,
      "line_total": 6000,
      "delivery_date": "2026-05-15"
    },
    {
      "sku": "ASW-ONBOARD",
      "description": "Onboarding services",
      "quantity": 8,
      "uom": "HR",
      "unit_price": 250,
      "line_total": 2000,
      "delivery_date": "2026-04-30"
    }
  ],
  "totals": {
    "subtotal": 8000,
    "tax": 0,
    "total": 8000
  },
  "payment_terms": "Net 45, FOB Origin"
}
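Because amounts arrive as numbers, a consumer can run arithmetic sanity checks before posting the order. The sketch below checks the sample above for internal consistency. The function name `check_totals` is made up for the example, and the check itself is a suggested consumer-side step, not something the API is stated to perform.

```python
from decimal import Decimal

# Subset of the sample extraction above, limited to the numeric fields.
po = {
    "line_items": [
        {"sku": "ASW-PRO-AN", "quantity": 5, "unit_price": 1200, "line_total": 6000},
        {"sku": "ASW-ONBOARD", "quantity": 8, "unit_price": 250, "line_total": 2000},
    ],
    "totals": {"subtotal": 8000, "tax": 0, "total": 8000},
}


def dec(value):
    """Convert a JSON number to Decimal via str to avoid float artifacts."""
    return Decimal(str(value))


def check_totals(po):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    running = Decimal(0)
    for item in po["line_items"]:
        expected = dec(item["quantity"]) * dec(item["unit_price"])
        if expected != dec(item["line_total"]):
            problems.append(f"{item['sku']}: expected line total {expected}")
        running += dec(item["line_total"])
    totals = po["totals"]
    if running != dec(totals["subtotal"]):
        problems.append(f"subtotal should be {running}")
    if dec(totals["subtotal"]) + dec(totals["tax"]) != dec(totals["total"]):
        problems.append("subtotal plus tax does not equal total")
    return problems


print(check_totals(po))  # []
```

A non-empty result does not necessarily mean the extraction is wrong; the source PO itself may contain the error, which is worth catching before acknowledgement either way.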

Frequently asked

Can Talonic handle POs from any buyer-side ERP?

Yes. SAP Ariba, Coupa, NetSuite, Microsoft Dynamics, Oracle, and custom systems each produce different PO layouts. The schema does not require per-template configuration. Extraction adapts to whatever the source PDF looks like.

What about EDI POs (X12 850, EDIFACT ORDERS)?

Talonic processes PDF renders of EDI POs, which is the common form when EDI is exchanged outside an integrated EDI VAN. Native EDI flat-file ingest is supported through the API but is a separate code path from PDF extraction.

How are blanket POs and call-offs handled?

Blanket POs (long-running agreements with periodic releases) are extracted as a standard PO; each release is treated as a child line item with its own quantity and delivery date. The aggregate annual quantity sits in the header notes if the source includes it.

Are line-level delivery dates preserved?

Yes. Each line item carries its own delivery_date when the source PO specifies one. If only a header-level delivery date is shown, that date is repeated on every line for downstream compatibility.

Can the output be routed directly into our order management system?

The structured JSON maps cleanly into common order management formats. Field name mapping into your specific destination system happens downstream; Talonic provides the structured PO data, your integration layer routes it.
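The downstream mapping step might look like the sketch below. The destination field names (`customer_po_ref`, `item_code`, and so on) are invented for illustration and stand in for whatever your order management system expects; only the source keys come from the sample extraction in this guide.

```python
# Source key -> destination key. Destination names are hypothetical.
HEADER_MAP = {
    "po_number": "customer_po_ref",
    "po_date": "order_date",
    "buyer": "customer_name",
    "currency": "currency_code",
}
LINE_MAP = {
    "sku": "item_code",
    "quantity": "qty",
    "uom": "unit",
    "unit_price": "price",
    "delivery_date": "requested_date",
}


def rename_keys(record, mapping):
    """Copy mapped keys from record under their destination names."""
    return {dest: record[src] for src, dest in mapping.items() if src in record}


def to_oms_order(po):
    """Build a destination-shaped order dict from an extracted PO."""
    order = rename_keys(po, HEADER_MAP)
    order["lines"] = [rename_keys(item, LINE_MAP) for item in po.get("line_items", [])]
    return order


po = {
    "po_number": "PO-2026-01102",
    "po_date": "2026-04-05",
    "buyer": "Globex Logistics LLC",
    "currency": "USD",
    "line_items": [
        {"sku": "ASW-PRO-AN", "quantity": 5, "uom": "EACH",
         "unit_price": 1200, "delivery_date": "2026-05-15"},
    ],
}
order = to_oms_order(po)
print(order["customer_po_ref"], len(order["lines"]))  # PO-2026-01102 1
```

Keeping the mapping in plain dictionaries means a new destination system needs a new table, not new code.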

Author note

Reviewed by Talonic engineering, procurement subject-matter review · last reviewed 2026-05-14

Source: OASIS UBL 2.1 Purchase Order schema
Source: ASC X12 850 Purchase Order transaction set

Related extraction guides

Extract data from invoices
Extract data from goods receipt notes
Extract data from bills of lading