Matching Configurations

A matching configuration defines how extracted document data is compared against a reference dataset. It specifies which extracted fields map to which reference columns, which matching strategy (exact, fuzzy, date_range, or numeric_range) each pair uses, and the relative weight that determines how much each comparison contributes to the overall confidence score. An auto-accept threshold (default 0.85) controls which matches are accepted without review.

Matching has two modes. Reconciliation, which reconciles document values against reference data, is the mode the Matching nav item opens in the app. That nav item is gated behind advanced mode, Talonic staff membership, and dev builds, so it does not appear in a normal workspace. The weighted matching mode, which is the matching-config / run / result workflow documented here, lives in the public API (/v1/matching/*) and on an advanced off-nav surface (/assemble/matching).

Matching strategies

Strategy	Type	Description
exact	strategy	Case-insensitive exact string match. Weight determines contribution to overall score.
fuzzy	strategy	Token-based fuzzy matching with configurable similarity threshold.
date_range	strategy	Matches dates within a configurable tolerance window (e.g. +/- 7 days).
numeric_range	strategy	Matches numbers within a configurable percentage or absolute tolerance.

exact — case-insensitive string comparison. Best for unique identifiers like PO numbers, invoice IDs, and reference codes where values should match verbatim.
fuzzy — token-based similarity with a configurable threshold. Handles misspellings, abbreviations, and word reordering. Ideal for company names, addresses, and descriptions.
date_range — matches dates within a configurable tolerance window (e.g., +/- 7 days). Useful when documents report dates with slight offsets, such as invoice date vs. received date.
numeric_range — matches numbers within a percentage or absolute tolerance. Handles rounding differences in amounts, quantities, and prices across systems.
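
To make the four strategies concrete, here is a minimal Python sketch of what each comparison could look like. It is an illustration only, not Talonic's implementation: the function names, the parameter names, and the use of Jaccard token overlap for the fuzzy score are all assumptions. Jaccard overlap handles word reordering but not misspellings, so a real token-based matcher would likely be more forgiving.

```python
from datetime import date
from typing import Optional


def exact_match(a: str, b: str) -> bool:
    """Case-insensitive exact string comparison."""
    return a.casefold() == b.casefold()


def fuzzy_score(a: str, b: str) -> float:
    """Token similarity in [0, 1]; Jaccard overlap is an assumed stand-in metric."""
    tokens_a = set(a.casefold().split())
    tokens_b = set(b.casefold().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def date_in_range(a: date, b: date, tolerance_days: int = 7) -> bool:
    """True when the two dates are at most tolerance_days apart."""
    return abs((a - b).days) <= tolerance_days


def numeric_in_range(a: float, b: float,
                     pct: Optional[float] = None,
                     absolute: Optional[float] = None) -> bool:
    """True when a is within an absolute or percentage tolerance of b."""
    diff = abs(a - b)
    if absolute is not None:
        return diff <= absolute
    if pct is not None:
        return diff <= abs(b) * pct
    return diff == 0  # no tolerance given: require equality


print(exact_match("PO-1042", "po-1042"))
print(fuzzy_score("Acme Trading GmbH", "GmbH Acme Trading"))
print(date_in_range(date(2024, 3, 1), date(2024, 3, 8)))
print(numeric_in_range(100.0, 100.4, pct=0.005))
```

In the fuzzy case, the score would then be compared against the configured similarity threshold to decide whether the field counts as a match.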

Weights and setup workflow

Each field comparison carries a weight that determines how much it contributes to the overall confidence score. Set high weights on fields that are strong identifiers (like reference numbers or unique IDs) and lower weights on fields that are common or prone to variation (like names or descriptions). Weights are relative: the weighted aggregate produces a final confidence score between 0 and 1, displayed as a percentage in the app.
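
The weighted aggregate can be sketched as a normalized weighted average. The snippet below is an assumption about how relative weights might combine, not the platform's documented formula: it divides by the sum of the weights so the result stays between 0 and 1, and it treats a field with no score as 0. The function name and the per-field scores are hypothetical.

```python
def weighted_confidence(field_scores: dict, weights: dict) -> float:
    """Weighted average of per-field scores, normalized by the total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    # A field without a score contributes 0 (an assumption of this sketch).
    weighted_sum = sum(w * field_scores.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total


weights = {"po_number": 0.35, "vendor_name": 0.4, "total_amount": 0.15, "invoice_date": 0.1}
# Hypothetical per-field results: 1.0 = matched, 0.0 = no match, fuzzy in between.
scores = {"po_number": 1.0, "vendor_name": 0.5, "total_amount": 1.0, "invoice_date": 0.0}

confidence = weighted_confidence(scores, weights)
print(f"{confidence:.0%}")
```

Because the result is normalized, multiplying every weight by the same factor leaves the score unchanged, which is what "relative" means here. With these sample values the score comes out at roughly 0.70, below the default 0.85 auto-accept threshold.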

You can also use AI strategy generation to let the platform suggest a matching strategy automatically. It analyzes the reference data shape and your target scope (a run, a schema, or a document filter), then synthesizes a draft strategy with field rules, blocking keys, and thresholds that you review and adjust before executing. Most teams start with AI strategy generation and fine-tune based on initial results: a common pattern is a high-weight exact match on a unique identifier (like a PO number) combined with lower-weighted fuzzy matches on name and description fields as supporting evidence.

Create a matching configuration

curl -X POST https://api.talonic.com/v1/matching/configs \
  -H "Authorization: Bearer $TALONIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Invoice to PO Matching",
    "reference_data_id": "b8c9d0e1-…",
    "threshold": 0.85,
    "field_mappings": [
      { "extracted_field": "vendor_name", "reference_column": "vendor_name", "match_type": "fuzzy", "weight": 0.4 },
      { "extracted_field": "po_number", "reference_column": "po_number", "match_type": "exact", "weight": 0.35 },
      { "extracted_field": "total_amount", "reference_column": "amount", "match_type": "numeric_range", "weight": 0.15 },
      { "extracted_field": "invoice_date", "reference_column": "po_date", "match_type": "date_range", "weight": 0.1 }
    ]
  }'

# threshold = auto-accept confidence threshold (defaults to 0.85)
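
Before sending a request like the one above, it can help to sanity-check the payload locally. The following Python sketch is a hypothetical client-side check, not part of the Talonic API or any SDK: the function name and the specific rules (threshold within 0 to 1, positive weights, a known match_type) are assumptions drawn from the fields shown in the example, and the server's own validation may differ.

```python
ALLOWED_MATCH_TYPES = {"exact", "fuzzy", "date_range", "numeric_range"}


def validate_config(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not payload.get("name"):
        problems.append("name is missing")
    threshold = payload.get("threshold", 0.85)  # documented default
    if not 0 <= threshold <= 1:
        problems.append(f"threshold {threshold} is outside 0..1")
    mappings = payload.get("field_mappings", [])
    if not mappings:
        problems.append("field_mappings is empty")
    for i, mapping in enumerate(mappings):
        match_type = mapping.get("match_type")
        if match_type not in ALLOWED_MATCH_TYPES:
            problems.append(f"field_mappings[{i}]: unknown match_type {match_type!r}")
        if mapping.get("weight", 0) <= 0:
            problems.append(f"field_mappings[{i}]: weight must be positive")
    return problems


payload = {
    "name": "Invoice to PO Matching",
    "reference_data_id": "b8c9d0e1-…",  # elided in the source; left as-is
    "threshold": 0.85,
    "field_mappings": [
        {"extracted_field": "po_number", "reference_column": "po_number",
         "match_type": "exact", "weight": 0.35},
        {"extracted_field": "vendor_name", "reference_column": "vendor_name",
         "match_type": "fuzzy", "weight": 0.4},
    ],
}

print(validate_config(payload))
```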

Generate an AI matching strategy

curl -X POST https://api.talonic.com/v1/matching/strategies/generate \
  -H "Authorization: Bearer $TALONIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "reference_data_id": "b8c9d0e1-…",
    "target_type": "run",
    "target_value": { "run_id": "d4e5f6a7-…" },
    "user_prompt": "Match invoices to the vendor registry; PO number is authoritative."
  }'

# Response (201): the generated strategy entity, including field_rules,
# blocking_keys, thresholds, and a reasoning_summary. Review it (or PATCH
# /v1/matching/strategies/{id} to adjust), then execute it with
# POST /v1/matching/configs/{id}/smart-run.

Frequently asked questions

What matching strategies are available?

Four strategies: exact (case-insensitive string match), fuzzy (token-based with similarity threshold), date_range (configurable tolerance), and numeric_range (percentage or absolute tolerance).

Can Talonic suggest matching configurations?

Yes. AI strategy generation (POST /v1/matching/strategies/generate) analyzes your reference data and target scope, then synthesizes a draft strategy with field rules, blocking keys, and thresholds. You review the strategy and execute it with a smart run.

How do weights affect matching scores?

Each field comparison carries a weight that determines its contribution to the overall confidence score. Fields with higher weights have more influence on the final score. Weights are relative; the weighted aggregate produces a confidence score between 0 and 1.

What is the difference between fuzzy and exact matching?

Exact matching requires an identical string (case-insensitive). Fuzzy matching uses token-based comparison with a configurable similarity threshold, making it suitable for fields with minor variations like misspellings, abbreviations, or word reordering.

How should I set weights for my matching fields?

Assign high weights (0.3-0.5) to strong identifiers like reference numbers or unique IDs, and lower weights (0.1-0.2) to supporting fields like names, dates, and amounts. A common starting pattern is one high-weight exact match on a unique identifier plus two or three lower-weight fuzzy or range matches on supporting fields.

What does the threshold on a matching configuration do?

The threshold is the auto-accept confidence level, defaulting to 0.85. Results scoring at or above it are accepted automatically; results below it land in the review band for human approval or AI resolution.
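
As a small illustration of that rule, the sketch below routes results by confidence. The function name and the status labels are hypothetical, and it assumes, as described above, that everything below the threshold goes to review.

```python
def triage(confidence: float, threshold: float = 0.85) -> str:
    """Accept at or above the threshold; otherwise send to the review band."""
    return "auto_accepted" if confidence >= threshold else "needs_review"


# Hypothetical confidence scores for three match results.
for score in (0.92, 0.85, 0.70):
    print(score, triage(score))
```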
