
Field Harmonization

Analyze cross-schema field overlap to discover fields that appear in multiple schemas, identify universal fields, and see how each field is distributed across document types.

Field harmonization reveals which canonical fields appear across multiple schemas in your organization. This cross-schema analysis is essential for understanding data consistency. For example, if invoice_number appears in 8 out of 10 schemas, it is likely a universal field that should be standardized across all document types.

The harmonization endpoint returns fields that appear in two or more schemas, sorted by schema count in descending order. Each result includes the list of document types where the field was found and an is_universal flag indicating whether the field appears in the majority of your schemas (typically more than 50%). Universal fields are strong candidates for inclusion in cross-schema reports and data products.
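
As a rough illustration of how the is_universal flag might be used, the sketch below splits a response payload into universal and non-universal fields. The function name split_by_universality and the trimmed sample payload are assumptions made for this example; only the data, canonical_name, schema_count, and is_universal keys come from the documented response.

```python
from typing import Any, Dict, List, Tuple


def split_by_universality(payload: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Return (universal, non_universal) canonical names, preserving response order."""
    universal: List[str] = []
    non_universal: List[str] = []
    for field in payload.get("data", []):
        bucket = universal if field["is_universal"] else non_universal
        bucket.append(field["canonical_name"])
    return universal, non_universal


# Hypothetical, trimmed-down payload shaped like the documented response.
sample = {
    "data": [
        {"canonical_name": "invoice_number", "schema_count": 8, "is_universal": True},
        {"canonical_name": "vendor_name", "schema_count": 6, "is_universal": True},
        {"canonical_name": "payment_terms", "schema_count": 3, "is_universal": False},
    ]
}

universal, non_universal = split_by_universality(sample)
print("Universal:", universal)
print("Non-universal:", non_universal)
```

Because the response is already sorted by schema count, both lists keep the most widely shared fields first.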

Harmonization data is computed from the field registry and updated automatically as new schemas are created and documents are extracted. No manual step is required: the endpoint always reflects the current state of your field landscape.

Use this endpoint to audit field consistency before building cross-schema data products, to identify naming conflicts (same concept with different canonical names in different schemas), and to discover opportunities for schema consolidation.
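
One possible way to look for consolidation candidates is to count how many harmonized fields each pair of document types has in common. The sketch below does this from the document_type_names lists alone. The helper name shared_field_counts and the sample data are assumptions for illustration, and a high count is only a hint that two schemas overlap, not a recommendation produced by the API.

```python
from collections import Counter
from itertools import combinations
from typing import Any, Dict, List


def shared_field_counts(fields: List[Dict[str, Any]]) -> Counter:
    """Count, for every pair of document types, how many fields they share."""
    counts: Counter = Counter()
    for field in fields:
        # Sort so each pair has one canonical ordering, e.g. ("Contract", "Invoice").
        names = sorted(set(field["document_type_names"]))
        for pair in combinations(names, 2):
            counts[pair] += 1
    return counts


# Hypothetical sample shaped like entries of the documented "data" array.
fields = [
    {"canonical_name": "invoice_number",
     "document_type_names": ["Invoice", "Purchase Order", "Receipt"]},
    {"canonical_name": "vendor_name",
     "document_type_names": ["Invoice", "Purchase Order", "Contract"]},
    {"canonical_name": "payment_terms",
     "document_type_names": ["Invoice", "Purchase Order", "Contract"]},
]

for (left, right), shared in shared_field_counts(fields).most_common(3):
    print(f"{left} + {right}: {shared} shared fields")
```

Pairs with many shared fields are worth reviewing by hand, since the count says nothing about fields that are unique to one schema.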

Harmonization only includes fields that appear in two or more schemas. Fields unique to a single schema are excluded from the results.
GET /v1/fields/harmonization

Response

Response fields

Field                        Type      Description
data                         array     Array of harmonized field objects, sorted by schema_count descending.
data[].id                    string    Field UUID.
data[].canonical_name        string    Canonical field name.
data[].display_name          string    Human-readable display name.
data[].data_type             string    Inferred data type.
data[].schema_count          integer   Number of schemas containing this field.
data[].document_type_names   string[]  List of document type names where this field appears.
data[].is_universal          boolean   Whether this field appears in the majority of schemas (typically >50%).
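
The response fields above map naturally onto a small typed record. The following is a minimal sketch, assuming the JSON body has exactly the documented keys. The class name HarmonizedField and the function parse_harmonization are hypothetical and not part of any official client.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HarmonizedField:
    id: str
    canonical_name: str
    display_name: str
    data_type: str
    schema_count: int
    document_type_names: List[str]
    is_universal: bool


def parse_harmonization(body: str) -> List[HarmonizedField]:
    """Parse a JSON response body into typed records, ignoring unknown keys."""
    payload = json.loads(body)
    return [
        HarmonizedField(
            id=item["id"],
            canonical_name=item["canonical_name"],
            display_name=item["display_name"],
            data_type=item["data_type"],
            schema_count=item["schema_count"],
            document_type_names=list(item["document_type_names"]),
            is_universal=item["is_universal"],
        )
        for item in payload["data"]
    ]


# One entry copied from the documented example response.
body = """
{"data": [{
  "id": "c4d5e6f7-a8b9-0123-def0-567890123456",
  "canonical_name": "payment_terms",
  "display_name": "Payment Terms",
  "data_type": "string",
  "schema_count": 3,
  "document_type_names": ["Invoice", "Purchase Order", "Contract"],
  "is_universal": false
}]}
"""

for field in parse_harmonization(body):
    print(field.canonical_name, field.schema_count, field.is_universal)
```

Picking keys explicitly, rather than unpacking the whole object, keeps the parser from failing if the API later adds fields.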

Example response

{
  "data": [
    {
      "id": "f1a2b3c4-d5e6-7890-abcd-ef1234567890",
      "canonical_name": "invoice_number",
      "display_name": "Invoice Number",
      "data_type": "string",
      "schema_count": 8,
      "document_type_names": [
        "Invoice",
        "Credit Note",
        "Purchase Order",
        "Receipt",
        "Delivery Note",
        "Proforma Invoice",
        "Tax Invoice",
        "Commercial Invoice"
      ],
      "is_universal": true
    },
    {
      "id": "b3c4d5e6-f7a8-9012-cdef-456789012345",
      "canonical_name": "vendor_name",
      "display_name": "Vendor Name",
      "data_type": "string",
      "schema_count": 6,
      "document_type_names": [
        "Invoice",
        "Purchase Order",
        "Receipt",
        "Delivery Note",
        "Contract",
        "Statement of Work"
      ],
      "is_universal": true
    },
    {
      "id": "c4d5e6f7-a8b9-0123-def0-567890123456",
      "canonical_name": "payment_terms",
      "display_name": "Payment Terms",
      "data_type": "string",
      "schema_count": 3,
      "document_type_names": [
        "Invoice",
        "Purchase Order",
        "Contract"
      ],
      "is_universal": false
    }
  ]
}

cURL example

curl -H "Authorization: Bearer tlnc_..." \
  "https://api.talonic.com/v1/fields/harmonization"

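The same request can be sketched in Python using only the standard library. The URL and the Bearer authorization header are taken from the cURL example. The function names build_request and fetch_harmonization, and the 30-second timeout, are assumptions chosen for this illustration.

```python
import json
import urllib.request
from typing import Any, Dict

API_URL = "https://api.talonic.com/v1/fields/harmonization"


def build_request(api_key: str) -> urllib.request.Request:
    """Build the GET request with the Bearer token used in the cURL example."""
    return urllib.request.Request(
        API_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_harmonization(api_key: str) -> Dict[str, Any]:
    """Send the request and decode the JSON body (performs a real network call)."""
    with urllib.request.urlopen(build_request(api_key), timeout=30) as response:
        return json.load(response)
```

Calling fetch_harmonization with a valid API key should return a dictionary whose data key holds the harmonized fields. Non-2xx responses surface as urllib.error.HTTPError.
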
Errors

Error responses

Status  Code          Description
401     unauthorized  Missing or invalid API key.
429     rate_limited  Too many requests. Retry after the period indicated in the Retry-After header.
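
A client might handle these two errors along the following lines. This is a sketch under stated assumptions: the Retry-After value is treated as a number of seconds (the header can also carry an HTTP date, which is not handled here), and the transport is passed in as a hypothetical send callable returning (status, headers, body) so the retry logic stays independent of any HTTP library. The names fetch_with_retry and UnauthorizedError are illustrative.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers, response body)
Response = Tuple[int, Mapping[str, str], str]


class UnauthorizedError(Exception):
    """Raised on 401: retrying will not help until the API key is fixed."""


def fetch_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it succeeds, waiting out 429 responses."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status == 200:
            return body
        if status == 401:
            raise UnauthorizedError("Missing or invalid API key")
        if status == 429 and attempt < max_attempts:
            # Assumption: Retry-After is given in seconds; default to 1 if absent.
            sleep(float(headers.get("Retry-After", "1")))
            continue
        raise RuntimeError(f"Request failed with status {status}")
    raise RuntimeError("max_attempts must be at least 1")
```

Injecting sleep makes the backoff easy to test without real delays, and a 401 is raised immediately because waiting cannot fix a bad key.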