ID Dispensers

ID dispensers generate unique identifiers for each row in a data product. Configure rules to build IDs from extracted field values with a prefix, fallback chains when the primary field is empty, and resolution maps that normalize values before ID generation.

ID rule configuration

Parameter | Type | Description
Source field | field | The primary field to derive the ID from. If it is empty on a row, the fallback chain is tried; if no field yields a value, a prefix-less sequential ID is generated.
Fallback chain | field[] | Ordered list of alternative fields tried when the source field is empty on a row.
Resolution map | map | Key-value lookup that normalizes field values before ID generation (e.g., "ACME Corp" → "ACME").

ID rules are saved before IDs are generated. Navigate to a data product detail page and use Apply ID Rules to generate IDs, or Regenerate IDs to refresh them. The generation process evaluates each row against the configured rules: it reads the source field value, applies the resolution map if one exists, prepends the prefix, and writes the resulting ID. If the source field is empty, the dispenser walks the fallback chain in order until it finds a non-empty value. If every field in the chain is also empty, a prefix-less sequential ID is assigned so that no row is left without an identifier.
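The evaluation order described above can be sketched in a few lines of Python. This is an illustration, not the product's implementation: the function name, the hyphen between prefix and value (inferred from the "INV-INV2025042" example), the zero-padded format of the sequential ID, and the choice to apply the resolution map to fallback values as well are all assumptions.

```python
from typing import Dict, List, Optional


def generate_ids(
    rows: List[Dict[str, Optional[str]]],
    source_field: str,
    fallback_chain: List[str],
    resolution_map: Dict[str, str],
    prefix: str = "",
) -> List[str]:
    """Return one ID per row: source field, then fallbacks, then a sequential ID."""
    ids = []
    sequence = 0  # counter for rows where every candidate field is empty
    for row in rows:
        value = None
        # Try the source field first, then each fallback field in order.
        for field in [source_field, *fallback_chain]:
            candidate = (row.get(field) or "").strip()
            if candidate:
                value = candidate
                break
        if value is None:
            # Assumed format: a zero-padded counter with no prefix.
            sequence += 1
            ids.append(f"{sequence:06d}")
            continue
        # Normalize through the resolution map (exact-match lookup assumed).
        value = resolution_map.get(value, value)
        ids.append(f"{prefix}-{value}" if prefix else value)
    return ids


rows = [
    {"invoice_number": "INV2025042", "document_name": "acme-april.pdf"},
    {"invoice_number": "", "document_name": "globex-may.pdf"},
    {"invoice_number": None, "document_name": ""},
]
ids = generate_ids(rows, "invoice_number", ["document_name", "upload_date"], {}, prefix="INV")
print(ids)  # ['INV-INV2025042', 'INV-globex-may.pdf', '000001']
```

The second row falls back to the document name, and the third row has no usable value in any field, so it receives the sequential ID.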

A typical workflow starts by choosing a high-cardinality field as the source — contract numbers, invoice IDs, or purchase order references work well because they are unique per document. Next, configure a fallback chain with one or two alternative fields (e.g., document name, then upload date) so the dispenser always has a value to work with. Finally, add a resolution map if your source data contains variant spellings of the same entity. The map normalizes these variants before they become part of the ID, preventing duplicate IDs for rows that refer to the same real-world record.

  • Source field: the primary field used to derive each row ID
  • Fallback chain: ordered list of alternative fields tried when the source is empty
  • Resolution map: key-value lookup that normalizes values before ID generation
  • Prefix: optional string prepended to every generated ID for namespacing
  • Deterministic: same rules + same data always produces the same IDs
  • Non-destructive: regenerating IDs only updates the ID column; all other values remain unchanged

Resolution maps normalize field values before they become part of the ID. For example, a resolution map can collapse "ACME Corp", "ACME Corporation", and "Acme" into a single canonical value "ACME", so that the same real-world entity does not end up with several different IDs under different names.
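As a rough illustration of that normalization step, the sketch below treats a resolution map as a plain dictionary lookup. The helper name `resolve` and the exact-match, case-sensitive behavior are assumptions. Under that assumption, every variant spelling, including "Acme", needs its own entry.

```python
resolution_map = {
    "ACME Corp": "ACME",
    "ACME Corporation": "ACME",
    "Acme": "ACME",
}


def resolve(value: str, mapping: dict) -> str:
    """Return the canonical value, or the input unchanged when no entry matches."""
    return mapping.get(value, value)


vendors = ["ACME Corp", "ACME Corporation", "Acme", "Globex"]
canonical = [resolve(v, resolution_map) for v in vendors]
print(canonical)               # ['ACME', 'ACME', 'ACME', 'Globex']
print(sorted(set(canonical)))  # ['ACME', 'Globex']
```

Values without an entry, such as "Globex" here, pass through unchanged.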

For best results, choose source fields with high uniqueness — contract numbers or invoice IDs work well, while generic fields like "status" do not. When your documents contain multiple candidate identifiers, configure a fallback chain so the dispenser always has a value to work with. Most teams use the primary reference number as the source field and the document name as the first fallback.
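One way to compare candidate source fields before configuring a rule is to measure how unique each field is across a sample of rows. The sketch below is a hypothetical helper, not part of the product; it assumes the rows are available as a list of dictionaries and scores each field as distinct non-empty values divided by total rows.

```python
from typing import Dict, List, Optional


def uniqueness(rows: List[Dict[str, Optional[str]]], field: str) -> float:
    """Distinct non-empty values of `field` divided by the number of rows."""
    if not rows:
        return 0.0
    values = {(row.get(field) or "").strip() for row in rows}
    values.discard("")  # empty values cannot serve as identifiers
    return len(values) / len(rows)


sample = [
    {"invoice_number": "INV-1", "document_name": "a.pdf", "status": "paid"},
    {"invoice_number": "INV-2", "document_name": "b.pdf", "status": "paid"},
    {"invoice_number": "INV-3", "document_name": "", "status": "open"},
    {"invoice_number": "INV-4", "document_name": "c.pdf", "status": "paid"},
]
fields = ["status", "document_name", "invoice_number"]
ranked = sorted(fields, key=lambda f: uniqueness(sample, f), reverse=True)
print(ranked)  # ['invoice_number', 'document_name', 'status']
```

In this sample the invoice number scores 1.0, the document name 0.75, and the status 0.5. That ordering matches the recommendation to use the reference number as the source field and the document name as the first fallback.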

Configure ID dispenser rules for a data product
curl -X POST "https://api.talonic.com/v1/data-products/dp_001/id-rules" \
  -H "Authorization: Bearer $TALONIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_field": "invoice_number",
    "prefix": "INV",
    "fallback_chain": ["document_name", "upload_date"],
    "resolution_map": {
      "ACME Corp": "ACME",
      "ACME Corporation": "ACME",
      "Acme": "ACME"
    }
  }'

# Then apply:
# POST /v1/data-products/dp_001/generate-ids
# Each row receives an ID like "INV-INV2025042" based on the source field.
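
The same two calls can be prepared from Python with the standard library. This is a minimal sketch under stated assumptions: the `build_request` helper is hypothetical, the generate-ids call is assumed to take no request body, and the response format is not described here, so the requests are only constructed and not sent.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.talonic.com/v1"


def build_request(path: str, payload=None) -> urllib.request.Request:
    """Build an authenticated POST request; the caller decides when to send it."""
    body = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Authorization": f"Bearer {os.environ.get('TALONIC_API_KEY', '')}"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE_URL}{path}", data=body, headers=headers, method="POST"
    )


rules = {
    "source_field": "invoice_number",
    "prefix": "INV",
    "fallback_chain": ["document_name", "upload_date"],
    "resolution_map": {"ACME Corp": "ACME", "ACME Corporation": "ACME", "Acme": "ACME"},
}
save_rules = build_request("/data-products/dp_001/id-rules", rules)
apply_rules = build_request("/data-products/dp_001/generate-ids")  # no body assumed

# To send: urllib.request.urlopen(save_rules), then urllib.request.urlopen(apply_rules)
print(save_rules.get_method(), save_rules.full_url)
```

Saving the rules first and applying them second mirrors the order of the curl example above.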

ID dispensers solve a common challenge in document processing: generating stable, meaningful identifiers for output rows that can be used as primary keys in downstream databases. Unlike random UUIDs, dispenser-generated IDs are derived from your actual data — an invoice number, contract reference, or vendor name — making them human-readable and traceable. The deterministic nature of the generation means the same document always receives the same ID regardless of when or how many times you regenerate, which is critical for maintaining referential integrity with downstream systems that store these IDs as foreign keys.

ID generation is deterministic — running Regenerate IDs with the same rules and data always produces the same output. This makes ID dispensers safe to re-run without breaking downstream references.
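
If downstream systems depend on these guarantees, they can be spot-checked by comparing an export taken before regeneration with one taken after. The sketch below is a hypothetical check, assuming both exports are lists of dictionaries in the same row order and that the ID column is named "id".

```python
from typing import Dict, List


def check_regeneration(
    before: List[Dict[str, str]],
    after: List[Dict[str, str]],
    id_column: str = "id",
) -> List[str]:
    """Compare two exports of the same data product and describe any differences."""
    if len(before) != len(after):
        return [f"row count changed: {len(before)} -> {len(after)}"]
    problems = []
    for index, (old, new) in enumerate(zip(before, after)):
        # Determinism: the ID should be identical across runs.
        if old.get(id_column) != new.get(id_column):
            problems.append(
                f"row {index}: ID changed from {old.get(id_column)!r} to {new.get(id_column)!r}"
            )
        # Non-destructive: every column other than the ID should be untouched.
        old_rest = {k: v for k, v in old.items() if k != id_column}
        new_rest = {k: v for k, v in new.items() if k != id_column}
        if old_rest != new_rest:
            problems.append(f"row {index}: non-ID values changed")
    return problems


first_run = [{"id": "INV-INV2025042", "vendor": "ACME", "total": "120.00"}]
second_run = [{"id": "INV-INV2025042", "vendor": "ACME", "total": "120.00"}]
print(check_regeneration(first_run, second_run))  # []
```

An empty list means that neither the IDs nor the other columns changed between the two runs.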

Frequently asked questions

How do ID dispensers handle missing field values?
When the source field is empty, the dispenser tries each field in the fallback chain in order. If all are empty, it generates a prefix-less sequential ID.
What is a resolution map?
A resolution map is a key-value lookup that normalizes field values before ID generation. For example, it can collapse "ACME Corp" and "ACME Corporation" into "ACME" to prevent duplicate IDs for the same entity.
Can I regenerate IDs without losing data?
Yes. Regenerating IDs only updates the ID column; all other data product values remain unchanged. The operation is deterministic, so the same rules and data always produce the same IDs.
What makes a good source field for ID generation?
Choose fields with high cardinality: values that are unique or nearly unique per document. Invoice numbers, contract references, and purchase order IDs work well. Avoid generic fields like status or document type, which produce collisions. Configure a fallback chain with 1-2 alternative fields so the dispenser always has a value to work with.