Dialects

Dialects define the output format for structured data. They control how values are serialized when delivered or exported — everything from date formatting and number locale to CSV delimiters and character encoding. A dialect can be shared across schemas or defined inline for a specific schema. Configure dialects in the Schema → Delivery tab. Shared dialects ensure consistent formatting across all your exports without duplicating configuration on every schema.

Dialect settings

Parameter	Type	Description
date_format	string	Date output format, e.g. DD-MM-YYYY, YYYY/MM/DD, MM.DD.YYYY.
number_locale	locale	Number formatting locale, e.g. fr-FR (1 234,56), en-US (1,234.56).
delimiter	char	CSV column delimiter. Default comma; use semicolon (;) for European locales.
null_representation	string	How null/empty values are serialized: empty string, "N/A", "null", etc.
boolean_format	array	Two-element array of [true_value, false_value], e.g. ["true", "false"], ["1", "0"], ["yes", "no"].
encoding	string	Output file encoding: UTF-8 (default), UTF-8-BOM, ISO-8859-1, etc.

For example, to configure date formatting for a European accounting system: set date_format to DD.MM.YYYY so dates render as 15.03.2025 instead of the default YYYY/MM/DD. Pair this with number_locale: "de-DE" for comma-decimal formatting (1.234,56) and delimiter: ";" so CSV files open correctly in Excel on European locale machines. Save this configuration as a shared dialect named "EU Accounting" and attach it to every schema that feeds into that system — all future exports and deliveries will use consistent formatting without per-schema configuration.
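The "EU Accounting" settings above can be sketched in a few lines of Python to show what a delivered CSV row might look like. This is an illustration only, not Talonic's implementation: the function names, the `SEPARATORS` table, and the token translation (which handles only DD, MM, and YYYY) are assumptions made for the example.

```python
import csv
import io
from datetime import date

# Hypothetical (thousands, decimal) separators per locale; not Talonic's own table.
SEPARATORS = {"en-US": (",", "."), "de-DE": (".", ","), "fr-FR": (" ", ",")}

def format_date(value: date, date_format: str) -> str:
    """Translate DD/MM/YYYY tokens into strftime directives, then format."""
    pattern = (date_format.replace("YYYY", "%Y")
                          .replace("MM", "%m")
                          .replace("DD", "%d"))
    return value.strftime(pattern)

def format_number(value: float, number_locale: str) -> str:
    """Format with two decimals, then swap in the locale's separators."""
    thousands, decimal = SEPARATORS[number_locale]
    us_style = f"{value:,.2f}"  # e.g. 1,234.56
    # Use a placeholder so the two separators can be swapped safely.
    return (us_style.replace(",", "\x00")
                    .replace(".", decimal)
                    .replace("\x00", thousands))

def render_row(row, dialect) -> str:
    """Serialize one row as a CSV line using the dialect's settings."""
    cells = []
    for value in row:
        if isinstance(value, date):
            cells.append(format_date(value, dialect["date_format"]))
        elif isinstance(value, float):
            cells.append(format_number(value, dialect["number_locale"]))
        else:
            cells.append(str(value))
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=dialect["delimiter"], lineterminator="\n")
    writer.writerow(cells)
    return buffer.getvalue()

eu_accounting = {"date_format": "DD.MM.YYYY", "number_locale": "de-DE", "delimiter": ";"}
line = render_row(["INV-001", date(2025, 3, 15), 1234.56], eu_accounting)
print(line, end="")  # INV-001;15.03.2025;1.234,56
```

Because the delimiter is a semicolon, the comma inside 1.234,56 does not need quoting, which is one reason the semicolon pairs well with comma-decimal locales.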

When working with international data, match the dialect to your downstream system's requirements. For example, set number_locale to fr-FR for comma-decimal numbers (1 234,56), switch the delimiter to a semicolon so that decimal commas do not collide with the column separator, and choose UTF-8-BOM encoding if the files will be opened in Excel.

Dialect settings are applied during Phase 4 of the extraction pipeline and during CSV/XLSX export. The dialect does not affect how values are stored internally — it only controls the serialization format when data leaves the platform. This means you can change a dialect at any time without re-running extractions; the new format applies to all future exports and deliveries.
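To make the separation between stored values and serialized output concrete, here is a minimal Python sketch. It assumes stored values can be modeled as plain Python values and uses a hypothetical `serialize` helper; it shows the idea only and says nothing about how the platform stores data internally.

```python
def serialize(value, dialect):
    """Return the outgoing text for a stored value; the stored value is not modified."""
    if value is None:
        return dialect["null_representation"]
    if isinstance(value, bool):
        true_text, false_text = dialect["boolean_format"]
        return true_text if value else false_text
    return str(value)

# Stored once (assumed representation for this example).
stored = {"paid": True, "approved": False, "notes": None}

# Two dialects applied to the same stored record.
eu = {"null_representation": "", "boolean_format": ["yes", "no"]}
us = {"null_representation": "N/A", "boolean_format": ["1", "0"]}

eu_out = {key: serialize(value, eu) for key, value in stored.items()}
us_out = {key: serialize(value, us) for key, value in stored.items()}

print(eu_out)  # {'paid': 'yes', 'approved': 'no', 'notes': ''}
print(us_out)  # {'paid': '1', 'approved': '0', 'notes': 'N/A'}
```

Switching from one dialect to the other changes only the output dictionaries; `stored` is the same before and after, which is why no re-extraction is needed.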

For best results, create a shared dialect for each downstream system or regional office you deliver to, and name it descriptively (e.g., "SAP Europe" or "US Accounting"). Avoid defining dialects inline on individual schemas unless you have a one-off formatting requirement. Shared dialects reduce maintenance burden and ensure consistency when you add new schemas later.

Create a shared dialect via API

curl -X POST https://api.talonic.com/v1/dialects \
  -H "Authorization: Bearer $TALONIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "EU Accounting",
    "date_format": "DD.MM.YYYY",
    "number_locale": "de-DE",
    "delimiter": ";",
    "null_representation": "",
    "boolean_format": ["yes", "no"],
    "encoding": "UTF-8-BOM"
  }'

# Response (201):
# {
#   "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
#   "name": "EU Accounting",
#   "version": 1,
#   "config": {
#     "date_format": "DD.MM.YYYY",
#     "number_locale": "de-DE",
#     "delimiter": ";",
#     "boolean_format": ["yes", "no"],
#     "encoding": "UTF-8-BOM"
#   },
#   "created_at": "2026-04-25T14:30:00.000Z"
# }

List all dialects in the workspace

curl -s https://api.talonic.com/v1/dialects \
  -H "Authorization: Bearer $TALONIC_API_KEY"

# Response:
# {
#   "data": [
#     { "id": "a1b2c3d4-...", "name": "EU Accounting", "version": 1, "config": { "date_format": "DD.MM.YYYY" } },
#     { "id": "b2c3d4e5-...", "name": "US Standard", "version": 2, "config": { "date_format": "MM/DD/YYYY" } }
#   ]
# }

Dialects can be managed programmatically through the full CRUD API: create with POST /v1/dialects, retrieve with GET /v1/dialects/{id}, update with PUT /v1/dialects/{id}, and remove with DELETE /v1/dialects/{id}. PUT performs a partial update: only the keys present in the request body are changed, and each update increments the stored version. This is useful for teams that manage multiple workspaces and want to synchronize formatting conventions across environments: export a dialect configuration from one workspace and replicate the JSON body in another.
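The update and replication workflow could be scripted along these lines. The sketch below only builds the HTTP requests with Python's standard library and does not send them. The helper names are hypothetical, and `build_replica` assumes that an exported dialect has the shape of the create response shown earlier on this page (a "name" plus a nested "config") and that the create body is flat, as in the curl example.

```python
import json
import urllib.request

BASE_URL = "https://api.talonic.com/v1"

def build_request(method, path, api_key, body=None):
    """Build (but do not send) an authenticated JSON request."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    request = urllib.request.Request(f"{BASE_URL}{path}", data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request

def build_partial_update(dialect_id, changes, api_key):
    """PUT only the keys that should change; other settings are left alone."""
    return build_request("PUT", f"/dialects/{dialect_id}", api_key, changes)

def build_replica(exported, target_api_key):
    """POST that recreates an exported dialect in another workspace.

    Assumption: `exported` looks like the create response (name + config);
    server-assigned fields such as id and version are not copied.
    """
    body = {"name": exported["name"], **exported["config"]}
    return build_request("POST", "/dialects", target_api_key, body)

update = build_partial_update(
    "a1b2c3d4-e5f6-7890-abcd-ef1234567890", {"encoding": "UTF-8"}, "source-workspace-key"
)
exported = {
    "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "name": "EU Accounting",
    "version": 1,
    "config": {"date_format": "DD.MM.YYYY", "delimiter": ";"},
}
replica = build_replica(exported, "target-workspace-key")

print(update.get_method(), update.full_url)
print(replica.get_method(), replica.full_url)
```

Passing either request to `urllib.request.urlopen` would send it; that step is omitted here so the sketch runs without credentials or network access.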

If your CSV files show garbled special characters (accents, umlauts, CJK text), switch the encoding to UTF-8-BOM. The BOM (byte order mark) tells Excel to interpret the file as UTF-8 instead of the system default encoding.
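The effect of the BOM can be checked locally. The following Python sketch encodes the same text with and without a BOM; note that `utf-8-sig` is Python's codec name for UTF-8 with a BOM, and mapping it to the dialect value UTF-8-BOM is an assumption made for this illustration.

```python
import codecs

text = "Müller;Zürich;café\n"

plain = text.encode("utf-8")
with_bom = text.encode("utf-8-sig")  # UTF-8 preceded by the byte order mark

# The only difference is the three-byte BOM (EF BB BF) at the start of the file.
print(with_bom[:3] == codecs.BOM_UTF8)  # True
print(with_bom[3:] == plain)            # True
```

The payload bytes are identical in both cases; the three leading bytes are what lets Excel detect the encoding.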

Frequently asked questions

What are dialects in Talonic?

Dialects define the output format for structured data, controlling date format, number locale, CSV delimiter, null representation, boolean format, and encoding.

Can I share a dialect across multiple schemas?

Yes. A dialect can be shared across schemas or defined inline for a specific schema. Configure dialects in the Schema → Delivery tab.

Do I need to re-run extractions when I change a dialect?

No. Dialects only affect output serialization (exports and deliveries), not how values are stored internally. Changing a dialect takes effect immediately on future exports without re-processing.

How do I apply a shared dialect to a specific schema?

Navigate to the schema editor, open the Delivery tab, and select the shared dialect from the dropdown. The schema's delivery configuration is also readable and writable over the API via GET and POST /v1/schemas/{id}/delivery. The dialect applies to all future exports and deliveries for that schema without re-running extractions.
