Dialects

Dialects define the output format for structured data. They control how values are serialized when delivered or exported — everything from date formatting and number locale to CSV delimiters and character encoding. A dialect can be shared across schemas or defined inline for a specific schema. Configure dialects in the Schema → Delivery tab. Shared dialects ensure consistent formatting across all your exports without duplicating configuration on every schema.

Dialect settings

Parameter            Type    Description
date_format          string  Date output format, e.g. DD-MM-YYYY, YYYY/MM/DD, MM.DD.YYYY.
number_locale        locale  Number formatting locale, e.g. fr-FR (1 234,56), en-US (1,234.56).
delimiter            char    CSV column delimiter. Default comma; use semicolon (;) for European locales.
null_representation  string  How null/empty values are serialized: empty string, "N/A", "null", etc.
boolean_format       string  Boolean output: true/false, 1/0, yes/no, Y/N.
encoding             string  Output file encoding: UTF-8 (default), UTF-8-BOM, ISO-8859-1, etc.

For example, to configure date formatting for a European accounting system: set date_format to DD.MM.YYYY so dates render as 15.03.2025 instead of the default YYYY/MM/DD. Pair this with number_locale: "de-DE" for comma-decimal formatting (1.234,56) and delimiter: ";" so CSV files open correctly in Excel on European locale machines. Save this configuration as a shared dialect named "EU Accounting" and attach it to every schema that feeds into that system — all future exports and deliveries will use consistent formatting without per-schema configuration.

When working with international data, configure the dialect to match the requirements of your downstream system. For example, set number_locale to fr-FR for French number formatting (space-grouped thousands and a comma decimal, as in 1 234,56), switch the delimiter to a semicolon for CSV compatibility, and choose UTF-8-BOM encoding if the data will be opened in Excel.

Dialect settings are applied during Phase 4 of the extraction pipeline and during CSV/XLSX export. The dialect does not affect how values are stored internally — it only controls the serialization format when data leaves the platform. This means you can change a dialect at any time without re-running extractions; the new format applies to all future exports and deliveries.

For best results, create a shared dialect for each downstream system or regional office you deliver to, and name it descriptively (e.g., "SAP Europe" or "US Accounting"). Avoid defining dialects inline on individual schemas unless you have a one-off formatting requirement. Shared dialects reduce maintenance burden and ensure consistency when you add new schemas later.

Create a shared dialect via API
curl -X POST https://api.talonic.com/v1/dialects \
  -H "Authorization: Bearer $TALONIC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "EU Accounting",
    "date_format": "DD.MM.YYYY",
    "number_locale": "de-DE",
    "delimiter": ";",
    "null_representation": "",
    "boolean_format": "yes/no",
    "encoding": "UTF-8-BOM"
  }'

# Response:
# {
#   "id": "dial_eu_001",
#   "name": "EU Accounting",
#   "created_at": "2025-04-18T12:00:00Z"
# }

List all dialects in the workspace
curl -s https://api.talonic.com/v1/dialects \
  -H "Authorization: Bearer $TALONIC_API_KEY"

# Response:
# {
#   "dialects": [
#     { "id": "dial_eu_001", "name": "EU Accounting", "date_format": "DD.MM.YYYY" },
#     { "id": "dial_us_002", "name": "US Standard", "date_format": "MM/DD/YYYY" }
#   ]
# }

Dialects can be managed programmatically through the full CRUD API: create with POST, retrieve with GET, update with PUT, and delete with DELETE on the /v1/dialects endpoints. This is useful for teams that manage multiple workspaces and want to synchronize formatting conventions across environments. You can export a dialect configuration from one workspace and import it into another by replicating the JSON body, ensuring consistent output formatting across your entire organization.
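The retrieve, update, and delete calls can be sketched as small shell wrappers around curl. This is a minimal sketch, not the documented request shapes: the per-dialect path /v1/dialects/{id} and the helper names (dialect_url, get_dialect, update_dialect, delete_dialect) are assumptions for illustration, and response bodies are not shown because they are not specified here.

```shell
# Hypothetical helpers. The /v1/dialects/{id} path shape is an assumption;
# only the /v1/dialects collection endpoint appears in the examples above.

# Print the collection URL, or the single-dialect URL when an ID is given.
dialect_url() {
  if [ -n "${1:-}" ]; then
    printf '%s/%s\n' "https://api.talonic.com/v1/dialects" "$1"
  else
    printf '%s\n' "https://api.talonic.com/v1/dialects"
  fi
}

# Retrieve one dialect. $1 = dialect ID
get_dialect() {
  curl -s "$(dialect_url "$1")" \
    -H "Authorization: Bearer $TALONIC_API_KEY"
}

# Update a dialect. $1 = dialect ID, $2 = JSON body with the dialect settings
update_dialect() {
  curl -s -X PUT "$(dialect_url "$1")" \
    -H "Authorization: Bearer $TALONIC_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$2"
}

# Delete a dialect. $1 = dialect ID
delete_dialect() {
  curl -s -X DELETE "$(dialect_url "$1")" \
    -H "Authorization: Bearer $TALONIC_API_KEY"
}
```

With these defined, `get_dialect dial_eu_001 > eu_accounting.json` would save a dialect's JSON for reuse in another workspace, assuming the GET response contains the full configuration.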

If your CSV files show garbled special characters (accents, umlauts, CJK text) when opened in Excel, switch the encoding to UTF-8-BOM. The BOM (byte order mark) tells Excel to interpret the file as UTF-8 instead of the system default encoding.
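To confirm that an exported file actually starts with a BOM, you can inspect its first three bytes locally. The sketch below uses only standard Unix tools; the function name has_utf8_bom and the sample file contents are illustrative, not part of the platform.

```shell
# Succeeds (exit status 0) if the file starts with the UTF-8 BOM bytes EF BB BF.
has_utf8_bom() {
  [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Demo with a throwaway file standing in for an export.
sample="$(mktemp)"
printf '\357\273\277name;amount\n' > "$sample"   # octal escapes for EF BB BF
if has_utf8_bom "$sample"; then
  echo "BOM present"
else
  echo "no BOM"
fi
rm -f "$sample"
```

If the check reports no BOM on a file that Excel garbles, the dialect attached to that schema is likely still set to plain UTF-8.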

Frequently asked questions

What are dialects in Talonic?
Dialects define the output format for structured data, controlling date format, number locale, CSV delimiter, null representation, boolean format, and encoding.
Can I share a dialect across multiple schemas?
Yes. A dialect can be shared across schemas or defined inline for a specific schema. Configure them in the Schema → Delivery tab.
Do I need to re-run extractions when I change a dialect?
No. Dialects only affect output serialization (exports and deliveries), not how values are stored internally. Changing a dialect takes effect immediately on future exports without re-processing.
How do I apply a shared dialect to a specific schema?
Navigate to the schema editor, open the Delivery tab, and select the shared dialect from the dropdown. Alternatively, use the PATCH /v1/schemas/{id} endpoint with a dialect_id field to link the dialect programmatically. The dialect applies to all future exports and deliveries for that schema without re-running extractions.
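The programmatic route might look like the following sketch. The endpoint and the dialect_id field come from the answer above; the helper names (dialect_link_body, link_dialect) and the schema ID sch_123 are hypothetical, and the response body is omitted because it is not described here.

```shell
# Sketch only: helper names and the sch_123 schema ID are hypothetical.

# Build the JSON body that links a dialect to a schema. $1 = dialect ID
dialect_link_body() {
  printf '{"dialect_id": "%s"}' "$1"
}

# Attach a shared dialect to a schema. $1 = schema ID, $2 = dialect ID
link_dialect() {
  curl -s -X PATCH "https://api.talonic.com/v1/schemas/$1" \
    -H "Authorization: Bearer $TALONIC_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(dialect_link_body "$2")"
}
```

For example, `link_dialect sch_123 dial_eu_001` would attach the "EU Accounting" dialect created earlier to the schema with that ID.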