Entity Graph
Get the tenant's entity relationship graph: the distinct extracted values, the documents they occur in, and the field names they attach to. Force a deterministic rebuild on demand.
The entity graph is a deterministic view of the distinct values extracted across your documents and the structure that connects them. An entity is a distinct extracted value (trimmed, lower-cased, whitespace-collapsed). Each entity links to the documents it occurs in and the field names it was extracted from. No LLM and no named-entity recognition are involved: entities are extracted values, typed coarsely from the occurrence data type.
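The normalization described above can be sketched as a small function. This is an illustration only: the function name is hypothetical, and the order of operations and the handling of Unicode whitespace or case folding are assumptions, not confirmed service behavior.

```python
import re


def normalize_entity_id(value: str) -> str:
    """Trim, collapse whitespace runs to a single space, then lower-case.

    Assumption: the service may order these steps differently or use a
    different notion of whitespace or case folding.
    """
    return re.sub(r"\s+", " ", value.strip()).lower()


print(normalize_entity_id("  Acme   Corp "))  # prints "acme corp"
```

Under this sketch, "Acme Corp" and "ACME  corp" collapse to the same entity id, which is why a single entity can carry several field names.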
This is a different lens than link keys and cases. Where link keys are the curated fields used to discover case membership, the entity graph is the raw value-to-document fabric underneath. It powers value-centric exploration: pick a vendor name or a reference number and see every document it appears in, regardless of whether those documents form a case.
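As a rough illustration of value-centric exploration, the helper below takes a graph in the documented response shape and lists the documents one entity occurs in. The function name is hypothetical, and the sample graph is a trimmed copy of the sample response on this page.

```python
def documents_for_entity(graph: dict, entity_id: str) -> list[dict]:
    """Return the document records linked to entity_id by occurs_in edges."""
    doc_ids = {
        edge["target"]
        for edge in graph["edges"]
        if edge["kind"] == "occurs_in" and edge["source"] == entity_id
    }
    return [doc for doc in graph["documents"] if doc["id"] in doc_ids]


# Trimmed copy of the sample response, kept small for illustration.
sample_graph = {
    "documents": [{"id": "doc_uuid_1", "filename": "invoice_oct.pdf"}],
    "edges": [
        {"source": "acme corp", "target": "doc_uuid_1", "kind": "occurs_in"},
        {"source": "acme corp", "target": "field:vendor_name", "kind": "attached_to"},
    ],
}

print([d["filename"] for d in documents_for_entity(sample_graph, "acme corp")])
# prints ['invoice_oct.pdf']
```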
The graph has two edge kinds, mirroring its tripartite shape. An occurs_in edge connects an entity to a document it appears in. An attached_to edge connects an entity to a field:<name> node representing the field it was extracted from. Entity-to-entity co-occurrence is derived by the consumer, not stored, which keeps the persisted graph compact.
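Since co-occurrence is left to the consumer, one possible way to derive it is to group occurs_in edges by document and count entity pairs that share a document. This is a minimal sketch, not the service's own method; the function name and the sample entity and document ids are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_occurrence(edges: list[dict]) -> dict[tuple[str, str], int]:
    """Count, for each entity pair, the documents in which both occur."""
    entities_by_doc: dict[str, set[str]] = defaultdict(set)
    for edge in edges:
        if edge["kind"] == "occurs_in":  # attached_to edges are ignored
            entities_by_doc[edge["target"]].add(edge["source"])

    counts: dict[tuple[str, str], int] = defaultdict(int)
    for entities in entities_by_doc.values():
        # Sort so each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(entities), 2):
            counts[(a, b)] += 1
    return dict(counts)


edges = [
    {"source": "acme corp", "target": "doc_a", "kind": "occurs_in"},
    {"source": "inv-1042", "target": "doc_a", "kind": "occurs_in"},
    {"source": "acme corp", "target": "doc_b", "kind": "occurs_in"},
    {"source": "acme corp", "target": "field:vendor_name", "kind": "attached_to"},
]
print(co_occurrence(edges))  # prints {('acme corp', 'inv-1042'): 1}
```

Pair counting is quadratic in the number of entities per document, so a consumer working with large documents may prefer to compute it only for the entities currently in view.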
The graph is cached per workspace and rebuilt lazily. When new documents resolve, the snapshot is flagged stale and the next GET /v1/linking/entity-graph rebuilds it before serving. This avoids an expensive rebuild during an ingestion burst while keeping reads fresh. To force an immediate rebuild that bypasses the stale flag, call POST /v1/linking/entity-graph/recompute.
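The stale-flag behavior can be pictured with a small in-memory model. This is a conceptual sketch only: the class and method names are hypothetical and say nothing about how the service actually stores or rebuilds its snapshot.

```python
from typing import Callable, Optional


class LazySnapshot:
    """Cache a built snapshot; rebuild on read only when flagged stale."""

    def __init__(self, build: Callable[[], dict]) -> None:
        self._build = build
        self._snapshot: Optional[dict] = None
        self._stale = True  # nothing built yet

    def mark_stale(self) -> None:
        """Called when documents resolve; cheap, does not rebuild."""
        self._stale = True

    def get(self) -> dict:
        """Read path: rebuild first only if the snapshot is stale."""
        if self._stale or self._snapshot is None:
            self._snapshot = self._build()
            self._stale = False
        return self._snapshot

    def recompute(self) -> dict:
        """Forced path: always rebuild, regardless of the stale flag."""
        self._snapshot = self._build()
        self._stale = False
        return self._snapshot


calls = []


def build() -> dict:
    calls.append(1)
    return {"build": len(calls)}


cache = LazySnapshot(build)
cache.get()
cache.get()          # served from cache, no rebuild
for _ in range(3):   # an ingestion burst only flips the flag
    cache.mark_stale()
cache.get()          # one rebuild covers the whole burst
cache.recompute()    # rebuilds even though the snapshot is fresh
print(len(calls))    # prints 3
```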
GET /v1/linking/entity-graph
Response fields
type is a coarse mapping from the occurrence data type (e.g. org becomes ORG); unmapped values are typed OTHER.
Response
{
  "entities": [
    {
      "id": "acme corp",
      "value": "Acme Corp",
      "type": "ORG",
      "doc_count": 3,
      "occurrences": 4,
      "field_names": ["vendor_name", "supplier"]
    }
  ],
  "documents": [
    {
      "id": "doc_uuid_1",
      "filename": "invoice_oct.pdf",
      "doc_type": "Invoice",
      "date": "2024-10-01",
      "mtime": 1727740800000
    }
  ],
  "field_names": [
    { "id": "field:vendor_name", "name": "vendor_name", "value_count": 3 }
  ],
  "edges": [
    { "source": "acme corp", "target": "doc_uuid_1", "kind": "occurs_in" },
    { "source": "acme corp", "target": "field:vendor_name", "kind": "attached_to" }
  ],
  "stats": {
    "n_entities": 1,
    "n_documents": 1,
    "n_edges": 2,
    "built_ms": 84,
    "source": "field_occurrences",
    "ner": false
  }
}
Recompute Entity Graph
Force a full rebuild of the entity graph, bypassing the stale flag. Unlike the read endpoint, which rebuilds only when the snapshot is dirty, recompute always recomputes from the current field_occurrences and persists the new snapshot. The response returns only the stats block, so use it to confirm graph size after a large ingestion, then read the full graph from the GET endpoint.
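The recompute-then-read flow might look like the following with the Python standard library. The base URL and the bearer-token Authorization header are assumptions for illustration, since this page does not describe the host or authentication scheme; only the two paths, their methods, and the ok and stats fields come from this page.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host


def build_request(method: str, path: str, token: str) -> urllib.request.Request:
    """Build a request; the bearer-token header is an assumed auth scheme."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


def call(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def recompute_then_read(token: str) -> dict:
    """Force a rebuild, check the returned stats, then fetch the full graph."""
    result = call(build_request("POST", "/v1/linking/entity-graph/recompute", token))
    if not result.get("ok"):
        raise RuntimeError(f"recompute did not report ok: {result!r}")
    stats = result["stats"]
    print(f"rebuilt {stats['n_entities']} entities in {stats['built_ms']} ms")
    # The recompute response carries only stats; the graph comes from GET.
    return call(build_request("GET", "/v1/linking/entity-graph", token))
```

Keeping request construction separate from sending makes it easy to inspect the method and URL without touching the network.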
POST /v1/linking/entity-graph/recompute
Response fields
Response
{
  "ok": true,
  "stats": {
    "n_entities": 412,
    "n_documents": 87,
    "n_edges": 1903,
    "built_ms": 612,
    "source": "field_occurrences",
    "ner": false
  }
}
Errors
Error responses