Ground Truth

Ground truth consists of manually created reference datasets with known-correct values. Create them from Validation → Golden Samples. Benchmark runs compare extraction results against the golden samples and score accuracy per field, with AI judge verdicts.
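
As a rough illustration of what per-field accuracy scoring means, the sketch below compares extracted records against golden samples field by field. It is not the product's implementation: the function names, the sample field names (`invoice_number`, `total`), and the normalized exact-match rule are all assumptions made for this example, and the exact-match check only stands in for the AI judge, whose actual behavior is not described here.

```python
from typing import Any


def normalize(value: Any) -> str:
    """Normalize a value for comparison: stringify, trim, lowercase."""
    return "" if value is None else str(value).strip().lower()


def per_field_accuracy(
    golden: list[dict[str, Any]],
    extracted: list[dict[str, Any]],
) -> dict[str, float]:
    """Return, for each golden field, the fraction of samples that match.

    Hypothetical helper: exact match after normalization is an assumed
    stand-in for whatever verdict the real judge produces.
    """
    if len(golden) != len(extracted):
        raise ValueError("golden and extracted must have the same length")

    totals: dict[str, int] = {}
    hits: dict[str, int] = {}
    for gold_row, ext_row in zip(golden, extracted):
        for field, expected in gold_row.items():
            totals[field] = totals.get(field, 0) + 1
            # A missing extracted field counts as a mismatch
            # unless the golden value is itself empty.
            if normalize(ext_row.get(field)) == normalize(expected):
                hits[field] = hits.get(field, 0) + 1

    return {field: hits.get(field, 0) / totals[field] for field in totals}


# Illustrative data only; these records are invented for the example.
golden_samples = [
    {"invoice_number": "INV-001", "total": "120.50"},
    {"invoice_number": "INV-002", "total": "89.00"},
]
extraction_results = [
    {"invoice_number": "inv-001", "total": "120.50"},
    {"invoice_number": "INV-002", "total": "98.00"},
]

scores = per_field_accuracy(golden_samples, extraction_results)
print(scores)  # {'invoice_number': 1.0, 'total': 0.5}
```

Scoring each field separately, rather than each record as a whole, shows which fields an extraction gets wrong most often. In the example, `total` is the weak field while `invoice_number` matches in every sample.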