Extract data from UB-04 claim forms
When a hospital bills an inpatient stay or an outpatient facility encounter, it uses the UB-04, the institutional claim form that covers what a professional CMS-1500 cannot represent. Facility billing offices and the payers reviewing those claims work from this form, which is maintained by the National Uniform Billing Committee (NUBC) and organized around numbered form locators (FL) rather than the CMS-1500 boxes. The fields that drive an institutional claim are the provider and patient details in the early locators, the type-of-bill code in FL 4, the revenue codes and charges in FL 42 through 47, the ICD-10 diagnosis and procedure codes in FL 66 through 74, and the payer and insured information in FL 50 through 58.
The complexity lies in the revenue-code line grid and the institutional coding. Each service line starts with a revenue code in FL 42 (such as 0450 for emergency room or 0636 for drugs requiring detailed coding) and carries a description, a HCPCS code where applicable, units, and a charge, and the lines sum to a total reported under revenue code 0001. The type-of-bill code in FL 4 is a three-digit string in which each digit says something specific about the facility type and the bill sequence. Condition, occurrence, and value codes in FL 18 through 41 modify the claim in ways a payer adjudicates against. A scanned UB-04 is dense, and the revenue-code lines are easy to misalign.
Talonic reads the UB-04 by form locator and returns the facility, the type-of-bill code, the revenue-code lines with their charges and units, the diagnosis and procedure codes, and the payer detail. A hospital billing team can then submit and review institutional claims from structured data instead of a crowded grid.
What gets extracted from UB-04 claim forms
How extraction works for UB-04 claim forms
UB-04 forms reach a facility biller as clearinghouse PDFs, payer correspondence, and scanned paper, and the NUBC layout is anchored on numbered form locators. Talonic reads the claim against the UB-04 form-locator map in the Field Registry, which binds each value to its FL so the dense grid does not misalign on a skewed scan. The FL 42 revenue-code lines are captured as a structured array, each with its revenue code, description, HCPCS code, units, and charge, and the lines are reconciled against the total in revenue code 0001. The three-digit type-of-bill code in FL 4 is decoded into facility type and bill sequence. Condition, occurrence, and value codes are kept as coded lists. Every value returns with a confidence score and pixel-region provenance under DIN SPEC 91491 conformity, so a hospital billing team can verify a revenue line against the source claim.
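The reconciliation step described above can be sketched in a few lines. This is a minimal illustration and not Talonic's implementation: the function name `reconcile_charges` and the returned keys are hypothetical, and the input is assumed to use the `revenue_lines` and `total_charge` field names from the sample extraction on this page.

```python
from decimal import Decimal


def reconcile_charges(claim: dict) -> dict:
    """Compare the sum of the revenue-line charges with the stated total.

    Assumes the claim dict has a "revenue_lines" list whose items carry a
    "charge", plus a top-level "total_charge" (the revenue code 0001 total).
    """
    # Decimal avoids floating-point drift when charges include cents.
    line_sum = sum(
        (Decimal(str(line["charge"])) for line in claim["revenue_lines"]),
        Decimal("0"),
    )
    stated_total = Decimal(str(claim["total_charge"]))
    return {
        "line_sum": line_sum,
        "stated_total": stated_total,
        "difference": line_sum - stated_total,
        "in_balance": line_sum == stated_total,
    }


claim = {
    "revenue_lines": [
        {"revenue_code": "0450", "charge": 1840},
        {"revenue_code": "0636", "charge": 96},
    ],
    "total_charge": 1936,
}

result = reconcile_charges(claim)
print(result["in_balance"])  # True: 1840 + 96 == 1936
```

A claim whose lines do not add up to the stated total comes back with `in_balance` set to `False` and a non-zero `difference`, which is the condition a reviewer would want flagged.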
Sample extraction
A UB-04 institutional claim with multiple revenue lines
{
  "facility_name": "Riverside Regional Hospital",
  "patient_control_number": "PCN-2026-558102",
  "type_of_bill": "111",
  "revenue_lines": [
    {
      "revenue_code": "0450",
      "description": "Emergency room",
      "hcpcs": "99285",
      "units": 1,
      "charge": 1840
    },
    {
      "revenue_code": "0636",
      "description": "Drugs requiring detail",
      "hcpcs": "J1885",
      "units": 2,
      "charge": 96
    }
  ],
  "total_charge": 1936,
  "diagnosis_codes": [
    "J18.9"
  ],
  "payer_name": "UnitedHealthcare",
  "facility_npi": "1396744321"
}
Frequently asked
Does it capture the revenue-code line grid?
Yes. Each FL 42 line is returned with its revenue code, description, HCPCS code, units, and charge, and the lines are reconciled against the total carried in revenue code 0001, so an out-of-balance claim is flagged.
How is the type-of-bill code handled?
The three-digit FL 4 code is decoded into its facility type and bill-sequence meaning rather than returned as an opaque string, since each digit drives how the payer adjudicates the claim.
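As a rough illustration of what decoding means here, the sketch below splits a type-of-bill code into its three positions and looks each one up. It is an assumption-laden example, not Talonic's decoder: the function name `decode_type_of_bill` and the output keys are hypothetical, the lookup tables are deliberately partial, and the authoritative code lists are the ones published by the NUBC.

```python
# Partial, illustrative lookup tables; the NUBC manual holds the full lists.
FACILITY_TYPES = {
    "1": "Hospital",
    "2": "Skilled nursing facility",
    "3": "Home health agency",
}
# These classification meanings apply to facility types 1 through 6 only;
# clinics and special facilities use the second position differently.
BILL_CLASSIFICATIONS = {
    "1": "Inpatient (Part A)",
    "3": "Outpatient",
}
FREQUENCIES = {
    "1": "Admit through discharge claim",
    "2": "Interim, first claim",
    "3": "Interim, continuing claim",
    "4": "Interim, last claim",
    "7": "Replacement of prior claim",
    "8": "Void/cancel of prior claim",
}
UNKNOWN = "unknown (not in this partial table)"


def decode_type_of_bill(code: str) -> dict:
    """Split a type-of-bill code into facility type, classification, frequency."""
    digits = code.strip()
    # The paper form prints a leading zero (for example 0111); drop it if present.
    if len(digits) == 4 and digits.startswith("0"):
        digits = digits[1:]
    if len(digits) != 3 or not digits.isalnum():
        raise ValueError(f"unexpected type-of-bill code: {code!r}")

    facility, classification, frequency = digits
    if facility in "123456":
        classification_label = BILL_CLASSIFICATIONS.get(classification, UNKNOWN)
    else:
        classification_label = UNKNOWN
    return {
        "code": digits,
        "facility_type": FACILITY_TYPES.get(facility, UNKNOWN),
        "bill_classification": classification_label,
        "frequency": FREQUENCIES.get(frequency, UNKNOWN),
    }


decoded = decode_type_of_bill("111")
print(decoded["facility_type"], "|", decoded["bill_classification"], "|", decoded["frequency"])
```

Returning the original code alongside the decoded labels keeps the raw FL 4 value available for submission while giving reviewers a readable form.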
How does the UB-04 differ from the CMS-1500 here?
The UB-04 is institutional and organized around revenue codes and form locators, while the CMS-1500 is professional and organized around CPT service lines. Talonic maps each to its own schema rather than forcing one layout onto the other.
Author note
Reviewed by Talonic engineering, schema review · last reviewed 2026-06-09