Extract data from CMS-1500 claim forms

The CMS-1500 is the paper claim form a physician practice sends to a payer, and a billing office reads these forms constantly, whether scanned from a clearinghouse rejection or received from an out-of-network provider. Maintained by the NUCC, the form packs an entire claim into numbered boxes: the patient and insured in Boxes 1 through 11, the ICD-10 diagnosis codes in Box 21, the rendering, service-facility, and billing provider identifiers, including the NPI, in Boxes 24J, 32, and 33, and the service lines in Box 24, where each row carries a date of service, a place-of-service code, a CPT or HCPCS procedure code with modifiers, a diagnosis pointer, a charge, and units.

The difficulty lives in the Box 24 service-line grid and the linkage between diagnoses and procedures. Each of the six service lines points back to one or more of the Box 21 diagnosis codes by letter (A, B, C, and so on), so the clinical justification for a procedure has to be reconstructed by resolving the pointer, because the line itself does not carry the code. Modifiers on a CPT code (such as 25 or 59) change reimbursement and must stay attached to their procedure. The rendering provider NPI in Box 24J can differ from the billing NPI in Box 33. Scanned forms skew and the tight grid shifts, so a charge can drift into the wrong column.

Talonic reads the CMS-1500 by box number and returns the patient, the insured, the Box 21 diagnosis list, and the Box 24 service lines with their procedure codes, modifiers, diagnosis pointers, charges, and provider NPIs. A billing team can then post and scrub claims from structured data rather than re-keying a dense grid.

What gets extracted from CMS-1500 claim forms

Patient Name: Helen Park
Insured ID: AET558210934
Payer: Aetna
Diagnosis Codes (Box 21): ICD-10 I10, E11.9
Date of Service (Box 24A): 2026-04-18
Place of Service (Box 24B): 11 (office)
Procedure (Box 24D): CPT 99214, modifier 25
Charge (Box 24F): $320.00
Rendering NPI (Box 24J): 1396744321
Billing NPI (Box 33): 1487553210

How extraction works for CMS-1500 claim forms

CMS-1500 forms reach a biller as clearinghouse PDFs, payer rejections, and scanned paper, and the NUCC layout is fixed enough to anchor on box numbers rather than pixels. Talonic reads the claim against the CMS-1500 box map in the Field Registry, which binds each value to its numbered box so that a charge on a skewed scan is not read into the wrong column. Box 24 service lines are captured as a structured array, and each line keeps its CPT or HCPCS procedure code, its modifiers, its diagnosis pointer back to the Box 21 ICD-10 list, its charge, and its units. The rendering NPI in Box 24J and the billing NPI in Box 33 are kept distinct. Every value returns with a confidence score and pixel-region provenance under DIN SPEC 91491 conformity, so a billing team can verify a service line against the source claim before submitting.

Sample extraction

A scanned CMS-1500 with one service line

{
  "patient_name": "Helen Park",
  "insured_id": "AET558210934",
  "payer_name": "Aetna",
  "diagnosis_codes": [
    "I10",
    "E11.9"
  ],
  "service_lines": [
    {
      "date_of_service": "2026-04-18",
      "place_of_service": "11",
      "procedure_code": "99214",
      "modifiers": [
        "25"
      ],
      "diagnosis_pointer": "A",
      "charge": 320,
      "units": 1,
      "rendering_npi": "1396744321"
    }
  ],
  "billing_provider_npi": "1487553210",
  "total_charge": 320
}
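One consistency check a billing team might run on output shaped like the sample above is that the service-line charges add up to the claim total. The sketch below is illustrative only: it assumes the field names shown in the sample (`service_lines`, `charge`, `total_charge`) and that each line's charge is already the line total. The helper name `charges_reconcile` is hypothetical and not part of any Talonic API.

```python
from decimal import Decimal

def charges_reconcile(claim: dict) -> bool:
    """Return True when the service-line charges sum to total_charge."""
    # Decimal(str(...)) avoids float rounding on currency amounts.
    line_total = sum(
        (Decimal(str(line["charge"])) for line in claim["service_lines"]),
        Decimal("0"),
    )
    return line_total == Decimal(str(claim["total_charge"]))

# Values taken from the sample extraction.
claim = {
    "service_lines": [{"procedure_code": "99214", "charge": 320, "units": 1}],
    "total_charge": 320,
}
print(charges_reconcile(claim))  # True
```

A mismatch is worth routing to manual review, since it may indicate a charge that was read into the wrong column or a service line that was missed.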

Frequently asked

Does it preserve the diagnosis-to-procedure linkage?

Each Box 24 service line keeps its diagnosis pointer (A, B, C, and so on) back to the Box 21 ICD-10 list, so the clinical justification for each procedure can be reconstructed rather than lost. That linkage is what a payer checks when it scrubs a claim.
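As a minimal sketch of resolving that linkage downstream, the function below maps pointer letters to positions in the diagnosis list, assuming the list preserves Box 21 order (A first) as in the sample extraction. The name `resolve_pointers` is hypothetical, and the handling of multi-letter pointers such as "AB" is an assumption about how a pointer string might arrive.

```python
import string

def resolve_pointers(diagnosis_codes: list[str], pointer: str) -> list[str]:
    """Map pointer letters (e.g. "A" or "AB") to Box 21 codes, in pointer order."""
    resolved = []
    for letter in pointer.upper():
        if letter not in string.ascii_uppercase:
            continue  # skip separators such as spaces or commas
        index = ord(letter) - ord("A")  # "A" -> 0, "B" -> 1, ...
        if index >= len(diagnosis_codes):
            raise ValueError(f"pointer {letter!r} has no matching Box 21 code")
        resolved.append(diagnosis_codes[index])
    return resolved

print(resolve_pointers(["I10", "E11.9"], "A"))   # ['I10']
print(resolve_pointers(["I10", "E11.9"], "BA"))  # ['E11.9', 'I10']
```

Raising on a pointer with no matching code, rather than returning an empty result, keeps a broken linkage visible before the claim is posted.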

Are CPT modifiers captured per line?

Modifiers such as 25 or 59 stay attached to their procedure code on the service line, because they change reimbursement and a claim posted without them will pay incorrectly.

Does it distinguish the rendering and billing NPIs?

Rendering and billing provider NPIs (Box 24J and Box 33) are returned as separate fields, since they frequently differ within a group practice.
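Because the two identifiers arrive as separate fields, a posting script can flag the lines where they differ. The sketch below assumes the field names from the sample extraction (`billing_provider_npi`, `rendering_npi`). The function name is hypothetical.

```python
def lines_with_distinct_rendering_npi(claim: dict) -> list[int]:
    """Return indexes of service lines whose rendering NPI is not the billing NPI."""
    billing_npi = claim["billing_provider_npi"]
    return [
        i
        for i, line in enumerate(claim["service_lines"])
        # Lines without a rendering NPI are skipped, not flagged.
        if line.get("rendering_npi") and line["rendering_npi"] != billing_npi
    ]

# Values taken from the sample extraction.
claim = {
    "service_lines": [{"procedure_code": "99214", "rendering_npi": "1396744321"}],
    "billing_provider_npi": "1487553210",
}
print(lines_with_distinct_rendering_npi(claim))  # [0]
```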

Author note

Reviewed by Talonic engineering, schema review · last reviewed 2026-06-14