Extract data from Form 1099-MISC
Form 1099-MISC is the IRS information return that US businesses file to report miscellaneous payments to non-employees: rents paid to a landlord, royalties paid to an author, fishing-boat proceeds, gross proceeds paid to attorneys, prizes and awards, and other categories that fit neither the wage statement on Form W-2 nor the nonemployee compensation form (1099-NEC, which was split out from 1099-MISC starting tax year 2020). Payers furnish the recipient copy by January 31 of the year following payment and file with the IRS on a later deadline (February 28 on paper, March 31 electronically), with state copies due on schedules that vary by state. Accounts payable, tax compliance, and payroll teams reconcile the 1099-MISC pile against vendor ledgers, validated TINs and addresses, and prior-year filings; a single mid-market accounts payable shop handles anywhere from 50 to several thousand forms a year.
The hard parts are box by box. The 2024 form has 18 numbered boxes (federal boxes 1 through 15 plus state boxes 16 through 18) in addition to the payer and recipient identification blocks. Box 1 captures rents, Box 2 royalties, Box 3 other income, Box 4 federal income tax withheld, Box 5 fishing boat proceeds, and Box 6 medical and health care payments. Box 7 no longer carries nonemployee compensation (that moved to 1099-NEC); it is now a checkbox indicating direct sales of $5,000 or more. Box 8 covers substitute payments in lieu of dividends or interest. Box 10 covers gross proceeds paid to an attorney (Box 14 on pre-2020 layouts). Beyond the boxes themselves, payer TINs, recipient TINs (SSN, EIN, or ITIN), and recipient address blocks all have to be captured cleanly so that filings reconcile against IRS records.
Talonic extracts every numbered box, both identification blocks, and any state-copy filing fields when present. The schema mirrors the IRS layout so a downstream e-filing platform can route the data without remapping. Per-cell confidence and pixel-region provenance let the tax preparer audit any number against the source 1099 before transmitting the file to the IRS or the recipient.
What gets extracted from Form 1099-MISC
How extraction works for Form 1099-MISC
Form 1099-MISC layouts are largely stable from year to year, but their PDF rendering varies by generator. Talonic classifies each 1099-MISC by tax year (the IRS revises box numbering periodically; the 2024 revision differs from pre-2020 layouts) and routes it through the tax-form schema in the Field Registry, which encodes each numbered box plus the identification blocks. TINs are validated against the IRS format conventions (EINs as NN-NNNNNNN, SSNs as NNN-NN-NNNN), and partially masked TINs are detected so that recipient privacy is preserved downstream. Box values default to $0.00 when the source form shows the field blank, which the IRS treats as zero rather than missing. Per-cell confidence and pixel-region provenance conform to DIN SPEC 91491, so a tax preparer can verify any reported amount against the source 1099 before transmitting the e-filing batch.
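The TIN format checks and the blank-box default described above can be sketched in a few lines of Python. This is an illustrative sketch, not Talonic's implementation: the function names `classify_tin` and `box_amount`, the regular expressions, and the accepted mask characters (`X` and `*`) are all assumptions made for the example.

```python
import re
from decimal import Decimal
from typing import Optional

# Assumed patterns for illustration only.
EIN_RE = re.compile(r"^\d{2}-\d{7}$")          # NN-NNNNNNN
SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")    # NNN-NN-NNNN (ITINs share this shape)
# Truncated TINs keep only the last four digits, e.g. XXX-XX-1234 or XX-XXX1234.
MASKED_RE = re.compile(r"^[X*]{3}-[X*]{2}-\d{4}$|^[X*]{2}-[X*]{3}\d{4}$")


def classify_tin(raw: str) -> str:
    """Return 'masked', 'ein', 'ssn_or_itin', or 'invalid' for a TIN string."""
    tin = raw.strip().upper()
    if MASKED_RE.match(tin):
        return "masked"
    if EIN_RE.match(tin):
        return "ein"
    if SSN_RE.match(tin):
        return "ssn_or_itin"
    return "invalid"


def box_amount(raw: Optional[str]) -> Decimal:
    """Parse a box's text; a blank or missing box becomes 0.00."""
    text = (raw or "").replace("$", "").replace(",", "").strip()
    return Decimal(text) if text else Decimal("0.00")
```

Checking the masked pattern first keeps a truncated TIN from being reported as invalid, and `Decimal` avoids floating-point rounding on dollar amounts.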
Sample extraction
A 2024 IRS Form 1099-MISC reporting rents and other income
{
  "tax_year": 2024,
  "payer": {
    "name": "Acme Industries Inc.",
    "address": "12 Industry Pkwy, Boston, MA 02110",
    "tin": "12-3456789"
  },
  "recipient": {
    "name": "Jane Q. Smith",
    "address": "1424 Maple St, Cambridge, MA 02139",
    "tin": "XXX-XX-1234"
  },
  "box_1_rents": 24000,
  "box_2_royalties": 0,
  "box_3_other_income": 1200,
  "box_4_federal_tax_withheld": 0,
  "box_5_fishing_boat_proceeds": 0,
  "box_6_medical_health_payments": 0,
  "box_7_direct_sales_indicator": false,
  "box_10_gross_proceeds_attorney": 0,
  "state_filings": [
    {
      "state": "MA",
      "state_tax_withheld": 0
    }
  ]
}
Frequently asked
What is the difference between 1099-MISC and 1099-NEC?
Starting tax year 2020, the IRS split nonemployee compensation out of 1099-MISC Box 7 into a separate form, Form 1099-NEC. 1099-MISC now reports rents, royalties, other income, fishing-boat proceeds, attorney gross proceeds, and similar categories. The two forms have different filing deadlines: 1099-NEC is due to both recipients and the IRS by January 31, while 1099-MISC is due to recipients by January 31 but to the IRS by February 28 on paper or March 31 when filed electronically.
Are masked TINs preserved as masked, or expanded?
Talonic preserves whatever the source 1099 shows. The IRS allows payers to truncate the recipient TIN on the recipient copy (showing XXX-XX-1234 instead of the full SSN); the IRS copy carries the full TIN. The extraction reflects the source faithfully.
Can it handle handwritten or scanned 1099s?
Yes, though confidence drops for handwriting. Per-cell confidence flags low-quality OCR rows so tax preparers can verify against the source. Most 1099-MISCs are generated by accounting software and arrive as digital PDFs, which extract at very high confidence.
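As a rough sketch of how a reviewer might act on per-cell confidence, the Python below lists the fields that fall under a review threshold. The cell shape (`field`, `value`, `confidence`), the 0.90 cutoff, and the function name `cells_needing_review` are hypothetical and chosen for illustration; the actual output format may differ.

```python
from typing import List, TypedDict


class Cell(TypedDict):
    # Hypothetical per-cell record; not a documented schema.
    field: str
    value: object
    confidence: float


REVIEW_THRESHOLD = 0.90  # assumed cutoff, tune per workflow


def cells_needing_review(cells: List[Cell],
                         threshold: float = REVIEW_THRESHOLD) -> List[str]:
    """Return the names of fields whose confidence is below the threshold."""
    return sorted(c["field"] for c in cells if c["confidence"] < threshold)


cells: List[Cell] = [
    {"field": "box_1_rents", "value": 24000, "confidence": 0.99},
    {"field": "recipient.tin", "value": "XXX-XX-1234", "confidence": 0.72},
    {"field": "box_3_other_income", "value": 1200, "confidence": 0.85},
]
flagged = cells_needing_review(cells)
```

With these made-up scores, `flagged` holds the two fields a preparer would check against the source form before filing.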
Does it pull state-copy filing information?
Yes, when the source PDF includes the state-filing fields (state name, state ID number, state tax withheld). Some 1099 generators emit a separate state copy with these fields populated; others bundle them into the federal form. Talonic captures whichever layout is present.
Is the output ready for direct e-filing transmission?
The structured output maps to the IRS FIRE / IRIS box-numbered fields and to most third-party e-filing platforms (Track1099, Tax1099, Yearli). Field mapping into a specific platform happens downstream of extraction; Talonic provides the structured data.
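One way the downstream mapping step could start is by re-keying the extracted record by box number, since e-filing schemas are organized around box numbers rather than descriptive names. The sketch below assumes field names follow the `box_<number>_<label>` pattern seen in the sample extraction; the helper name `boxes_by_number` is hypothetical, and the mapping into any particular platform's format is not shown.

```python
import re
from typing import Dict

# Assumes keys shaped like "box_1_rents"; other keys are ignored.
BOX_KEY = re.compile(r"^box_(\d+)_")


def boxes_by_number(record: dict) -> Dict[int, object]:
    """Collect box fields from an extracted record, keyed by box number."""
    boxes: Dict[int, object] = {}
    for key, value in record.items():
        match = BOX_KEY.match(key)
        if match:
            boxes[int(match.group(1))] = value
    return boxes


record = {
    "tax_year": 2024,
    "box_1_rents": 24000,
    "box_3_other_income": 1200,
    "box_7_direct_sales_indicator": False,
}
by_number = boxes_by_number(record)
```

Identification blocks and state filings are deliberately left out here; a real mapper would handle them separately according to the target platform's layout.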
Author note
Reviewed by Talonic engineering, tax-form subject-matter review · last reviewed 2026-05-15