Extract data from Form 990 returns
A nonprofit's Form 990 is public, which makes it the document grantmakers, journalists, and researchers reach for to understand where an organization gets its money and where it goes. The return runs long, often 50 pages with schedules, and a foundation screening 200 grantees pulls the same handful of figures from each: total revenue and total expenses on the Part I summary, net assets, the program-service accomplishments in Part III, officer and key-employee compensation in Part VII, and the functional expense breakdown in Part IX that splits spending into program, management, and fundraising. The header carries the organization name, the EIN, and the tax year.
The challenge is that the meaningful numbers are scattered across parts and schedules that cross-reference each other. Part I gives a summary, but the authoritative detail sits in Part VIII (revenue) and Part IX (expenses), and both have to reconcile to the summary. Part VII lists each officer with a reported compensation amount, and a large filer attaches Schedule J with the detail. Program-service revenue is itemized in Part VIII with a business code for each activity. Form 990-EZ and Form 990-PF (for private foundations) organize the same concepts differently, so a reader has to know which variant they are holding.
Talonic reads the Form 990 and returns the header, the Part I summary totals, the Part VII compensation list, and the Part IX functional expense split as structured fields, with the variant identified. A grantmaker loads revenue, expenses, and executive pay across a portfolio of nonprofits without paging through each 50-page return.
What gets extracted from Form 990 returns
How extraction works for Form 990 returns
Form 990 returns are filed through e-file providers and published as PDFs by the IRS and aggregators, and the schedules attached differ by organization size. Talonic identifies the variant (990, 990-EZ, 990-PF) and maps the return to the nonprofit-return schema in the Field Registry, which models the Part I summary, the Part VIII revenue detail, the Part IX functional expense split, and the Part VII compensation list as linked sections rather than loose numbers. Summary totals are reconciled against the detailed parts, so a Part I figure that does not match Part VIII or Part IX is flagged. Officer compensation is captured per person and linked to Schedule J when attached. Every value returns with a confidence score and pixel-region provenance in conformity with DIN SPEC 91491, so a grantmaker or analyst can verify a figure against the source return.
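The reconciliation step can be illustrated with a minimal sketch. It assumes an extracted record shaped like the sample extraction on this page (`total_expenses` plus a `functional_expenses` object); the function name `reconcile_expenses` and the `tolerance` parameter are hypothetical and are not part of any Talonic API. It covers only the expense side, because the sample does not carry a Part VIII revenue breakdown.

```python
def reconcile_expenses(record, tolerance=0):
    """Compare the Part I expense total with the sum of the Part IX split.

    Returns a list of human-readable mismatch messages; an empty list
    means the two figures agree within `tolerance` (in dollars).
    """
    split = record["functional_expenses"]
    split_total = split["program"] + split["management"] + split["fundraising"]
    difference = split_total - record["total_expenses"]

    issues = []
    if abs(difference) > tolerance:
        issues.append(
            f"Part IX split sums to {split_total}, "
            f"Part I reports {record['total_expenses']} "
            f"(difference {difference})"
        )
    return issues


# Figures taken from the sample extraction on this page.
sample = {
    "total_expenses": 7960000,
    "functional_expenses": {
        "program": 6540000,
        "management": 980000,
        "fundraising": 440000,
    },
}

print(reconcile_expenses(sample))  # prints [] because 6540000 + 980000 + 440000 == 7960000
```

Returning a list of messages rather than a boolean leaves room for a revenue-side check against Part VIII to append its own findings to the same result.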
Sample extraction
A full IRS Form 990 with Schedule J for a mid-size nonprofit
{
  "organization_name": "Cedar Ridge Community Foundation",
  "ein": "98-7654321",
  "tax_year": 2025,
  "form_variant": "990",
  "total_revenue": 8420000,
  "total_expenses": 7960000,
  "net_assets_end": 3110000,
  "functional_expenses": {
    "program": 6540000,
    "management": 980000,
    "fundraising": 440000
  },
  "officers": [
    {
      "name": "Executive Director",
      "reported_compensation": 214000
    }
  ]
}
Frequently asked
Does it reconcile the Part I summary to the detail parts?
Yes. Part I summary totals are checked against the Part VIII revenue detail and the Part IX functional expense split, so a mismatch between the summary and the underlying parts is flagged rather than silently trusted.
How is officer compensation captured?
Each officer and key employee in Part VII is returned with their reported compensation, and when the filer attaches Schedule J the detailed breakdown is linked to the corresponding person.
Does it handle 990-EZ and 990-PF?
The variant is identified first, and the schema maps the equivalent revenue, expense, and governance concepts across the 990, 990-EZ, and the 990-PF private-foundation form.
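As an illustration of the portfolio use case, the sketch below groups extracted records by `form_variant` and totals their headline figures. It assumes records with the field names used in the sample extraction; `summarize_portfolio` is a hypothetical helper, and the second record (`Example Small Org`) is made-up data included only to show a second variant.

```python
from collections import defaultdict


def summarize_portfolio(records):
    """Group extracted returns by form variant and total the headline figures."""
    summary = defaultdict(
        lambda: {
            "count": 0,
            "total_revenue": 0,
            "total_expenses": 0,
            "highest_compensation": 0,
        }
    )
    for rec in records:
        row = summary[rec["form_variant"]]
        row["count"] += 1
        row["total_revenue"] += rec["total_revenue"]
        row["total_expenses"] += rec["total_expenses"]
        # Highest reported pay among the officers listed on this return.
        top_pay = max(
            (o["reported_compensation"] for o in rec.get("officers", [])),
            default=0,
        )
        row["highest_compensation"] = max(row["highest_compensation"], top_pay)
    return dict(summary)


portfolio = [
    {   # figures from the sample extraction on this page
        "organization_name": "Cedar Ridge Community Foundation",
        "form_variant": "990",
        "total_revenue": 8420000,
        "total_expenses": 7960000,
        "officers": [{"name": "Executive Director", "reported_compensation": 214000}],
    },
    {   # hypothetical record, illustrative values only
        "organization_name": "Example Small Org",
        "form_variant": "990-EZ",
        "total_revenue": 150000,
        "total_expenses": 140000,
        "officers": [],
    },
]

for variant, row in summarize_portfolio(portfolio).items():
    print(variant, row)
```

Grouping by variant first keeps full 990 filers and small 990-EZ filers from being averaged together, which matters when the forms report the same concepts at very different scales.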
Author note
Reviewed by Talonic engineering, schema review · last reviewed 2026-06-10
- Source: IRS, About Form 990