“Grants from over 57,000 U.S. private foundations to nearly 5,300 colleges and universities”

AEI SOURCE

AEI SOURCE is developed by Tao Tan.

SOURCE (Searchable Open University Records of Charitable Expenditures) is an interactive tool to access a dataset of over 1 million grants from over 57,000 U.S. private foundations to colleges and universities, predominantly American. Each row is a single grant: the funder, the recipient, the year, the dollar amount, the foundation’s own free-text description of why the grant was made, and a thematic tag added by AEI to make the corpus searchable by topic. Filters cover funder, recipient, year, tag, and geography.

Every row also carries two links on ProPublica: one to the underlying Form 990-PF filing and one to the foundation’s overall profile. A reader who wants to see the original disclosure for any grant can click through to ProPublica’s XML viewer and read the raw filing in the browser, as the IRS received it.

SOURCE was built so that journalists, researchers, donors, university leaders, and the general public can pull the same data and reach their own conclusions. A technical appendix is attached for any reader who wishes to reproduce or extend the dataset. The tool’s limitations are discussed below under “Caveats and limitations.”

DATA SOURCES

The raw material is Form 990-PF, the annual return that private foundations file with the IRS. The IRS releases the 990 corpus in bulk as XML files, and GivingTuesday mirrors and archives the corpus in a public S3 bucket.

We extracted grant-level detail using a Python script. The code reads the XML, isolates the grants-paid schedule, filters to Form 990-PF only (so the corpus is private foundations, not operating charities), and writes the result to a single flat file. We then ran two parallel iterative cleaning passes on that file: one to normalize the recipient names so that the same university does not appear under variant spellings, and one to add a thematic tag to each grant using AI (Claude Haiku 4.5). Code, prompts, and replication instructions for both passes are in the technical appendix.
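To make the extraction step concrete, here is a minimal sketch of how a script might pull grant rows out of one filing. It is an illustration under stated assumptions, not the code from the technical appendix. The element names (ReturnTypeCd, GrantOrContributionPdDurYrGrp, and so on) are modeled on recent IRS e-file schema versions and may differ across filing years. The sample filing, the column names, and the helpers normalize_name, extract_grants, and write_flat_file are hypothetical. The name normalization shown is a deliberately crude stand-in for the iterative cleaning pass described above.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# XML namespace used by IRS e-file returns.
NS = {"irs": "http://www.irs.gov/efile"}

# Hypothetical, heavily trimmed filing for illustration only. Element names
# are assumptions modeled on recent IRS schema versions.
SAMPLE_XML = """<Return xmlns="http://www.irs.gov/efile">
  <ReturnHeader>
    <ReturnTypeCd>990PF</ReturnTypeCd>
    <TaxYr>2022</TaxYr>
    <Filer>
      <BusinessName>
        <BusinessNameLine1Txt>Example Family Foundation</BusinessNameLine1Txt>
      </BusinessName>
    </Filer>
  </ReturnHeader>
  <ReturnData>
    <IRS990PF>
      <SupplementaryInformationGrp>
        <GrantOrContributionPdDurYrGrp>
          <RecipientBusinessName>
            <BusinessNameLine1Txt>Univ. of Example</BusinessNameLine1Txt>
          </RecipientBusinessName>
          <GrantOrContributionPurposeTxt>Scholarship fund</GrantOrContributionPurposeTxt>
          <Amt>25000</Amt>
        </GrantOrContributionPdDurYrGrp>
        <GrantOrContributionPdDurYrGrp>
          <RecipientBusinessName>
            <BusinessNameLine1Txt>University of Example</BusinessNameLine1Txt>
          </RecipientBusinessName>
          <GrantOrContributionPurposeTxt>General support</GrantOrContributionPurposeTxt>
          <Amt>10000</Amt>
        </GrantOrContributionPdDurYrGrp>
      </SupplementaryInformationGrp>
    </IRS990PF>
  </ReturnData>
</Return>"""

# Hypothetical column names for the flat file.
FIELDS = ["funder", "year", "recipient", "amount", "purpose"]


def normalize_name(name):
    """Crude stand-in for name cleaning: uppercase, drop punctuation, expand UNIV."""
    cleaned = re.sub(r"[^\w\s]", " ", name.upper())
    return " ".join("UNIVERSITY" if word == "UNIV" else word for word in cleaned.split())


def extract_grants(xml_text):
    """Return one dict per grant, or an empty list if the return is not a 990-PF."""
    root = ET.fromstring(xml_text)
    header = root.find("irs:ReturnHeader", NS)
    if header is None or header.findtext("irs:ReturnTypeCd", namespaces=NS) != "990PF":
        return []  # skip every return type other than 990-PF
    funder = header.findtext(
        "irs:Filer/irs:BusinessName/irs:BusinessNameLine1Txt", default="", namespaces=NS
    )
    year = header.findtext("irs:TaxYr", default="", namespaces=NS)
    rows = []
    for grant in root.iterfind(".//irs:GrantOrContributionPdDurYrGrp", NS):
        recipient = grant.findtext(
            "irs:RecipientBusinessName/irs:BusinessNameLine1Txt", default="", namespaces=NS
        )
        rows.append({
            "funder": funder,
            "year": year,
            "recipient": normalize_name(recipient),
            # A missing or empty amount is treated as 0 in this sketch.
            "amount": int(grant.findtext("irs:Amt", default="", namespaces=NS) or 0),
            "purpose": grant.findtext(
                "irs:GrantOrContributionPurposeTxt", default="", namespaces=NS
            ),
        })
    return rows


def write_flat_file(rows, handle):
    """Write grant rows as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


rows = extract_grants(SAMPLE_XML)
buffer = io.StringIO()
write_flat_file(rows, buffer)
print(buffer.getvalue())
```

In this sample, both spellings of the recipient collapse to a single normalized name, which is the effect the name-cleaning pass is meant to achieve. A real pass would need to handle far more variation, such as campus suffixes, abbreviations, and affiliated foundations, than this one substitution rule does.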

