Selected talks
- Strange Loop – Software With a Mission, 2018: From Coder to Bureaucrat (co-presenter)
- FiveCollege Stats and Data Science Research Bytes: Columnar Data Storage and the Separation of Compute and Storage, 2018
- Western MA Full Stack Meetup: The Humble Dataframe
- DATA Act Summit, 2017: Leveraging Technology to Modernize Government (panel)
- DATA Act Training, 2017: Why the DATA Act Matters: New Uses for Standardized Spending Information
- 18F: Data Munging With Python and Pandas
- New England Regional Developers (NERD) Summit: Intro to IPython Notebooks
- Association of Public Data Users (APDU): Transitioning to Open Government Data
- Wharton School of Business: Elegant and Efficient Database Design
Selected projects
DATA Act: USAspending API
Led the buildout of an API that presents, for the first time, a unified view of U.S. spending data. In addition to being available for public use, the API powers usaspending.gov.
DATA Act: data model, ingest, and validation
Prototyped and served as the initial tech lead for the “DATA Act Broker,” a product that federal agencies use to validate their spending data and submit it to the U.S. Department of the Treasury. Over a period of two years, the broker evolved from a series of Python scripts to a Jupyter notebook, and finally to a mature, API-driven data ingest site that performs hundreds of validations on incoming files and is used by dozens of federal agencies.
- https://broker.usaspending.gov/ (login required)
- Source code: https://github.com/fedspendingtransparency/data-act-broker-backend
Pandas snippets
Pandas is a powerful data analysis toolkit. But sometimes you just need a quick reference for doing common data cleaning and munging tasks.
State Smart
Because a single source of federal spending data was not available back in 2015, I designed a website that pulled together many disparate data sources to tell the story of how federal dollars flow into and out of states. In addition to writing the narrative, I researched the best sources for the required information (making some hard decisions along the way) and coded the ETL (extract, transform, and load) needed to power the site.
The launch of State Smart was covered by the Washington Post, and the data was subsequently cited by dozens of other publications, including the Congressional Research Service.
- https://www.nationalpriorities.org/smart/
- Source code: proprietary
Hack for Western Mass
The inaugural National Day of Civic Hacking happened in 2013. Rather than travel to New York or Boston to participate, I gathered a team of local technologists to organize our own Western Massachusetts-flavored hackathon. I served on the organizing committee in 2013 and 2014 before passing the torch.
- Internet Archive
- Source code: https://github.com/hackforwesternmass