The data nerds are fighting back.
After watching data sets be altered or disappear from U.S. government websites in unprecedented ways since President Donald Trump began his second term, an army of outside statisticians, demographers and computer scientists has joined forces to capture, preserve and share data sets, sometimes clandestinely.
Their goal is to make sure the data sets remain available in the future. They believe that democracy suffers when policymakers don’t have reliable data and that national statistics should be above partisan politics.
“There are such smart, passionate people who care deeply about not only the Census Bureau, but all the statistical agencies, and ensuring the integrity of the statistical system. And that gives me hope, even during these challenging times,” Mary Jo Mitchell, director of government and public affairs for the research nonprofit the Population Association of America, said this week during an online public data-users conference.
The threats to the U.S. data infrastructure since January have come not only from the disappearance or modification of data related to gender, sexual orientation, health, climate change and diversity, among other topics, but also from job cuts affecting workers and contractors who had been guardians of restricted-access data at statistical agencies, the data experts said.
“There are trillions of bytes of data files, and I can't even imagine how many public dollars were spent to collect those data. ... But right now, they're sitting someplace that is inaccessible because there are no staff to appropriately manage those data,” Jennifer Park, a study director for the Committee on National Statistics, National Academies of Sciences, Engineering, and Medicine, said during the conference hosted by the Association of Public Data Users (APDU).
In February, the Centers for Disease Control and Prevention's official public portal for health data, data.cdc.gov, was taken down entirely but subsequently went back up. Around the same time, users who queried certain public data from the U.S. Census Bureau's most comprehensive survey of American life got a response for several days saying the area was "unavailable due to maintenance" before access was restored.
Researchers Janet Freilich and Aaron Kesselheim examined 232 federal public health data sets that had been modified in the first quarter of this year and found that almost half had been "substantially altered," with the majority having the word "gender" switched to "sex," they wrote this month in The Lancet medical journal.
One of the most difficult tasks has been figuring out what has been changed, because many of the alterations weren't recorded in documentation.
Beth Jarosz, senior program director at the Population Reference Bureau, thought she was in good shape since she had previously downloaded data she needed from the National Survey of Children's Health for a February conference where she was speaking, even though the data had become unavailable. But then she realized she had failed to download the questionnaire and later discovered that a question about discrimination based on gender or sexual identity had been removed.
“It's the one thing my team didn't have,” Jarosz said at this week's APDU conference. “And they edited the questionnaire document, which should have been a historical record.”
Among the groups that have formed this year to collect and preserve the federal data are the Federation of American Scientists' dataindex.com, which monitors changes to federal data sets; the University of Chicago Library's Data Mirror website, which backs up and hosts at-risk data sets; the Data Rescue Project, which serves as a clearinghouse for data rescue-related efforts; and the Federal Data Forum, which shares information about what federal statistics have gone missing or been modified — a job also being done by the American Statistical Association.
The outside data warriors also are quietly reaching out to workers at statistical agencies and urging them to back up any data that is restricted from the public.
“You can't trust that this data is going to be here tomorrow,” said Lena Bohman, a founding member of the Data Rescue Project.
Separately, a group of outside experts has unofficially revived a long-running U.S. Census Bureau advisory committee that was killed by the Trump administration in March.
Census Bureau officials won't be attending the Census Scientific Advisory Committee meeting in September, since the Commerce Department, which oversees the agency, eliminated the committee. But the advisory committee will forward its recommendations to the bureau, and demographer Allison Plyer said she has heard that some agency officials are excited by the committee's re-emergence, even if it's outside official channels.
“We will send them recommendations but we don't expect them to respond since that would be frowned upon,” said Plyer, chief demographer at The Data Center in New Orleans. “They just aren't getting any outside expertise ... and they want expertise, which is understandable from nerds.”
___
Follow Mike Schneider on the social platform Bluesky: @mikeysid.bsky.social
Copyright 2025 The Associated Press. All rights reserved. This material may not be published, broadcast, rewritten or redistributed without permission.