A few days ago, two professors (Leif Weatherby and Benjamin Recht) published an opinion piece in the New York Times calling attention to an Axios story on maternal health that relied on invented polling results:
A recent Axios story on maternal health policy referred to “findings” that a majority of people trusted their doctors and nurses. On the surface, there’s nothing unusual about that. What wasn’t originally mentioned, however, was that these findings were made up.
Clicking through the links revealed (as did a subsequent editor’s note and clarification by Axios) that the public opinion poll was a computer simulation run by the artificial intelligence start-up Aaru. No people were involved in the creation of these opinions.
The piece goes on to argue that this so-called “silicon sampling” is seductive because good public opinion polling is expensive, hard to do, and still prone to bias. But this shortcut magnifies the problem of bias rather than solving it.
I’ve read a little bit about this strategy of using LLM-generated survey participants in the context of social science research in a series of posts (mostly from Prof. Jessica Hullman) over on Andrew Gelman’s blog:
- Validating language models as study participants: How it’s being done, why it fails, and what works instead (2025-12-19)
- Survey Statistics: Thomas Lumley writes about Interviewing your Laptop (2025-08-26)
- When does it make sense to talk about LLMs having beliefs? (2025-08-15)
- Better and worse ways to mix human and LLM responses in behavioral research (but you still have to figure what you’re measuring) (2025-06-12)
- LLMs as behavioral study participants (2025-05-29)
Silicon sampling seems moderately interesting from a research perspective, but I can’t help but agree with the New York Times opinion piece authors that this will be ruinous for the already waning trust in public opinion polling. If you didn’t bother to ask the public, then why should the public care what you “find”? I think there is probably a lot of utility in using LLM samples to aid in designing and validating surveys, though.
