Here’s a fun project from Tyler Vigen, creator of the famous Spurious Correlations page (which has been cited as a cautionary tale in many a science class). Using his database of real but spurious correlations (created by calculating the Pearson correlation coefficient r between a very large number of variables and picking out the hits), he used AI to create amusing fake manuscripts expounding on these statistical flukes as if they were real research questions.
These papers were generated in January 2024, and as previously discussed on this blog, the pipeline for end-to-end paper generation has come a long way in the two years since. I have no doubt Tyler could make these papers sound much more convincing using today’s models, though of course his goal here is to make you laugh (and think), not to trick you. But I’m just as sure that many scholars will adopt this data-dredging strategy to generate “real” papers, contributing to the deluge of papers flooding the academic publishing system.
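
To see why data dredging reliably turns up "hits," here is a minimal sketch in Python. It is not Tyler's actual pipeline or data: the series are independent Gaussian noise that I made up for illustration, and the names `pearson_r` and `dredge` and the parameters `num_series`, `length`, and `seed` are all hypothetical. The idea is simply to compute Pearson's r for every pair of unrelated variables and keep the strongest one.

```python
import itertools
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def dredge(num_series=200, length=10, seed=0):
    """Return (r, i, j) for the most strongly correlated pair of random series.

    Assumption for illustration: every series is independent Gaussian noise,
    so any strong correlation found here is spurious by construction.
    """
    rng = random.Random(seed)
    series = [[rng.gauss(0, 1) for _ in range(length)] for _ in range(num_series)]
    pairs = itertools.combinations(range(num_series), 2)
    scored = ((pearson_r(series[i], series[j]), i, j) for i, j in pairs)
    # Keep the pair with the largest absolute correlation.
    return max(scored, key=lambda t: abs(t[0]))


num_series = 200
num_pairs = num_series * (num_series - 1) // 2
r, i, j = dredge(num_series=num_series)
print(f"Best of {num_pairs} pairs: series {i} vs. series {j}, r = {r:.2f}")
```

With only ten observations per series and nearly twenty thousand pairs to choose from, the winning pair will almost certainly look impressively correlated even though nothing connects the two variables. Short real-world time series, which often share trends, should make such flukes even easier to find.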
