The article “4.5 Million (Suspected) Fake Stars in GitHub: A Growing Spiral of Popularity Contests, Scams, and Malware” by He et al. (originally posted in late 2024) has been doing the rounds lately. It exposes the rampant fraud in the GitHub “star” system, which is apparently taken quite seriously in some corporate circles (I’ve never thought of stars as anything more than a personal bookmark). To search for fraudulent activity, the authors queried GHArchive, an archive of all public GitHub events, for data between 2019 and 2024.
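To get a feel for the raw data involved, here is a minimal sketch of counting stars per repo in GH Archive events. It is not the authors' detection method, just an illustration. It assumes you have already downloaded an hourly dump (GH Archive publishes them as gzipped JSON lines, e.g. `https://data.gharchive.org/2024-07-01-15.json.gz`); the function names `count_stars` and `count_stars_in_file` are my own.

```python
import gzip
import json
from collections import Counter


def count_stars(lines):
    """Count star events per repository from GH Archive JSON lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # GitHub records a star as a "WatchEvent" in its public event stream.
        if event.get("type") == "WatchEvent":
            counts[event["repo"]["name"]] += 1
    return counts


def count_stars_in_file(path):
    """Read one downloaded hourly dump (hypothetical local path)."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return count_stars(f)
```

Calling `count_stars_in_file("2024-07-01-15.json.gz").most_common(10)` would list the ten most-starred repos for that hour. Spotting fake stars takes far more than raw counts, of course.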
A few of their main findings are as follows:
- There was a two-orders-of-magnitude increase in fake stars in 2024. At the peak in July 2024, their program detected (suspected) fake star campaigns for nearly 16% of repos with ≥50 stars in that month.
- Most of these repos were short-lived malware repositories disguised as unsavoury software like crypto bots, game cheats, and pirating software. The purpose of other repos was unclear.
- The majority (60%) of suspected users participating in fake star campaigns had little to no organic activity patterns.
- Fake star campaigns had a small positive effect on attracting real stars for the first two months, but after that they had a negative effect.
See further discussion of this article on Hacker News.
