AI can now unmask anonymous accounts for under $4 each

A study from Anthropic and ETH Zurich found that large language models can identify the people behind pseudonymous online profiles automatically, cheaply and at massive scale.

A new study from researchers at Anthropic and ETH Zurich has found that large language models can automatically identify the real-world identities behind anonymous and pseudonymous online accounts, raising serious questions about the future of privacy on the internet. The research, published as a preprint on arXiv under the title “Large-scale online deanonymization with LLMs,” has not yet been peer-reviewed but has already prompted concern among privacy experts and technologists.

How the technology works

The AI system works by analyzing publicly available text from online platforms and extracting what researchers call identity signals, including personal interests, demographic clues, writing patterns and incidental details people reveal in posts. The model then searches for matching profiles elsewhere on the web and evaluates whether those signals align with known individuals.
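
As a rough illustration of the matching step, the Python sketch below ranks candidate profiles by how many signals they share with a target account and declines to guess when the best score is too low. It is a toy under stated assumptions, not the researchers' pipeline, whose code has not been released. The Profile class, the hand-labelled signal sets, the Jaccard overlap score and the 0.5 threshold are all hypothetical stand-ins for work the study performs with a language model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    signals: frozenset  # hypothetical, hand-labelled "identity signals"


def overlap(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity: shared signals divided by all distinct signals."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(target: Profile, candidates: list, threshold: float = 0.5):
    """Return (candidate, score) for the top candidate, or (None, score) to abstain."""
    if not candidates:
        return None, 0.0
    scored = [(overlap(target.signals, c.signals), c) for c in candidates]
    score, top = max(scored, key=lambda pair: pair[0])
    # Abstain when even the best candidate is not similar enough.
    return (top if score >= threshold else None), score


# Invented example data: one pseudonymous target, two candidate profiles.
target = Profile("pseudonym", frozenset({"rust", "zurich", "cycling", "compilers"}))
pool = [
    Profile("candidate_a", frozenset({"rust", "zurich", "compilers", "chess"})),
    Profile("candidate_b", frozenset({"cooking", "cycling"})),
]

match, score = best_match(target, pool)
print(match.name if match else "no confident match", round(score, 2))
```

The abstain branch matters as much as the ranking: a system that only commits when it is confident makes fewer guesses, but more of them are right.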

Researchers tested their approach across several datasets. In one experiment they attempted to match Hacker News users with their LinkedIn profiles after stripping out all obvious identifiers such as names, usernames and links. They correctly matched 67% of users from a pool of 89,000 candidates. In another experiment, they tried to link Reddit users across different film discussion communities based solely on their movie reviews. Among users who had discussed more than ten films, nearly half could be identified at 90% precision.

A third experiment split a single user’s posting history into two separate profiles to test whether the AI could determine they belonged to the same person. The results across all three datasets showed that AI-based methods significantly outperformed traditional deanonymization techniques, which achieved close to zero success under the same conditions.

What unmasking an anonymous account actually costs

One of the most striking findings is how affordable the process has become. Researchers estimate the cost of identifying an individual online account through their pipeline falls between $1 and $4. The full Hacker News experiment, which involved nearly 90,000 candidates, cost under $2,000 in total. Lead researcher Daniel Paleka said he was struck by how little information it takes to connect two accounts across platforms.

The study also found that increasing the reasoning effort of the AI model made a material difference. In the Reddit film community experiment, switching from low to high reasoning effort roughly doubled the number of correct matches at the strictest precision threshold.

What this means for internet users

For decades, pseudonymous accounts have provided a baseline layer of protection for people who want to discuss sensitive topics without revealing their identities. Journalists, whistleblowers, activists and ordinary users have all relied on this so-called practical obscurity as a functional form of privacy. The new research suggests that layer is eroding fast.

Researchers warn the technology opens the door to doxxing, targeted harassment, hyper-personalized advertising and, in more extreme scenarios, government identification of online critics. The scalability of the approach compounds the concern. Even when the probability of a matching identity existing in the candidate pool dropped to one in 10,000, the AI still achieved roughly 9% true matches at 90% precision.
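
For readers unfamiliar with the metric, precision is the share of the system's committed guesses that turn out to be correct, while the share of true matches is counted against every target account, including those where the system declined to guess. The Python sketch below shows how raising the confidence bar trades one for the other. The function name, the confidence scores and the outcomes are invented for illustration and are not data from the study. The sketch also assumes every target has a true match in the pool, which the low-probability scenario above deliberately relaxes.

```python
def precision_recall(guesses, threshold):
    """Compute precision and recall at a given confidence threshold.

    guesses: list of (confidence, is_correct) pairs, one per target account.
    The system commits to a guess only when confidence >= threshold;
    otherwise it abstains for that target.
    """
    committed = [correct for conf, correct in guesses if conf >= threshold]
    true_matches = sum(committed)
    # Precision is undefined when nothing is committed, so return None.
    precision = true_matches / len(committed) if committed else None
    recall = true_matches / len(guesses) if guesses else 0.0
    return precision, recall


# Invented toy outcomes for ten target accounts.
toy = [
    (0.95, True), (0.90, True), (0.85, False), (0.80, True), (0.60, False),
    (0.55, True), (0.40, False), (0.30, False), (0.20, False), (0.10, False),
]

strict = precision_recall(toy, 0.9)  # few guesses, all of them correct
loose = precision_recall(toy, 0.5)   # more guesses, more mistakes
print(strict, loose)
```

On this toy data the strict threshold identifies fewer accounts but makes no wrong guesses, while the looser one finds more accounts at the price of some false identifications.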

A separate 2025 paper cited by the researchers found that even state-of-the-art identifier-removal methods leave a significant amount of personal information still recoverable from surrounding text, suggesting that scrubbing posts of obvious details is not enough to stay protected.

What researchers say should happen next

The authors deliberately withheld portions of their technical methodology and have not released their code, citing the risk of misuse. They recommend that platforms restrict bulk access to user data through APIs and monitor for automated data scraping. They also suggest AI developers build guardrails that prevent models from being used for targeted deanonymization, noting that better safety measures could meaningfully reduce this capability as models continue to improve.

For individual users, the researchers say the findings are a prompt to reconsider how much personal information they share online, even in spaces that appear anonymous. As AI systems grow more capable of analyzing vast volumes of digital content, the gap between what people believe is private and what can actually be recovered continues to narrow.
