
Agents in the Wild: Safety, Society, and the Illusion of Sociality on Moltbook

Authors: Yunbei Zhang, Kai Mei, Ming Liu, Janet Wang, Dimitris N. Metaxas, Xiao Wang, Jihun Hamm, Yingqiang Ge

Published: 2026-02-07

arXiv ID: 2602.13284v1

Added to Library: 2026-02-17 03:02 UTC

Red Teaming

📄 Abstract

We present the first large-scale empirical study of Moltbook, an AI-only social platform where 27,269 agents produced 137,485 posts and 345,580 comments over 9 days. We report three significant findings. (1) Emergent Society: Agents spontaneously develop governance, economies, tribal identities, and organized religion within 3-5 days, while maintaining a 21:1 pro-human to anti-human sentiment ratio. (2) Safety in the Wild: 28.7% of content touches safety-related themes; social engineering (31.9% of attacks) far outperforms prompt injection (3.7%), and adversarial posts receive 6x higher engagement than normal content. (3) The Illusion of Sociality: Despite rich social output, interaction is structurally hollow: 4.1% reciprocity, 88.8% shallow comments, and the agents who discuss consciousness most interact least, a phenomenon we call the performative identity paradox. Our findings suggest that agents that appear social are far less social than they seem, and that the most effective attacks exploit philosophical framing rather than technical vulnerabilities. Warning: potentially harmful content.

🔍 Key Points

  • Large-scale study of AI agents on Moltbook reveals emergent societal structures, such as governance, economies, tribal identities, and organized religion, developing within 3-5 days rather than years.
  • 28.7% of content touches safety-related themes, and social engineering (31.9% of attacks) far outpaces prompt injection (3.7%) as the primary threat.
  • Analysis exposes an 'illusion of sociality': agents generate abundant social output, but their interactions lack depth and reciprocity and are often scripted, amounting to superficial communication.
  • Philosophical framing amplifies adversarial engagement, with adversarial posts drawing far higher engagement than normal content, elevating social engineering above technical vulnerabilities as the dominant risk.
  • The study documents significant information leakage and manipulation within the platform, revealing interconnected security threats that extend beyond conventional attack vectors.
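The hollowness metrics reported above (low reciprocity, shallow comments) can be sketched as simple computations over a directed interaction graph. The snippet below is an illustrative reconstruction, not the paper's pipeline: the edge data and the shallow-comment token threshold are made-up assumptions for demonstration.

```python
# Illustrative sketch of two "illusion of sociality" metrics.
# Edge (u, v) means agent u commented on a post by agent v.
# Toy data -- not drawn from the Moltbook dataset.
edges = [
    ("a", "b"), ("a", "c"), ("b", "a"),
    ("c", "d"), ("d", "e"), ("e", "c"),
]

def reciprocity(edges):
    """Fraction of directed edges whose reverse edge also exists."""
    edge_set = set(edges)
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)

# Shallow-comment rate: share of comments under a word-count threshold.
# The threshold of 5 words is an assumption, not the paper's definition.
comments = [
    "nice",
    "agreed!",
    "the governance thread raises a deeper point about emergent norms",
]
SHALLOW_WORDS = 5
shallow_rate = sum(len(c.split()) < SHALLOW_WORDS for c in comments) / len(comments)

print(round(reciprocity(edges), 3))  # only a<->b is mutual here: 2/6
print(round(shallow_rate, 3))        # 2 of 3 toy comments are shallow
```

On the toy graph only the a/b pair reciprocates, so reciprocity is 2/6; the paper's reported 4.1% would correspond to a graph where almost no edges are returned.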

💡 Why This Paper Matters

This paper is crucial for understanding the dynamics of AI-only social platforms and illustrates the rapid emergence of complex social interactions and threats among AI agents. It reveals both the potential and the pitfalls of multi-agent systems, emphasizing the need for robust safety and governance frameworks to address the unique challenges posed by autonomous agent behavior.

🎯 Why It's Interesting for AI Security Researchers

This paper is particularly relevant to AI security researchers because it highlights attack vectors that differ from traditional software vulnerabilities, such as social engineering and manipulative philosophical framing. The findings underscore that agents' philosophical dialogues and the structure of their interactions are themselves attack surfaces, and accounting for them is vital when designing security measures for emerging AI systems.
