← Back to Library

Ignore All Previous Instructions: Jailbreaking as a de-escalatory peace building practise to resist LLM social media bots

Authors: Huw Day, Adrianna Jezierska, Jessica Woodgate

Published: 2026-03-02

arXiv ID: 2603.01942v1

Added to Library: 2026-03-03 05:00 UTC

📄 Abstract

Large Language Models have intensified the scale and strategic manipulation of political discourse on social media, leading to conflict escalation. The existing literature largely focuses on platform-led moderation as a countermeasure. In this paper, we propose a user-centric view of "jailbreaking" as an emergent, non-violent de-escalation practice. Online users engage with suspected LLM-powered accounts to circumvent large language model safeguards, exposing automated behaviour and disrupting the circulation of misleading narratives.

🤖 AI Analysis

AI analysis is not available for this paper. This may be because the paper was not deemed relevant for AI security topics, or the analysis failed during processing.

📚 Read the Full Paper