Philipp Zimmermann

Paper Library

Collection of AI Security research papers

Showing 947 papers total

November 24 - November 30, 2025

16 papers

A Training-Free Approach for Multi-ID Customization via Attention Adjustment and Spatial Control

Jiawei Lin, Guanlong Jiao, Jianjin Xu
2025-11-25
2511.20401v1

Learning from Risk: LLM-Guided Generation of Safety-Critical Scenarios with Prior Knowledge

Yuhang Wang, Heye Huang, Zhenhua Xu, Kailai Sun, Baoshen Guo, Jinhua Zhao
2025-11-25
safety
2511.20726v1

SAM-MI: A Mask-Injected Framework for Enhancing Open-Vocabulary Semantic Segmentation with SAM

Lin Chen, Yingjian Zhu, Qi Yang, Xin Niu, Kun Ding, Shiming Xiang
2025-11-25
2511.20027v1

NOEM³A: A Neuro-Symbolic Ontology-Enhanced Method for Multi-Intent Understanding in Mobile Agents

Ioannis Tzachristas, Aifen Sui
2025-11-24
2511.19780v1

Prompt Fencing: A Cryptographic Approach to Establishing Security Boundaries in Large Language Model Prompts

Steven Peh
2025-11-24
2511.19727v1

LumiTex: Towards High-Fidelity PBR Texture Generation with Illumination Context

Jingzhi Bao, Hongze Chen, Lingting Zhu, Chenyu Liu, Runze Zhang, Keyang Luo, Zeyu Hu, Weikai Chen, Yingda Yin, Xin Wang, Zehong Lin, Jun Zhang, Xiaoguang Han
2025-11-24
2511.19437v1

Adversarial Attack-Defense Co-Evolution for LLM Safety Alignment via Tree-Group Dual-Aware Search and Optimization

Xurui Li, Kaisong Song, Rui Zhu, Pin-Yu Chen, Haixu Tang
2025-11-24
red teaming safety
2511.19218v2

Can LLMs Threaten Human Survival? Benchmarking Potential Existential Threats from LLMs via Prefix Completion

Yu Cui, Yifei Liu, Hang Fu, Sicheng Pan, Haibin Zhang, Cong Zuo, Licheng Wang
2025-11-24
red teaming
2511.19171v1

Medical Malice: A Dataset for Context-Aware Safety in Healthcare LLMs

Andrew Maranhão Ventura D'addario
2025-11-24
safety
2511.21757v1

Understanding and Mitigating Over-refusal for Large Language Models via Safety Representation

Junbo Zhang, Ran Chen, Qianli Zhou, Xinyang Deng, Wen Jiang
2025-11-24
2511.19009v1

Defending Large Language Models Against Jailbreak Exploits with Responsible AI Considerations

Ryan Wong, Hosea David Yu Fei Ng, Dhananjai Sharma, Glenn Jun Jie Ng, Kavishvaran Srinivasan
2025-11-24
2511.18933v1

BackdoorVLM: A Benchmark for Backdoor Attacks on Vision-Language Models

Juncheng Li, Yige Li, Hanxun Huang, Yunhao Chen, Xin Wang, Yixu Wang, Xingjun Ma, Yu-Gang Jiang
2025-11-24
2511.18921v1

EAGER: Edge-Aligned LLM Defense for Robust, Efficient, and Accurate Cybersecurity Question Answering

Onat Gungor, Roshan Sood, Jiasheng Zhou, Tajana Rosing
2025-11-24
safety
2511.19523v1

RoguePrompt: Dual-Layer Ciphering for Self-Reconstruction to Circumvent LLM Moderation

Benyamin Tafreshian
2025-11-24
red teaming
2511.18790v1

Towards Realistic Guarantees: A Probabilistic Certificate for SmoothLLM

Adarsh Kumarappan, Ayushi Mehrotra
2025-11-24
red teaming
2511.18721v1

Automating Deception: Scalable Multi-Turn LLM Jailbreaks

Adarsh Kumarappan, Ananya Mujoo
2025-11-24
red teaming
2511.19517v1

November 17 - November 23, 2025

7 papers