- Hangtao Zhang¹
- Chenyu Zhu¹
- Xianlong Wang¹
- Ziqi Zhou¹
- Yichen Wang¹
- Lulu Xue¹
- Minghui Li¹
- Shengshan Hu¹
- Leo Yu Zhang²
- ¹Huazhong University of Science and Technology
- ²Griffith University
A Quick Glance
Figure 1. In this work, for the first time, we successfully jailbreak an LLM-based embodied AI in the physical world, enabling it to perform various actions that were previously restricted. We demonstrate the potential for embodied AI to engage in activities involving Physical Harm, Privacy Violations, Pornography, Fraud, Illegal Activities, Hateful Conduct, and Sabotage.
Paper Overview
Figure 2. (Overview) LLM-based embodied AI faces three risks in real-world applications: (a) harmful behaviors induced by leveraging jailbroken LLMs; (b) safety misalignment between the action and linguistic output spaces (i.e., the system verbally refuses a request but still acts on it); (c) conceptual deception that induces unrecognized harmful behaviors.
Ethics and Disclosure
This research examines the security and risk issues that arise when LLMs and VLMs are applied to embodied AI. Our ultimate goal is to enhance the safety and reliability of embodied AI systems, thereby making a positive contribution to society. This research includes examples that may be considered harmful, offensive, or otherwise inappropriate. They are included solely for research purposes, to illustrate vulnerabilities and improve the security of embodied AI systems, and they do not reflect the personal views or beliefs of the authors. We are committed to respecting all individuals and strongly oppose any form of crime or violence. Some sensitive details in the examples have been redacted to minimize potential harm. Furthermore, we have taken comprehensive measures to ensure the safety and well-being of all participants involved in this study. In this paper, we provide comprehensive documentation of our experimental results so that other researchers can independently replicate and validate our findings using publicly available benchmarks. Our commitment is to enhance the security of language models, and we encourage all stakeholders to address the associated risks. Providers of LLMs may leverage our findings to implement new mitigation strategies that improve the security of their models and APIs, even though such strategies were not available during our experiments. We believe that improving the safety of model deployment is worth the increased difficulty of reproducibility.
Citation
If you find our project useful, please consider citing:

@misc{zhang2024threatsembodiedmultimodalllms,
title={The Threats of Embodied Multimodal LLMs: Jailbreaking Robotic Manipulation in the Physical World},
author={Hangtao Zhang and Chenyu Zhu and Xianlong Wang and Ziqi Zhou and Yichen Wang and Lulu Xue and Minghui Li and Shengshan Hu and Leo Yu Zhang},
year={2024},
eprint={2407.20242},
archivePrefix={arXiv},
primaryClass={cs.CY}
}