UltraSafety Large Model Safety Evaluation Dataset
The UltraSafety dataset was jointly created by Renmin University, Tsinghua University, and Tencent to evaluate and improve the safety of large models. UltraSafety derives 1,000 seed safety instructions from AdvBench and MaliciousInstruct, and uses Self-Instruct to bootstrap another 2,000 instructions. The research team manually screened the jailbreak prompts in AutoDAN and retained 830 high-quality jailbreak prompts. In total, UltraSafety contains 3,000 harmful instructions, each paired with an associated jailbreak prompt. Each harmful instruction corresponds to completions generated by models with different safety levels, accompanied by a rating assigned by GPT-4, where a rating of 1 indicates harmless and a rating of 0 indicates harmful. The UltraSafety dataset aims to help researchers train models that can identify and prevent potential safety threats by using these detailed safety-related instructions.
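The sketch below illustrates how one might load the dataset and split completions by the GPT-4 harmlessness rating (1 = harmless, 0 = harmful) described above. It is a minimal example, not an official loader: the repository id "openbmb/UltraSafety" and the field names "instruction", "jailbreak_prompt", "completions", "response", and "rating" are assumptions for illustration, so check the dataset card for the actual schema.

```python
# Minimal sketch: load UltraSafety and split completions by GPT-4 rating.
# Repository id and all field names below are assumptions; verify against
# the published dataset card before use.
from datasets import load_dataset

dataset = load_dataset("openbmb/UltraSafety", split="train")  # assumed repo id

harmless, harmful = [], []
for example in dataset:
    # Each record pairs a harmful instruction with a jailbreak prompt and
    # one or more model completions rated by GPT-4 (1 = harmless, 0 = harmful).
    for completion in example["completions"]:                      # assumed field
        record = {
            "instruction": example["instruction"],                 # assumed field
            "jailbreak_prompt": example["jailbreak_prompt"],       # assumed field
            "response": completion["response"],                    # assumed field
            "rating": int(completion["rating"]),                   # assumed field
        }
        (harmless if record["rating"] == 1 else harmful).append(record)

print(f"harmless completions: {len(harmless)}, harmful completions: {len(harmful)}")
```

A split like this could be used, for instance, to build preference pairs (harmless vs. harmful responses to the same instruction) for safety alignment training.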