
Apr 19, 2024|src: openai.com

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

Today's LLMs are susceptible to prompt injections, jailbreaks, and other attacks that allow adversaries to overwrite a model's original instructions with their own malicious prompts.
