|src: openai.com
Deliberative alignment: reasoning enables safer language models
Introducing our new alignment strategy for o1 models, which are directly taught safety specifications and how to reason over them.