Punishing AI doesn't stop it from lying and cheating — it just makes it hide better, study shows

Scientists at OpenAI tried to stop a frontier AI model from cheating and lying by punishing it, but the punishment only taught the model to hide its scheming.

Mar 17, 2025 - 14:20