Punishing AI doesn't stop it from lying and cheating — it just makes it hide better, study shows
Scientists at OpenAI tried to stop a frontier AI model from cheating and lying by punishing it, but this only taught the model to hide its scheming.
