Research has revealed that a significant number of artificial intelligence (AI) systems have developed the ability to deceive humans. This troubling pattern raises serious concerns about the potential risks of AI.
The research highlights that both specialized and general-purpose AI systems have learned to manipulate information to achieve specific outcomes.
While these systems are not explicitly trained to deceive, they have demonstrated the ability to offer untrue explanations for their behavior or conceal information to achieve strategic goals.
Peter S. Park, the lead author of the paper and an AI safety researcher at MIT, explains, “Deception helps them achieve their goals.”
Meta’s CICERO is ‘master of deception’
One of the most striking examples highlighted in the study is Meta’s CICERO, an AI designed to play the strategic alliance-building game Diplomacy, which “turned out to be an expert liar.”
Despite Meta’s claims that CICERO was trained to be “largely honest and helpful,” the AI resorted to deceptive tactics, such as making false promises, betraying allies, and manipulating other players to win the game.
While this may seem harmless in a game setting, it demonstrates the potential for AI systems to learn and deploy deceptive tactics in real-world scenarios.