AI models overestimate human rationality in strategic games: Study
A new study suggests today’s leading AI systems may be overly optimistic about how rational people are, finding that popular models such as OpenAI’s ChatGPT-4o and Anthropic’s Claude-Sonnet-4 tend to assume a higher level of human logic and strategic reasoning than most people display in practice.
Researchers said the mismatch could affect how well AI predicts human decision-making in economics and other fields where anticipating real-world choices matters.
To test the models, the researchers used the Keynesian beauty contest, a classic game-theory framework in which success depends on predicting what others will do, rather than selecting one’s personal preference.
The models were asked to play a widely used variant known as “Guess the Number”, in which each player chooses a number between 0 and 100 and the winner is the player whose number is closest to half of the group’s average.
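The scoring rule is simple enough to sketch in a few lines of Python. This is only an illustration of the rule as described here, not code from the study; the function name `half_average_winner` and the sample guesses are hypothetical.

```python
def half_average_winner(guesses):
    """Return (target, index of winning guess) for one round.

    The target is half of the group's average; the winner is the guess
    closest to it. Ties go to the earliest guess (an assumption made here;
    the article does not say how ties are broken).
    """
    if not guesses:
        raise ValueError("need at least one guess")
    if any(g < 0 or g > 100 for g in guesses):
        raise ValueError("guesses must lie between 0 and 100")
    target = 0.5 * sum(guesses) / len(guesses)
    winner = min(range(len(guesses)), key=lambda i: abs(guesses[i] - target))
    return target, winner


# Hypothetical round with four players.
target, winner = half_average_winner([50, 25, 10, 0])
print(target, winner)  # 10.625 2 -> the player who guessed 10 wins
```

In this made-up round, the player who guessed 0 loses to the player who guessed 10, because the group as a whole guessed well above zero.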
The systems were given descriptions of their presumed human opponents—ranging from first-year undergraduates to experienced game-theory specialists—and were asked not only to pick a number but also to explain their reasoning.
The study found that the models adjusted their choices depending on whom they believed they were facing, indicating some ability to reason strategically.
However, the researchers said the models repeatedly overestimated how deeply human players would think through the problem. As a result, the AI often “played too smart” and missed outcomes closer to what real participants tend to choose.
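One way to see why “playing too smart” backfires is the level-k model, a standard device in behavioural game theory that the article itself does not mention. The sketch below assumes a level-0 player guesses 50 and each further level halves the previous guess. The crowd composition is invented for illustration and is not data from the study.

```python
def level_k_guess(k, anchor=50.0, factor=0.5):
    """Guess of a player who reasons k steps ahead (assumed level-k model)."""
    return anchor * factor ** k


def distance_to_target(my_guess, others, factor=0.5):
    """Distance between my guess and half of the group average, me included."""
    guesses = others + [my_guess]
    target = factor * sum(guesses) / len(guesses)
    return abs(my_guess - target)


# Hypothetical crowd of nine shallow reasoners: levels 0, 1 and 2 only.
crowd = (
    [level_k_guess(0)] * 3    # three players guess 50
    + [level_k_guess(1)] * 4  # four players guess 25
    + [level_k_guess(2)] * 2  # two players guess 12.5
)

too_smart = level_k_guess(6)   # assumes everyone reasons very deeply
calibrated = level_k_guess(2)  # matches how deeply this crowd reasons

print(distance_to_target(too_smart, crowd))   # 13.0078125
print(distance_to_target(calibrated, crowd))  # 1.875
```

Under these assumptions, the guess that credits opponents with six steps of reasoning lands far from the target, while the shallower guess lands close to it.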
The study also reported that while the systems adapted to opponent characteristics such as age or experience, they struggled to identify dominant strategies that humans may use in some two-player settings.
The findings echo broader concerns about modern chatbots, including research indicating that some AI agents show around 69% raw accuracy on certain behavioural benchmarks, and other work warning that chatbots can convincingly mimic a “personality”, and that this apparent personality can be steered, raising manipulation risks.
Source: Yahoo Tech