AI models, specifically large language models, are improving at tasks designed to test theory of mind, the ability to track other people's mental states. While these models do not understand human emotions, they show growing competence at inferring and reasoning about others' thoughts and intentions, performing well on tasks involving indirect requests, misdirection, and false beliefs. Performance varies across models, with some outperforming humans on certain tests. It is important to note that these models are not demonstrating a true theory of mind; rather, they are getting better at making mentalistic inferences.
| Signal | Change | 10-year horizon | Driving force |
|---|---|---|---|
| AI models surpass humans in theory-of-mind tests | AI models performing better on human-like tasks | AI models appear more empathetic and useful | Improvement in AI models and training data |
| AI companions becoming more naturalistic in interactions | AI assistants deliver smoother and more natural responses | AI assistants appear more human-like in their interactions | Desire to create more realistic AI companions |
| Concerns about attributing a theory of mind to AI | Recognition that AI models do not have a true theory of mind | Awareness that AI models may not have true understanding | Ethical and philosophical considerations |
| Psychological tests included in AI training data | AI models perform well on established psychological tests | AI models can excel in familiar test scenarios | Inclusion of established tests in training data |
| Anticipation of reaching the limits of benchmarks | Realization that benchmarks may become less useful | Need for new methods to evaluate AI capabilities | Evolving understanding of AI model capabilities |