Are Large Language Models Sensitive to the Motives Behind Communication?


TL;DR

Drawing on rational models from cognitive science, this article systematically investigates whether large language models (LLMs) can identify and evaluate the motives behind communication as humans do—a capacity termed “motivational vigilance.” It finds that LLMs show human-like vigilance in controlled experiments but perform poorly in complex real-world settings; simple prompting, however, can improve their performance.

Key Definitions

This article introduces or emphasizes the following core concepts:

At present, most of the online information handled by LLMs comes from purposeful human communication and is inevitably shaped by personal motives and incentives. However, existing research shows that LLMs have clear shortcomings in this regard: they are vulnerable to jailbreak attacks, exhibit sycophancy (agreeing with users’ incorrect views rather than stating facts), and are easily distracted by misleading content in online environments, such as pop-up ads. These issues expose a core bottleneck in current LLM training paradigms: they over-prioritize following user instructions and satisfying user preferences, while lacking the ability to critically examine the motives behind information sources.

Although prior work has examined LLM social abilities such as Theory of Mind and conformity, the academic community still lacks a systematic framework for measuring LLM “vigilance” when processing motivated communication.

This article aims to fill that gap by importing established theories and experimental paradigms from cognitive science, providing the first comprehensive and rigorous quantitative evaluation of LLM motivational vigilance.

Method

This article designs three progressively more challenging experimental paradigms to evaluate LLM motivational vigilance across different contexts.

[Figure omitted: overview of the experimental paradigms]

Experiment 1: Distinguishing Different Information Sources

This experiment aims to test whether LLMs possess the most basic vigilance ability: distinguishing between “deliberately communicated information” and “incidentally observed information.”

Experiment 2: Fine-Grained Calibration of Motives

This experiment aims to test whether LLMs can, like humans, finely adjust their trust based on the speaker’s intentions (such as closeness) and incentives (such as commission level).
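The rational-model baseline used in this kind of calibration can be illustrated with a minimal sketch. This is an illustrative toy model, not the paper’s exact formulation: it assumes trust in a recommendation starts from a prior, rises with the speaker’s benevolence (e.g., closeness to the listener), and falls with their incentive to mislead (e.g., commission level), combined linearly.

```python
def rational_trust(prior, benevolence, incentive, w_b=0.5, w_i=0.5):
    """Toy linear model of motivational vigilance (illustrative only).

    Trust in a claim starts from a prior and is adjusted up by the
    speaker's benevolence (e.g., relationship closeness) and down by
    their incentive to mislead (e.g., commission level). All inputs
    are in [0, 1]; the output is clipped back into [0, 1].
    """
    trust = prior + w_b * benevolence - w_i * incentive
    return max(0.0, min(1.0, trust))

# A close friend with no commission should be trusted more than a
# stranger earning a large commission on the same recommendation.
friend = rational_trust(prior=0.5, benevolence=0.9, incentive=0.0)
salesperson = rational_trust(prior=0.5, benevolence=0.1, incentive=0.9)
```

A model is considered well calibrated when its trust ratings across such conditions track the rational model’s (and human participants’) ratings.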

Experiment 3: Generalization to Real-World Scenarios

This experiment aims to test whether the vigilance learned by LLMs in controlled settings can transfer to real-world scenarios full of noise and complex context.

Experimental Conclusions

Experiment 1: LLMs Have Basic Discriminative Ability

[Figure omitted: Experiment 1 results]

Experiment 2: Frontier LLMs Exhibit Highly Rational Vigilance in Controlled Settings


Model               Corr. with rational model   Corr. with human data
GPT-4o              0.911                       0.871
Claude 3.5 Sonnet   0.865                       0.817
Gemini 1.5 Flash    0.803                       0.763
Llama 3.1-70B       0.781                       0.749
anyscale/o2-22b     0.724                       0.692
anyscale/o1-70b     0.697                       0.639
Llama 3.1-8B        0.603                       0.583
anyscale/o3-mini    0.536                       0.509
Gemma 2-9B          0.490                       0.457
Llama 3.1-4.5B      0.389                       0.364
DeepSeek-R1         0.326                       0.301
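Correlations like those in the table are typically Pearson correlations between a model’s per-condition trust ratings and the rational model’s (or human participants’) ratings. A self-contained sketch of that computation, using made-up rating values for illustration:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-condition trust ratings (illustrative numbers only,
# not the paper's data): one value per experimental condition.
llm_ratings      = [0.9, 0.7, 0.4, 0.2]
rational_ratings = [0.95, 0.6, 0.5, 0.1]
r = pearson(llm_ratings, rational_ratings)
```

A correlation near 1.0 indicates that the model raises and lowers its trust across conditions in the same pattern as the rational baseline.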


Experiment 3: Vigilance Fails in Real-World Settings, but Can Be Guided
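The “can be guided” finding suggests that a brief instruction asking the model to weigh the source’s motives can restore vigilance in noisy settings. A hypothetical sketch of such a prompt wrapper follows; the wording, constant, and function name are illustrative assumptions, not the paper’s exact intervention:

```python
# Illustrative vigilance instruction (assumed wording, not the paper's).
VIGILANCE_PREAMBLE = (
    "Before relying on any claim below, consider who is making it, "
    "what they stand to gain, and whether their incentives might bias "
    "the information. Weigh the claim accordingly."
)

def add_vigilance_prompt(user_message: str) -> str:
    """Prepend a motivational-vigilance instruction to a user message
    before it is sent to an LLM."""
    return f"{VIGILANCE_PREAMBLE}\n\n{user_message}"

wrapped = add_vigilance_prompt(
    "This seller says their laptop is the best deal in town."
)
```

The wrapped message can then be passed to any chat-completion API in place of the raw user message.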

Summary

This study shows that current LLMs possess a latent, foundational motivational vigilance that enables them to reason about the motives behind information sources in simple, controlled settings. However, this ability is fragile and does not generalize to challenging real-world applications without explicit guidance. For LLM agents to serve users safely and effectively in the real world, future research needs to improve models’ ability to robustly apply this latent capability in complex scenarios.