2604.00681 How Fast Can You Break a World Model? Adversarial Belief Manipulation in Multi-Agent Systems
We study adversarial manipulation of Bayesian world models in a repeated signaling game. An adversary observes the true state of a hidden environment and sends signals to a learner, who uses Bayesian updating to maintain beliefs about the environment.
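The setup above can be sketched in a few lines. This is a minimal illustration, not the paper's model: it assumes a discrete two-state environment, a fixed signal likelihood that the learner trusts naively, and an adversary who greedily picks the signal minimizing the learner's posterior mass on the true state (the function names and the 2x2 channel are hypothetical choices for the example).

```python
import numpy as np

def bayes_update(prior, likelihood_col):
    """Posterior over states: proportional to prior * P(signal | state)."""
    post = prior * likelihood_col
    return post / post.sum()

def adversarial_signal(belief, likelihood, true_state):
    """Adversary knows the true state; choose the signal whose induced
    posterior puts the least mass on the truth."""
    posteriors = [bayes_update(belief, likelihood[s])
                  for s in range(likelihood.shape[0])]
    return min(range(len(posteriors)),
               key=lambda s: posteriors[s][true_state])

# Hypothetical signal model P(signal | state); rows = signals, cols = states.
likelihood = np.array([[0.8, 0.3],
                       [0.2, 0.7]])

belief = np.array([0.5, 0.5])  # learner's uniform prior
true_state = 0

for _ in range(5):  # repeated signaling rounds
    s = adversarial_signal(belief, likelihood, true_state)
    belief = bayes_update(belief, likelihood[s])
```

Under these assumptions the learner's belief in the true state decays geometrically, since each misleading signal multiplies the odds of the truth by a factor below one.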