arXiv:2603.28499v1 Announce Type: cross
Abstract: We consider the question of how to employ next-token prediction algorithms in adversarial online decision-making environments. Specifically, if we train a next-token prediction model on a distribution $\mathcal{D}$ over sequences of opponent actions, when is it the case that the induced online decision-making algorithm (obtained by approximately best responding to the model's predictions) has low adversarial regret (i.e., when is $\mathcal{D}$ a \emph{low-regret distribution})?
For unbounded context windows (where the prediction made by the model can depend on all the actions taken by the adversary thus far), we show that although not every distribution $\mathcal{D}$ is a low-regret distribution, every distribution $\mathcal{D}$ is exponentially close (in total variation distance) to some low-regret distribution, and hence sublinear regret can always be achieved at negligible cost to the accuracy of the original next-token prediction model. In contrast, for bounded context windows (where the prediction made by the model can depend only on the past $w$ actions taken by the adversary, as may be the case in modern transformer architectures), we show that there are some distributions $\mathcal{D}$ of opponent play that are $\Theta(1)$-far from any low-regret distribution $\mathcal{D}'$ (even when $w = \Omega(T)$ and such distributions exist). Finally, we complement these results by showing that the unbounded-context robustification procedure can be implemented by layers of a standard transformer architecture, and provide empirical evidence that transformer models can be efficiently trained to represent these new low-regret distributions.
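To make the setup concrete, the following is a minimal illustrative sketch (not from the paper) of how a next-token prediction model over opponent actions induces an online decision-making algorithm: at each round the learner best-responds to the model's predicted distribution over the opponent's next action, and adversarial regret is then measured against the best fixed action in hindsight. The toy predictor here (empirical action frequencies) and the matching-payoff game are hypothetical choices for illustration; a bounded-context variant with window $w$ would replace `history` with `history[-w:]`.

```python
import random

random.seed(0)
T = 1000
ACTIONS = [0, 1]  # binary action space for learner and opponent

def payoff(a, b):
    """Learner earns 1 if its action matches the opponent's, else 0."""
    return 1.0 if a == b else 0.0

def predict(history):
    """Toy next-token predictor: empirical frequency of opponent actions so far.
    (A bounded-context model of window w would use history[-w:] instead.)"""
    if not history:
        return [0.5, 0.5]
    p1 = sum(history) / len(history)
    return [1.0 - p1, p1]

# Biased i.i.d. opponent play, standing in for a draw from some distribution D.
opponent = [int(random.random() < 0.7) for _ in range(T)]

history, achieved = [], 0.0
for t in range(T):
    probs = predict(history)
    # Best response to the predicted distribution over the opponent's action.
    a = max(ACTIONS, key=lambda x: sum(p * payoff(x, b) for b, p in enumerate(probs)))
    b = opponent[t]
    achieved += payoff(a, b)
    history.append(b)

# External regret: payoff of the best fixed action in hindsight minus achieved payoff.
best_fixed = max(sum(payoff(a, b) for b in opponent) for a in ACTIONS)
regret = best_fixed - achieved
```

Against this (non-adversarial) opponent the predictor is accurate, so the induced algorithm's regret is small; the paper's question is when such regret guarantees survive against adversarial sequences.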
Assessing nurses’ attitudes toward artificial intelligence in Kazakhstan: psychometric validation of a nine-item scale
Background: Artificial intelligence (AI) is increasingly integrated into healthcare, yet the attitudes and knowledge of nurses, who are the key mediators of AI implementation, remain underexplored.


