arXiv:2601.20554v1 Announce Type: new
Abstract: We study risk-sensitive planning under partial observability using the dynamic risk measure Iterated Conditional Value-at-Risk (ICVaR). A policy evaluation algorithm for ICVaR is developed with finite-time performance guarantees that do not depend on the cardinality of the action space. Building on this foundation, three widely used online planning algorithms (Sparse Sampling, Particle Filter Trees with Double Progressive Widening (PFT-DPW), and Partially Observable Monte Carlo Planning with Observation Widening (POMCPOW)) are extended to optimize the ICVaR value function rather than the expectation of the return. Our formulations introduce a risk parameter $\alpha$, where $\alpha = 1$ recovers standard expectation-based planning and $\alpha < 1$ induces risk aversion that increases as $\alpha$ decreases. For ICVaR Sparse Sampling, we establish finite-time performance guarantees under the risk-sensitive objective, which further enable a novel exploration strategy tailored to ICVaR. Experiments on benchmark POMDP domains demonstrate that the proposed ICVaR planners achieve lower tail risk than their risk-neutral counterparts.
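To make the role of the risk parameter concrete, the following is a minimal sketch of an ICVaR-style backup, not the paper's algorithm. It assumes a reward convention (the worst outcomes are the lowest values), a finite discrete distribution at each step, and a small fully observable Markov reward chain in place of a belief-space POMDP. The names `cvar_lower`, `icvar_value`, `reward`, and `transitions` are hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def cvar_lower(values: Sequence[float], probs: Sequence[float], alpha: float) -> float:
    """CVaR of a discrete distribution under a reward convention (assumption):
    the average over the worst (lowest-valued) alpha fraction of probability mass."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    remaining = alpha  # probability mass still to be taken from the lower tail
    total = 0.0
    for value, prob in sorted(zip(values, probs)):  # ascending: worst outcomes first
        take = min(prob, remaining)
        total += take * value
        remaining -= take
        if remaining <= 1e-12:
            break
    return total / alpha


def icvar_value(
    state: str,
    horizon: int,
    alpha: float,
    reward: Dict[str, float],
    transitions: Dict[str, List[Tuple[str, float]]],
    gamma: float = 1.0,
) -> float:
    """Nested backup: immediate reward plus the CVaR of the next-step ICVaR values.
    Applying CVaR at every step, rather than once to the total return, is what
    makes the risk measure 'iterated'."""
    if horizon == 0:
        return 0.0
    next_states, probs = zip(*transitions[state])
    next_values = [
        icvar_value(s, horizon - 1, alpha, reward, transitions, gamma)
        for s in next_states
    ]
    return reward[state] + gamma * cvar_lower(next_values, probs, alpha)


# Hypothetical toy chain: a 10% chance of entering an absorbing "crash" state.
reward = {"start": 0.0, "safe": 1.0, "crash": -10.0}
transitions = {
    "start": [("safe", 0.9), ("crash", 0.1)],
    "safe": [("safe", 1.0)],
    "crash": [("crash", 1.0)],
}

for alpha in (1.0, 0.5, 0.1):
    print(alpha, icvar_value("start", 2, alpha, reward, transitions))
```

On this toy chain with horizon 2, the value at `start` is about -0.1 for $\alpha = 1$ (the plain expectation), about -1.2 for $\alpha = 0.5$, and -10 for $\alpha = 0.1$, so the evaluation weights the rare crash more heavily as $\alpha$ decreases.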



