arXiv:2604.06779v2 Announce Type: replace
Abstract: Sequential Monte Carlo (SMC) samplers for reward-guided diffusion models often suffer from rapid lineage collapse: a few high-reward particles dominate the population within a handful of resampling steps, destroying diversity and degrading sample quality. We propose a variance-decomposition framework for reward-guided diffusion SMC that separates continuation variance $V_t^{\mathrm{cont}}$ from residual variance $V_t^{\mathrm{res}}$, revealing that the high offspring-count variance of the commonly used multinomial resampling drives this collapse. This motivates VASR (Variance-Aware Systematic Resampling), which addresses both variance terms via variance-optimal mass allocation $m_t \propto w_t e^{r_t}$ (minimizing $V_t^{\mathrm{cont}}$) and systematic resampling (controlling $V_t^{\mathrm{res}}$). For latent diffusion models, where intermediate rewards are noisy due to stochastic continuations, we propose VASR-Max, a deliberately biased high-selection variant for variance-sensitive reward optimization. Both methods are training-free, fully parallelizable, and add only linear overhead. On MNIST and CIFAR-10, VASR achieves up to $26\%$ better FID than prior SMC methods while remaining 66 times faster than MCTS-based value methods at matched compute. On text-to-image generation, VASR-Max consistently outperforms the strongest SMC baseline across compute budgets and matches MCTS-based methods within 2.5–3% reward at high budgets while being approximately times faster.
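The two ingredients named in the abstract, mass allocation $m_t \propto w_t e^{r_t}$ and systematic resampling, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `allocation_masses` and `systematic_resample` are hypothetical, and the sketch assumes that $w_t$ and $r_t$ are per-particle positive weights and scalar rewards. VASR-Max is not covered, since the abstract does not specify its selection rule.

```python
import math
import random


def allocation_masses(weights, rewards):
    """Normalized masses m_i proportional to w_i * exp(r_i).

    Assumes every weight is strictly positive (math.log fails otherwise).
    """
    # Work in log space and subtract the maximum for numerical stability.
    logs = [math.log(w) + r for w, r in zip(weights, rewards)]
    top = max(logs)
    unnorm = [math.exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def systematic_resample(masses, rng=random):
    """Return N ancestor indices using a single shared uniform offset.

    The N sampling points u, u + 1/N, ..., u + (N-1)/N sweep the cumulative
    mass once, so particle i receives either floor(N * m_i) or
    ceil(N * m_i) offspring (up to floating-point rounding).
    """
    n = len(masses)
    u = rng.random() / n  # one draw in [0, 1/n)
    indices = []
    cumulative = masses[0]
    i = 0
    for k in range(n):
        point = u + k / n
        # Advance to the particle whose cumulative interval contains point.
        while point >= cumulative and i < n - 1:
            i += 1
            cumulative += masses[i]
        indices.append(i)
    return indices
```

A typical call would be `systematic_resample(allocation_masses(weights, rewards))`. Because all sampling points share one random offset, offspring counts vary far less than under multinomial resampling, which draws each ancestor independently; the single linear pass is also consistent with the linear overhead the abstract mentions.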


