Scalable Multi-Objective and Meta Reinforcement Learning via Gradient Estimation
arXiv:2511.12779v2 Announce Type: replace-cross
Abstract: We study the problem of efficiently estimating policies that simultaneously optimize multiple objectives in reinforcement learning (RL). Given $n$ objectives


