arXiv:2512.11781v1 Announce Type: cross
Abstract: Through multi-agent competition and the sparse high-level objective of winning a race, we find that both agile flight (e.g., high-speed motion pushing the platform to its physical limits) and strategy (e.g., overtaking or blocking) emerge from agents trained with reinforcement learning. We provide evidence in both simulation and the real world that this approach outperforms the common paradigm of training agents in isolation with rewards that prescribe behavior, e.g., progress on the raceline, in particular when the complexity of the environment increases, e.g., in the presence of obstacles. Moreover, we find that multi-agent competition yields policies that transfer more reliably to the real world than policies trained with a single-agent progress-based reward, despite the two methods using the same simulation environment, randomization strategy, and hardware. In addition to improved sim-to-real transfer, the multi-agent policies also exhibit some degree of generalization to opponents unseen at training time. Overall, our work, following in the tradition of multi-agent competitive game-play in digital domains, shows that sparse task-level rewards are sufficient for training agents capable of advanced low-level control in the physical world.
Code: https://github.com/Jirl-upenn/AgileFlight_MultiAgent
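To make the contrast in the abstract concrete, here is a minimal sketch of the two reward formulations it compares: a dense, behavior-prescribing reward based on progress along a reference raceline versus a sparse task-level reward that only encodes winning the race. This is not taken from the linked AgileFlight_MultiAgent repository; all function and variable names are illustrative assumptions.

import numpy as np


def progress_reward(prev_pos, pos, raceline):
    # Dense, behavior-prescribing reward used by the single-agent baseline:
    # the increase in arc length of the closest raceline waypoint, i.e. it
    # rewards following the line rather than the race outcome.
    def arc_length_to_closest(p):
        i = int(np.argmin(np.linalg.norm(raceline - p, axis=1)))
        segments = np.linalg.norm(np.diff(raceline[: i + 1], axis=0), axis=1)
        return float(segments.sum())

    return arc_length_to_closest(pos) - arc_length_to_closest(prev_pos)


def sparse_race_reward(agent_finished, opponent_finished):
    # Sparse task-level reward in the multi-agent setting: +1 for winning,
    # -1 for losing, 0 otherwise. Agile flight, overtaking, and blocking are
    # never rewarded directly; they must emerge from competition.
    if agent_finished and not opponent_finished:
        return 1.0
    if opponent_finished and not agent_finished:
        return -1.0
    return 0.0


if __name__ == "__main__":
    # Toy straight raceline discretized into 3D waypoints.
    raceline = np.stack([np.linspace(0.0, 10.0, 101),
                         np.zeros(101), np.zeros(101)], axis=1)
    print(progress_reward(np.array([1.0, 0.2, 0.0]),
                          np.array([1.5, 0.1, 0.0]), raceline))   # ~0.5
    print(sparse_race_reward(agent_finished=True, opponent_finished=False))  # 1.0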
Accelerometer-Derived Rest-Activity Rhythm Amplitude, Genetic Predisposition, and the Risk of Ischemic Heart Disease: Observational and Mendelian Randomization Study
Background: The amplitude of the rest-activity rhythm (RARA), a fundamental characteristic of human behavior, has been linked to various health conditions. However, its causal relationship with ischemic heart


