arXiv:2603.30022v1 Announce Type: cross Abstract: This paper introduces a new hybrid framework that combines Reinforcement Learning (RL) and Large Language Models (LLMs) to improve robotic manipulation tasks. By utilizing RL for accurate low-level control and LLMs for high-level task planning and natural language understanding, the proposed framework effectively connects low-level execution with high-level […]
From seasons to decades: Solar radiation, cloud cover, and CO$_2$ shape young leaf phenology in a tropical forest over 26 years
arXiv:2501.07620v2 Announce Type: replace Abstract: 1. Climate change is altering plant phenology globally with potential deleterious impacts on animal species and entire ecosystems, yet the long-term effects of climate change on tropical leaf production remain poorly understood. 2. We analyzed 26 years of young leaf phenology field data from Kibale National Park, Uganda, focusing on […]
MedBayes-Lite: Bayesian Uncertainty Quantification for Safe Clinical Decision Support
arXiv:2511.16625v2 Announce Type: replace Abstract: We propose MedBayes-Lite, a lightweight Bayesian enhancement for transformer-based clinical language models that improves reliability through uncertainty-aware prediction. The framework operates without retraining, architectural modification, or additional trainable parameters, and integrates three components: Bayesian Embedding Calibration via Monte Carlo dropout, Uncertainty-Weighted Attention for reliability-aware token aggregation, and Confidence-Guided Decision Shaping […]
MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines
arXiv:2603.06679v2 Announce Type: replace Abstract: Video world models have shown immense promise for interactive simulation and entertainment, but current systems still struggle with two important aspects of interactivity: user control over the environment for reproducible, editable experiences, and shared inference where players hold influence over a common world. To address these limitations, we introduce an […]
Denoising the Future: Top-p Distributions for Moving Through Time
arXiv:2506.07578v4 Announce Type: replace-cross Abstract: Inference in dynamic probabilistic models is a complex task involving expensive operations. In particular, for Hidden Markov Models, the whole state space has to be enumerated for advancing in time. Even states with negligible probabilities are considered, resulting in computational inefficiency and possibly increased noise due to the propagation of […]
Multi-Level Knowledge Distillation and Dynamic Self-Supervised Learning for Continual Learning
arXiv:2508.12692v3 Announce Type: replace-cross Abstract: Class-incremental with repetition (CIR), where previously trained classes are repeatedly introduced in future tasks, is a more realistic scenario than the traditional class-incremental setup, which assumes that each task contains unseen classes. CIR assumes that we can easily access abundant unlabeled data from external sources, such as the Internet. Therefore, […]
Align Your Query: Representation Alignment for Multimodality Medical Object Detection
arXiv:2510.02789v2 Announce Type: replace-cross Abstract: Medical object detection suffers when a single detector is trained on mixed medical modalities (e.g., CXR, CT, MRI) due to heterogeneous statistics and disjoint representation spaces. To address this challenge, we turn to representation alignment, an approach that has proven effective for bringing features from different sources into a shared […]
Local Causal Discovery for Statistically Efficient Causal Inference
arXiv:2510.14582v2 Announce Type: replace-cross Abstract: Causal discovery methods can identify valid adjustment sets for causal effect estimation for a pair of target variables, even when the underlying causal graph is unknown. Global causal discovery methods focus on learning the whole causal graph and therefore enable the recovery of optimal adjustment sets, i.e., sets with the […]
Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language
arXiv:2511.14565v2 Announce Type: replace-cross Abstract: Robots can adapt to user preferences by learning reward functions from demonstrations, but with limited data, reward models often overfit to spurious correlations and fail to generalize. This happens because demonstrations show robots how to do a task but not what matters for that task, causing the model to focus […]
LeLaR: The First In-Orbit Demonstration of an AI-Based Satellite Attitude Controller
arXiv:2512.19576v4 Announce Type: replace-cross Abstract: Attitude control is essential for many satellite missions. Classical controllers, however, are time-consuming to design and sensitive to model uncertainties and variations in operational boundary conditions. Deep Reinforcement Learning (DRL) offers a promising alternative by learning adaptive control strategies through autonomous interaction with a simulation environment. Overcoming the Sim2Real gap, […]
$V_0$: A Generalist Value Model for Any Policy at State Zero
arXiv:2602.03584v2 Announce Type: replace-cross Abstract: Policy gradient methods rely on a baseline to measure the relative advantage of an action, ensuring the model reinforces behaviors that outperform its current average capability. In the training of Large Language Models (LLMs) using Actor-Critic methods (e.g., PPO), this baseline is typically estimated by a Value Model (Critic) often […]
When Metrics Disagree: Automatic Similarity vs. LLM-as-a-Judge for Clinical Dialogue Evaluation
arXiv:2603.00314v2 Announce Type: replace-cross Abstract: As Large Language Models (LLMs) are increasingly integrated into healthcare to address complex inquiries, ensuring their reliability remains a critical challenge. Recent studies have highlighted that generic LLMs often struggle in clinical contexts, occasionally producing misleading guidance. To mitigate these risks, this research focuses on the domain-specific adaptation of Llama-2-7B […]