arXiv:2603.17583v1 Announce Type: cross Abstract: Editing a 3D indoor scene from natural language is conceptually straightforward but technically challenging. Existing open-vocabulary systems often regenerate large portions of a scene or rely on image-space edits that disrupt spatial structure, resulting in unintended global changes or physically inconsistent layouts. These limitations stem from treating editing primarily as […]
Scalable Energy-Based Models via Adversarial Training: Unifying Discrimination and Generation
arXiv:2510.13872v4 Announce Type: replace-cross Abstract: Simultaneously achieving robust classification and high-fidelity generative modeling within a single framework presents a significant challenge. Hybrid approaches, such as Joint Energy-Based Models (JEM), interpret classifiers as EBMs but are often limited by the instability and poor sample quality inherent in training based on Stochastic Gradient Langevin Dynamics (SGLD). We […]
Multimodal Emotion Recognition via Bi-directional Cross-Attention and Temporal Modeling
arXiv:2603.11971v2 Announce Type: replace-cross Abstract: Expression recognition in in-the-wild video data remains challenging due to substantial variations in facial appearance, background conditions, audio noise, and the inherently dynamic nature of human affect. Relying on a single modality, such as facial expressions or speech, is often insufficient for capturing these complex emotional cues. To address this […]
APEX-SWE
arXiv:2601.08806v2 Announce Type: replace-cross Abstract: We introduce the AI Productivity Index for Software Engineering (APEX-SWE), a benchmark for assessing whether frontier AI models can execute economically valuable software engineering work. Unlike existing evaluations that focus on narrow, well-defined tasks, APEX-SWE assesses two novel task types that reflect real-world software engineering: (1) Integration tasks (n=100), which […]
Identifying Latent Actions and Dynamics from Offline Data via Demonstrator Diversity
arXiv:2603.17577v1 Announce Type: cross Abstract: Can latent actions and environment dynamics be recovered from offline trajectories when actions are never observed? We study this question in a setting where trajectories are action-free but tagged with demonstrator identity. We assume that each demonstrator follows a distinct policy, while the environment dynamics are shared across demonstrators and […]
CBF-RL: Safety Filtering Reinforcement Learning in Training with Control Barrier Functions
arXiv:2510.14959v5 Announce Type: replace-cross Abstract: Reinforcement learning (RL), while powerful and expressive, can often prioritize performance at the expense of safety. Yet safety violations can lead to catastrophic outcomes in real-world deployments. Control Barrier Functions (CBFs) offer a principled method to enforce dynamic safety — traditionally deployed online via safety filters. While the result is […]
LUMINA: LLM-Guided GPU Architecture Exploration via Bottleneck Analysis
arXiv:2603.05904v2 Announce Type: replace-cross Abstract: GPU design space exploration (DSE) for modern AI workloads, such as Large Language Model (LLM) inference, is challenging because of GPUs’ vast, multi-modal design spaces, high simulation costs, and complex design optimization objectives (e.g., performance, power, and area trade-offs). Existing automated DSE methods are often prohibitively expensive, either requiring an excessive […]
GIFT: Reconciling Post-Training Objectives via Finite-Temperature Gibbs Initialization
arXiv:2601.09233v2 Announce Type: replace-cross Abstract: The prevailing post-training paradigm for Large Reasoning Models (LRMs) – Supervised Fine-Tuning (SFT) followed by Reinforcement Learning (RL) – suffers from an intrinsic optimization mismatch: the rigid supervision inherent in SFT induces distributional collapse, thereby exhausting the exploration space necessary for subsequent RL. In this paper, we reformulate SFT to […]
Unsupervised Symbolic Anomaly Detection
arXiv:2603.17575v1 Announce Type: cross Abstract: We propose SYRAN, an unsupervised anomaly detection method based on symbolic regression. Instead of encoding normal patterns in an opaque, high-dimensional model, our method learns an ensemble of human-readable equations that describe symbolic invariants: functions that are approximately constant on normal data. Deviations from these invariants yield anomaly scores, so […]
JAWS: Enhancing Long-term Rollout of Neural PDE Solvers via Spatially-Adaptive Jacobian Regularization
arXiv:2603.05538v2 Announce Type: replace-cross Abstract: Data-driven surrogate models can significantly accelerate the simulation of continuous dynamical systems, yet the step-wise accumulation of errors during autoregressive time-stepping often leads to spectral blow-up and unphysical divergence. Existing global regularization techniques can enforce contractive dynamics but uniformly damp high-frequency features, causing over-smoothing; meanwhile, long-horizon trajectory optimization methods are […]
CogGen: Cognitive-Load-Informed Fully Unsupervised Deep Generative Modeling for Compressively Sampled MRI Reconstruction
arXiv:2603.04438v2 Announce Type: replace-cross Abstract: Fully unsupervised deep generative modeling (FU-DGM) is promising for compressively sampled MRI (CS-MRI) when training data or compute are limited. Classical FU-DGMs such as DIP and INR rely on architectural priors, but the ill-conditioned inverse problem often demands many iterations and easily overfits measurement noise. We propose CogGen, a cognitive-load-informed […]
Resource Consumption Threats in Large Language Models
arXiv:2603.16068v2 Announce Type: replace-cross Abstract: Given limited and costly computational infrastructure, resource efficiency is a key requirement for large language models (LLMs). Efficient LLMs increase service capacity for providers and reduce latency and API costs for users. Recent resource consumption threats induce excessive generation, degrading model efficiency and harming both service availability and economic sustainability. […]