arXiv:2512.10229v3 Announce Type: replace-cross Abstract: Time series forecasting is a critical task for artificial intelligence with numerous real-world applications. Traditional approaches primarily rely on historical time series data to predict future values. However, in practical scenarios, this is often insufficient for accurate predictions due to the limited information available. To address this challenge, multimodal […]
UbiQVision: Quantifying Uncertainty in XAI for Image Recognition
arXiv:2512.20288v1 Announce Type: cross Abstract: Recent advances in deep learning have led to its widespread adoption across diverse domains, including medical imaging. This progress is driven by increasingly sophisticated model architectures, such as ResNets, Vision Transformers, and Hybrid Convolutional Neural Networks, that offer enhanced performance at the cost of greater complexity. This complexity often compromises […]
Continuous Vision-Language-Action Co-Learning with Semantic-Physical Alignment for Behavioral Cloning
arXiv:2511.14396v5 Announce Type: replace-cross Abstract: Language-conditioned manipulation facilitates human-robot interaction via behavioral cloning (BC), which learns control policies from human demonstrations and serves as a cornerstone of embodied AI. Overcoming compounding errors in sequential action decisions remains a central challenge to improving BC performance. Existing approaches mitigate compounding errors through data augmentation, expressive representation, or […]
Corpus of Cross-lingual Dialogues with Minutes and Detection of Misunderstandings
arXiv:2512.20204v1 Announce Type: cross Abstract: Speech processing and translation technology have the potential to facilitate meetings of individuals who do not share any common language. To evaluate automatic systems for such a task, a versatile and realistic evaluation corpus is needed. Therefore, we create and present a corpus of cross-lingual dialogues between individuals without a […]
LLM-based Behaviour Driven Development for Hardware Design
arXiv:2512.17814v2 Announce Type: replace-cross Abstract: Test and verification are essential activities in hardware and system design, but their complexity grows significantly with increasing system sizes. While Behavior Driven Development (BDD) has proven effective in software engineering, it is not yet well established in hardware design, and its practical use remains limited. One contributing factor is […]
Fast LLM Post-training via Decoupled and Fastest-of-N Speculation
arXiv:2511.16193v3 Announce Type: replace-cross Abstract: Rollout dominates the training time in large language model (LLM) post-training, where the trained model is used to generate tokens given a batch of prompts. This work, SpecActor, achieves fast rollout with speculative decoding that deploys a fast draft path to accelerate the unparallelizable generation, while the correctness is guaranteed […]
Performative Policy Gradient: Optimality in Performative Reinforcement Learning
arXiv:2512.20576v1 Announce Type: cross Abstract: Post-deployment machine learning algorithms often influence the environments they act in, and thus shift the underlying dynamics that standard reinforcement learning (RL) methods ignore. While designing optimal algorithms in this performative setting has recently been studied in supervised learning, the RL counterpart remains under-explored. In this paper, we prove […]
Asynchronous Fast-Slow Vision-Language-Action Policies for Whole-Body Robotic Manipulation
arXiv:2512.20188v1 Announce Type: cross Abstract: Most Vision-Language-Action (VLA) systems integrate a Vision-Language Model (VLM) for semantic reasoning with an action expert generating continuous action signals, yet both typically run at a single unified frequency. As a result, policy performance is constrained by the low inference speed of large VLMs. This mandatory synchronous execution severely limits […]
Multi-Agent Intelligence for Multidisciplinary Decision-Making in Gastrointestinal Oncology
arXiv:2512.08674v2 Announce Type: replace Abstract: Multimodal clinical reasoning in the field of gastrointestinal (GI) oncology necessitates the integrated interpretation of endoscopic imagery, radiological data, and biochemical markers. Despite the evident potential exhibited by Multimodal Large Language Models (MLLMs), they frequently encounter challenges such as context dilution and hallucination when confronted with intricate, heterogeneous medical histories. […]
Compute-in-Memory Implementation of State Space Models for Event Sequence Processing
arXiv:2511.13912v2 Announce Type: replace-cross Abstract: State space models (SSMs) have recently emerged as a powerful framework for long sequence processing, outperforming traditional methods on diverse benchmarks. Fundamentally, SSMs can generalize both recurrent and convolutional networks and have been shown to even capture key functions of biological systems. Here we report an approach to implement SSMs […]