arXiv:2601.20642v1 Announce Type: cross Abstract: Diffusion-based image generative models produce high-fidelity images through iterative denoising but remain vulnerable to memorization, where they unintentionally reproduce exact copies or parts of training images. Recent memorization detection methods primarily use the norm of the score difference as an indicator of memorization. We prove that such norm-based metrics are […]
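A minimal sketch of the kind of norm-based indicator such detectors compute, assuming a generic noise predictor eps(x_t, t, c); the function name and threshold are illustrative, not this paper's code:

```python
import torch

# Minimal sketch of a norm-based memorization indicator (eps_model is a
# hypothetical noise predictor, not this paper's code): a large norm of
# the conditional-vs-unconditional score difference is commonly read as
# a sign that the prompt steers sampling toward memorized content.

def memorization_score(eps_model, x_t, t, cond, uncond):
    with torch.no_grad():
        eps_c = eps_model(x_t, t, cond)      # conditional prediction
        eps_u = eps_model(x_t, t, uncond)    # unconditional prediction
    diff = (eps_c - eps_u).flatten(start_dim=1)
    return diff.norm(dim=1)                  # per-sample L2 norm

# usage: flag prompts whose score-difference norm exceeds a threshold
# tuned on held-out data, e.g. memorized = memorization_score(...) > tau
```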
VLM-CAD: VLM-Optimized Collaborative Agent Design Workflow for Analog Circuit Sizing
arXiv:2601.07315v3 Announce Type: replace-cross Abstract: Analog mixed-signal circuit sizing involves complex trade-offs within high-dimensional design spaces. Existing automatic analog circuit sizing approaches rely solely on netlists and ignore the circuit schematic, which obscures the cognitive link between a schematic and its performance. Furthermore, the black-box nature of machine learning methods and hallucination risks in large language […]
Learning Contextual Runtime Monitors for Safe AI-Based Autonomy
arXiv:2601.20666v1 Announce Type: cross Abstract: We introduce a novel framework for learning context-aware runtime monitors for AI-based control ensembles. Machine-learning (ML) controllers are increasingly deployed in (autonomous) cyber-physical systems because of their ability to solve complex decision-making tasks. However, their accuracy can degrade sharply in unfamiliar environments, creating significant safety concerns. Traditional ensemble methods aim […]
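As a concrete illustration of the setting (an assumed design, not the paper's framework), a contextual monitor can be a classifier trained offline to predict when the ML controller will err, gating to a conservative fallback at runtime:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative design, not the paper's framework: a monitor classifier
# is trained offline on logged runs to predict, from context features,
# whether the ML controller's error will exceed a safety bound; at
# runtime it gates between the ML controller and a conservative fallback.

class ContextualMonitor:
    def __init__(self):
        self.clf = LogisticRegression(max_iter=1000)

    def fit(self, contexts: np.ndarray, unsafe: np.ndarray):
        # contexts: (n, d) context features; unsafe: 1 where the ML
        # controller violated the bound in logged data, else 0
        self.clf.fit(contexts, unsafe)

    def select(self, context: np.ndarray, ml_action, safe_action,
               risk_budget: float = 0.05):
        # fall back whenever predicted failure probability is too high
        p_unsafe = self.clf.predict_proba(context.reshape(1, -1))[0, 1]
        return ml_action if p_unsafe < risk_budget else safe_action
```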
Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum
arXiv:2601.14172v2 Announce Type: replace-cross Abstract: We study sentence-level detection of the 19 human values of the refined Schwartz continuum across about 74k English sentences from news and political manifestos (the ValueEval’24 corpus). Each sentence is annotated with value presence, yielding a binary moral-presence label and a 19-way multi-label task under severe class imbalance. First, we show […]
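A hedged sketch of the two-level task described above, assuming a standard transformer sentence encoder; the hidden size, layer names, and the uniform 5.0 weight are illustrative placeholders:

```python
import torch
import torch.nn as nn

# Sketch of the two-level setup (names and sizes are assumptions): a
# shared sentence embedding feeds a binary moral-presence head and a
# 19-way multi-label value head, with per-class positive weights to
# counter the severe class imbalance.

class ValueHeads(nn.Module):
    def __init__(self, hidden: int = 768, n_values: int = 19):
        super().__init__()
        self.presence = nn.Linear(hidden, 1)        # any value present?
        self.values = nn.Linear(hidden, n_values)   # which of the 19?

    def forward(self, sent_emb: torch.Tensor):
        return self.presence(sent_emb), self.values(sent_emb)

# rare classes get larger positive weights (uniform placeholder here)
pos_weight = torch.full((19,), 5.0)
presence_loss = nn.BCEWithLogitsLoss()
values_loss = nn.BCEWithLogitsLoss(pos_weight=pos_weight)
```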
HOMURA: Taming the Sand-Glass for Time-Constrained LLM Translation via Reinforcement Learning
arXiv:2601.10187v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have made remarkable strides in multilingual translation but are hindered by a systemic cross-lingual verbosity bias, rendering them unsuitable for strictly time-constrained tasks like subtitling and dubbing. Current prompt-engineering approaches struggle to resolve this conflict between semantic fidelity and rigid temporal feasibility. To bridge this gap, […]
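One hedged way to encode the fidelity-versus-duration trade-off as a scalar RL reward (an assumed shaping for illustration, not HOMURA's published objective):

```python
# Assumed reward shaping, not HOMURA's published objective: score a
# candidate translation by semantic fidelity, then penalize it for
# exceeding a characters-per-second budget typical of subtitling.

def timed_translation_reward(fidelity: float, n_chars: int,
                             duration_s: float, max_cps: float = 17.0,
                             penalty: float = 1.0) -> float:
    """fidelity: semantic adequacy in [0, 1] (e.g., from a QE metric);
    n_chars / duration_s: candidate length vs. allotted screen time."""
    cps = n_chars / max(duration_s, 1e-6)
    overflow = max(0.0, cps - max_cps)      # how far past the budget
    return fidelity - penalty * overflow    # verbose outputs lose reward
```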
RPO-RAG: Aligning Small LLMs with Relation-aware Preference Optimization for Knowledge Graph Question Answering
arXiv:2601.19225v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have recently demonstrated remarkable reasoning abilities, yet they still hallucinate on knowledge-intensive tasks. Retrieval-augmented generation (RAG) mitigates this issue by grounding answers in external sources, e.g., knowledge graphs (KGs). However, existing KG-based RAG approaches rely on semantics-unaware path sampling and are weakly aligned with KG reasoning objectives, which […]
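For concreteness, a DPO-style preference loss of the general family that relation-aware preference optimization could instantiate (assumed form; the paper's exact relation-aware objective may differ):

```python
import torch
import torch.nn.functional as F

# Assumed form for illustration, not RPO-RAG's exact loss: a DPO-style
# objective that ranks preferred KG reasoning paths above dispreferred
# ones via log-likelihood ratios against a frozen reference model.

def preference_loss(logp_chosen: torch.Tensor,
                    logp_rejected: torch.Tensor,
                    ref_logp_chosen: torch.Tensor,
                    ref_logp_rejected: torch.Tensor,
                    beta: float = 0.1) -> torch.Tensor:
    # margin between policy and reference log-ratios for each pair
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -F.logsigmoid(margin).mean()
```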
Sparsity-Aware Low-Rank Representation for Efficient Fine-Tuning of Large Language Models
arXiv:2601.16991v2 Announce Type: replace-cross Abstract: Adapting large pre-trained language models to downstream tasks often entails fine-tuning millions of parameters or deploying costly dense weight updates, which hinders their use in resource-constrained environments. Low-rank Adaptation (LoRA) reduces trainable parameters by factorizing weight updates, yet the underlying dense weights still impose high storage and computation costs. Magnitude-based […]
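A minimal sketch of how sparsity and low-rank structure can compose (an assumed combination for illustration, not the paper's exact method): magnitude-prune the frozen base weights, then train only a LoRA-style correction on top:

```python
import torch
import torch.nn as nn

# Assumed composition for illustration, not the paper's exact method:
# magnitude-prune the frozen base weights, then train only a LoRA-style
# low-rank correction B @ A on top (bias omitted for brevity).

class SparseLoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8,
                 keep_ratio: float = 0.5):
        super().__init__()
        w = base.weight.data                     # (out_features, in_features)
        k = max(1, int(keep_ratio * w.numel()))  # weights to keep
        thresh = w.abs().flatten().kthvalue(w.numel() - k + 1).values
        self.register_buffer("w_sparse", w * (w.abs() >= thresh))
        self.a = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.b = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # frozen sparse base plus trainable low-rank update
        return x @ self.w_sparse.t() + x @ (self.b @ self.a).t()
```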
R^3: Replay, Reflection, and Ranking Rewards for LLM Reinforcement Learning
arXiv:2601.19620v2 Announce Type: replace-cross Abstract: Large reasoning models (LRMs) aim to solve diverse and complex problems through structured reasoning. Recent advances in group-based policy optimization methods have shown promise in enabling stable advantage estimation without reliance on process-level annotations. However, these methods rely on advantage gaps induced by high-quality samples within the same batch, which […]
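For reference, group-based advantage estimation in the GRPO style these methods share (a standard formulation; the names are illustrative):

```python
import torch

# Standard group-based advantage estimation of the GRPO family that
# these methods build on: sample G completions per prompt, score them,
# and normalize rewards within each group, so no learned critic or
# process-level annotations are needed.

def group_advantages(rewards: torch.Tensor, eps: float = 1e-6):
    """rewards: (n_prompts, G) scalar rewards, G samples per prompt."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    # if every sample in a group ties, advantages collapse to zero --
    # the vanishing "advantage gap" the abstract refers to
    return (rewards - mean) / (std + eps)
```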
LEMON: How Well Do MLLMs Perform Temporal Multimodal Understanding on Instructional Videos?
arXiv:2601.20705v1 Announce Type: cross Abstract: Recent multimodal large language models (MLLMs) have shown remarkable progress across vision, audio, and language tasks, yet their performance on long-form, knowledge-intensive, and temporally structured educational content remains largely unexplored. To bridge this gap, we introduce LEMON, a Lecture-based Evaluation benchmark for MultimOdal uNderstanding, focusing on STEM lecture videos that […]
Beyond GEMM-Centric NPUs: Enabling Efficient Diffusion LLM Sampling
arXiv:2601.20706v1 Announce Type: cross Abstract: Diffusion Large Language Models (dLLMs) introduce iterative denoising to enable parallel token generation, but their sampling phase displays fundamentally different characteristics from GEMM-centric transformer layers. Profiling on modern GPUs reveals that sampling can account for up to 70% of total model inference latency, primarily due to substantial memory loads and […]
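To make the sampling phase concrete, a sketch of a common confidence-based unmasking step (illustrative, not this paper's kernel); note the softmax/top-k/scatter pattern behind the memory traffic described above:

```python
import torch

# Common confidence-based unmasking step of dLLM sampling (illustrative,
# not this paper's kernel). Unlike the GEMM-bound transformer layers,
# this step is dominated by a full-vocabulary softmax, top-k selection,
# and gather/scatter traffic -- the memory-bound behavior profiled above.

def unmask_step(logits: torch.Tensor, tokens: torch.Tensor,
                mask_id: int, n_unmask: int) -> torch.Tensor:
    """logits: (L, V); tokens: (L,) partially masked sequence."""
    probs = torch.softmax(logits, dim=-1)    # memory-heavy over vocab V
    conf, pred = probs.max(dim=-1)           # per-position confidence
    conf[tokens != mask_id] = -1.0           # only masked slots compete
    idx = conf.topk(n_unmask).indices        # most confident positions
    out = tokens.clone()
    out[idx] = pred[idx]                     # commit those tokens
    return out
```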
LVLMs and Humans Ground Differently in Referential Communication
arXiv:2601.19792v2 Announce Type: replace-cross Abstract: For generative AI agents to partner effectively with human users, the ability to accurately predict human intent is critical. But this ability to collaborate remains limited by a key deficit: an inability to model common ground. Here, we present a referential communication experiment with a factorial design involving director-matcher pairs […]
NeuroAI and Beyond
arXiv:2601.19955v1 Announce Type: new Abstract: Neuroscience and Artificial Intelligence (AI) have made significant progress in the past few years but have remained only loosely interconnected. Based on a workshop held in August 2025, we identify current and future areas of synergy between these two fields. We focus on the subareas of embodiment, language and communication, […]