arXiv:2603.24218v1 Announce Type: cross Abstract: Large Language Models (LLMs) enhanced with Retrieval-Augmented Generation (RAG) have achieved substantial improvements in accuracy by grounding their responses in external documents that are relevant to the user’s query. However, relatively little work has investigated the impact of RAG on fairness. In particular, it is not yet known if […]
PromptLoop: Plug-and-Play Prompt Refinement via Latent Feedback for Diffusion Model Alignment
arXiv:2510.00430v2 Announce Type: replace-cross Abstract: Despite recent progress, reinforcement learning (RL)-based fine-tuning of diffusion models often struggles with generalization, composability, and robustness against reward hacking. Recent studies have explored prompt refinement as a modular alternative, but most adopt a feed-forward approach that applies a single refined prompt throughout the entire sampling trajectory, thereby failing to […]
FedPBS: Proximal-Balanced Scaling Federated Learning Model for Robust Personalized Training for Non-IID Data
arXiv:2603.13909v2 Announce Type: replace-cross Abstract: Federated learning (FL) enables a set of distributed clients to jointly train machine learning models while preserving their local data privacy, making it attractive for applications in healthcare, finance, mobility, and smart-city systems. However, FL faces several challenges, including statistical heterogeneity and uneven client participation, which can degrade convergence and […]
Divide, then Ground: Adapting Frame Selection to Query Types for Long-Form Video Understanding
arXiv:2512.04000v2 Announce Type: replace-cross Abstract: The application of Large Multimodal Models (LMMs) to long-form video understanding is constrained by limited context lengths and the computationally prohibitive cost of processing dense video tokens. Consequently, recent research has focused on query-aware frame selection, though such methods often incur significant computational overhead. This paper challenges the assumption that such […]
Where Do Your Citations Come From? Citation-Constellation: A Free, Open-Source, No-Code, and Auditable Tool for Citation Network Decomposition with Complementary BARON and HEROCON Scores
arXiv:2603.24216v1 Announce Type: cross Abstract: Standard citation metrics treat all citations as equal, obscuring the social and structural pathways through which scholarly influence propagates. I introduce Citation-Constellation, a freely available no-code tool for citation network analysis with two complementary bibliometric scores that decompose a researcher’s citation profile by network proximity between citing and cited authors. […]
Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries
arXiv:2510.14751v2 Announce Type: replace-cross Abstract: Next-token prediction (NTP) has driven the success of large language models (LLMs), but it struggles with long-horizon reasoning, planning, and creative writing, with these limitations largely attributed to teacher-forced training. Multi-token prediction (MTP) partially mitigates these issues by predicting several future tokens at once, but it mostly captures short-range dependencies […]
Geometry-Guided Camera Motion Understanding in VideoLLMs
arXiv:2603.13119v2 Announce Type: replace-cross Abstract: Camera motion is a fundamental geometric signal that shapes visual perception and cinematic style, yet current video-capable vision-language models (VideoLLMs) rarely represent it explicitly and often fail on fine-grained motion primitives. We address this gap with a framework of $\textbf{benchmarking}$, $\textbf{diagnosis}$, and $\textbf{injection}$. We curate $\textbf{CameraMotionDataset}$, a large-scale synthetic dataset […]
Understanding Pure Textual Reasoning for Blind Image Quality Assessment
arXiv:2601.02441v2 Announce Type: replace-cross Abstract: Textual reasoning has recently been widely adopted in Blind Image Quality Assessment (BIQA). However, it remains unclear how textual information contributes to quality prediction and to what extent text can represent the score-related image contents. This work addresses these questions from an information-flow perspective by comparing existing BIQA models with […]
Uncovering Memorization in Time Series Imputation Models: LBRM Membership Inference and Its Link to Attribute Leakage
arXiv:2603.24213v1 Announce Type: cross Abstract: Deep learning models for time series imputation are now essential in fields such as healthcare, the Internet of Things (IoT), and finance. However, their deployment raises critical privacy concerns. Beyond the well-known issue of unintended memorization, which has been extensively studied in generative models, we demonstrate that time series models […]
Exploring Collatz Dynamics with Human-LLM Collaboration
arXiv:2603.11066v3 Announce Type: replace-cross Abstract: We develop a structural and quantitative framework for analyzing the Collatz map through modular dynamics, valuation statistics, and combinatorial decomposition of trajectories into bursts and gaps. We establish several exact and asymptotic results, including an affine scrambling structure for odd-to-odd dynamics, structural decay of residue information, and a quantitative bound […]
Red-Teaming Vision-Language-Action Models via Quality Diversity Prompt Generation for Robust Robot Policies
arXiv:2603.12510v2 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) models have significant potential to enable general-purpose robotic systems for a range of vision-language tasks. However, the performance of VLA-based robots is highly sensitive to the precise wording of language instructions, and it remains difficult to predict when such robots will fail. To improve the robustness of VLAs […]
LLM-Powered Workflow Optimization for Multidisciplinary Software Development: An Automotive Industry Case Study
arXiv:2603.21439v3 Announce Type: replace-cross Abstract: Multidisciplinary Software Development (MSD) requires domain experts and developers to collaborate across incompatible formalisms and separate artifact sets. Today, even with AI coding assistants like GitHub Copilot, this process remains inefficient; individual coding tasks are semi-automated, but the workflow connecting domain knowledge to implementation is not. Developers and experts still […]