arXiv:2511.01831v3 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) struggle with continual learning, often suffering from catastrophic forgetting when adapting to sequential tasks. We introduce a routing-based architecture that integrates new capabilities while robustly preserving foundational knowledge. While Multi-Task Learning (MTL) offers a theoretical performance upper bound, it incurs a linearly scaling computational overhead […]
Proximity Measure of Information Object Features for Solving the Problem of Their Identification in Information Systems
arXiv:2604.04939v1 Announce Type: new Abstract: The paper considers a new quantitative-qualitative proximity measure for the features of information objects, where data enters a common information resource from several sources independently. The goal is to determine the possibility of their relation to the same physical object (observation object). The proposed measure accounts for the possibility of […]
HybridKV: Hybrid KV Cache Compression for Efficient Multimodal Large Language Model Inference
arXiv:2604.05887v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have advanced unified reasoning over text, images, and videos, but their inference is hindered by the rapid growth of key-value (KV) caches. Each visual input expands into thousands of tokens, causing caches to scale linearly with context length and remain resident in GPU memory throughout […]
Beyond BMI: Smartphone Body Composition Phenotyping for Cardiometabolic Risk Assessment
arXiv:2603.27017v2 Announce Type: replace Abstract: Body Mass Index (BMI) is a widely accessible but imprecise proxy of cardiometabolic health. While assessing true body composition is superior, gold-standard methods like Dual-Energy X-ray Absorptiometry (DXA) are not scalable. We address this gap by developing and validating “PhotoScan,” a method to estimate body composition from smartphone imagery. We […]
MMORF: A Multi-agent Framework for Designing Multi-objective Retrosynthesis Planning Systems
arXiv:2604.05075v1 Announce Type: new Abstract: Multi-objective retrosynthesis planning is a critical chemistry task requiring dynamic balancing of quality, safety, and cost objectives. Language model-based multi-agent systems (MAS) offer a promising approach for this task: leveraging interactions of specialized agents to incorporate multiple objectives into retrosynthesis planning. We present MMORF, a framework for constructing MAS for […]
VarDrop: Enhancing Training Efficiency by Reducing Variate Redundancy in Periodic Time Series Forecasting
arXiv:2501.14183v3 Announce Type: replace-cross Abstract: Variate tokenization, which independently embeds each variate as separate tokens, has achieved remarkable improvements in multivariate time series forecasting. However, employing self-attention with variate tokens incurs a quadratic computational cost with respect to the number of variates, thus limiting its training efficiency for large-scale applications. To address this issue, we […]
Context-Value-Action Architecture for Value-Driven Large Language Model Agents
arXiv:2604.05939v1 Announce Type: new Abstract: Large Language Models (LLMs) have shown promise in simulating human behavior, yet existing agents often exhibit behavioral rigidity, a flaw frequently masked by the self-referential bias of current “LLM-as-a-judge” evaluations. By evaluating against empirical ground truth, we reveal a counter-intuitive phenomenon: increasing the intensity of prompt-driven reasoning does not enhance […]
ActivityEditor: Learning to Synthesize Physically Valid Human Mobility
arXiv:2604.05529v1 Announce Type: new Abstract: Human mobility modeling is indispensable for diverse urban applications. However, existing data-driven methods often suffer from data scarcity, limiting their applicability in regions where historical trajectories are unavailable or restricted. To bridge this gap, we propose textbfActivityEditor, a novel dual-LLM-agent framework designed for zero-shot cross-regional trajectory generation. Our framework decomposes […]
A canonical generalization of OBDD
arXiv:2604.05537v1 Announce Type: new Abstract: We introduce Tree Decision Diagrams (TDD) as a model for Boolean functions that generalizes OBDD. They can be seen as a restriction of structured d-DNNF; that is, d-DNNF that respect a vtree $T$. We show that TDDs enjoy the same tractability properties as OBDD, such as model counting, enumeration, conditioning, […]
Transcriptomic Models for Immunotherapy Response Prediction Show Limited Cross-cohort Generalisability
arXiv:2604.05478v1 Announce Type: new Abstract: Immune checkpoint inhibitors (ICIs) have transformed cancer therapy; yet substantial proportion of patients exhibit intrinsic or acquired resistance, making accurate pre-treatment response prediction a critical unmet need. Transcriptomics-based biomarkers derived from bulk and single-cell RNA sequencing (scRNA-seq) offer a promising avenue for capturing tumour-immune interactions, yet the cross-cohort generalisability of […]
Thinking Diffusion: Penalize and Guide Visual-Grounded Reasoning in Diffusion Multimodal Language Models
arXiv:2604.05497v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) are emerging as promising alternatives to autoregressive (AR) LLMs. Recently, this paradigm has been extended to multimodal tasks, leading to the development of diffusion multimodal large language models (dMLLMs). These models are expected to retain the reasoning capabilities of LLMs while enabling faster inference through […]
PRISM-MCTS: Learning from Reasoning Trajectories with Metacognitive Reflection
arXiv:2604.05424v1 Announce Type: new Abstract: PRISM-MCTS: Learning from Reasoning Trajectories with Metacognitive Reflection Siyuan Cheng, Bozhong Tian, Yanchao Hao, Zheng Wei Published: 06 Apr 2026, Last Modified: 06 Apr 2026 ACL 2026 Findings Conference, Area Chairs, Reviewers, Publication Chairs, Authors Revisions BibTeX CC BY 4.0 Keywords: Efficient/Low-Resource Methods for NLP, Generation, Question Answering Abstract: The […]