All That Glitters Is Not Audio: Rethinking Text Priors and Audio Reliance in Audio-Language Evaluation
arXiv:2604.24401v1 Announce Type: cross Abstract: Large Audio-Language Models show consistent performance gains across speech and audio benchmarks, yet high scores may not reflect true auditory perception. If a model can answer questions without processing the acoustic signal, the benchmark fails as a measure of auditory understanding. We present a diagnostic framework using two axes: text […]
CheXmix: Unified Generative Pretraining for Vision Language Models in Medical Imaging
arXiv:2604.22989v1 Announce Type: cross Abstract: Recent medical multimodal foundation models are built as multimodal LLMs (MLLMs) by connecting a CLIP-pretrained vision encoder to an LLM using LLaVA-style finetuning. This two-stage, decoupled approach introduces a projection layer that can distort visual features. This is especially concerning in medical imaging where subtle cues are essential for accurate […]
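For readers unfamiliar with the two-stage setup this abstract critiques, below is a minimal sketch of a LLaVA-style bridge between a frozen vision encoder and an LLM. The module and dimension names (`VisionEncoder`, `vision_dim`, `llm_dim`) are illustrative assumptions, not CheXmix's actual architecture.

```python
import torch
import torch.nn as nn

class VisionLanguageBridge(nn.Module):
    """Minimal sketch of a LLaVA-style projection layer (illustrative only).

    A frozen CLIP-pretrained vision encoder produces patch features, and a
    learned linear projection maps them into the LLM's token-embedding space.
    This projection is the decoupled component the abstract argues can
    distort subtle visual cues in medical images.
    """

    def __init__(self, vision_encoder: nn.Module, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.vision_encoder = vision_encoder               # frozen CLIP-style encoder (assumed)
        self.projection = nn.Linear(vision_dim, llm_dim)   # trainable bridge into LLM space

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():                              # vision tower typically kept frozen
            patch_features = self.vision_encoder(images)   # (batch, num_patches, vision_dim)
        return self.projection(patch_features)             # (batch, num_patches, llm_dim)
```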
Citation-Driven Multi-View Training for Patent Embeddings: QaECTER and Sophia-Bench
arXiv:2604.22897v1 Announce Type: cross Abstract: Patent retrieval underpins critical decisions in innovation, examination, and IP strategy, yet progress has been hampered by the absence of benchmarks that reflect the diversity of real-world search scenarios. We address this gap with two contributions. First, we introduce Sophiabench, a large-scale patent retrieval benchmark comprising 10,000 queries and […]
CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning
arXiv:2604.23270v1 Announce Type: new Abstract: Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on long, multi-step problems, leading to inconsistent answers for an unchanged task. Most prior work focuses on improving the forward reasoning chain within […]
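As background, a vanilla chain-of-thought prompt looks like the sketch below. This illustrates only the baseline CoT setup the abstract builds on, not the paper's cycle-adversarial prompting, whose details are not in the truncated abstract; `call_llm` is a hypothetical stand-in for any chat-completion API.

```python
# Minimal chain-of-thought prompt construction (baseline CoT, not CAP-CoT).
def build_cot_prompt(question: str) -> str:
    return (
        "Solve the following problem. Reason step by step, "
        "then give the final answer on a new line starting with 'Answer:'.\n\n"
        f"Problem: {question}"
    )

prompt = build_cot_prompt("A train travels 60 km in 45 minutes. What is its average speed in km/h?")
# response = call_llm(prompt)  # hypothetical LLM call; parse the 'Answer:' line from the response
```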
Mean-Field and Pairwise Approaches for the SIRI Model on Poisson Networks
arXiv:2604.23243v1 Announce Type: new Abstract: Compartmental epidemic models, grounded in mass-action kinetics, often assume homogeneous mixing. Although this neglects network structure, recent results show that, on Poisson random graphs, the susceptible decay curve of the classical SIR model closely matches that of its network counterpart. Motivated by this, we investigate whether the extended […]
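For context, the mass-action (homogeneous-mixing) SIR equations referred to here are the standard ones below, with transmission rate $\beta$ and recovery rate $\gamma$; the SIRI reinfection extension and its pairwise network approximations are not reproduced from the truncated abstract.

```latex
% Classical mass-action SIR model (standard form), population size N = S + I + R
\frac{dS}{dt} = -\beta \frac{S I}{N}, \qquad
\frac{dI}{dt} = \beta \frac{S I}{N} - \gamma I, \qquad
\frac{dR}{dt} = \gamma I
```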
AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI
arXiv:2604.23018v1 Announce Type: cross Abstract: Web-scale 3D asset collections are abundant, but rarely deployment-ready. Assets ship with arbitrary metric scale, incorrect pivots and forward axes, brittle geometry, and textures that do not support relighting, which limits their utility for embodied AI, robotics simulation, game development, and AR/VR. We present AmaraSpatial-10K, a dataset of over 10,000 […]
Self Knowledge Re-expression: A Fully Local Method for Adapting LLMs to Tasks Using Intrinsic Knowledge
arXiv:2604.22939v1 Announce Type: cross Abstract: While the next-token prediction (NTP) paradigm enables large language models (LLMs) to express their intrinsic knowledge, its sequential nature constrains performance on specialized, non-generative tasks. We attribute this performance bottleneck to the LLMs’ knowledge expression mechanism, rather than to deficiencies in knowledge acquisition. To address this, we propose Self-Knowledge Re-expression […]
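For reference, the next-token-prediction objective this abstract contrasts against is the standard autoregressive log-likelihood, written below in conventional notation; the paper's own formulation is not shown in the truncated abstract.

```latex
% Standard next-token-prediction (NTP) training objective for a sequence x_1, ..., x_T
\mathcal{L}_{\mathrm{NTP}}(\theta) \;=\; -\sum_{t=1}^{T} \log p_\theta\!\left(x_t \mid x_{<t}\right)
```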
Skill Retrieval Augmentation for Agentic AI
arXiv:2604.24594v1 Announce Type: cross Abstract: As large language models (LLMs) evolve into agentic problem solvers, they increasingly rely on external, reusable skills to handle tasks beyond their native parametric capabilities. In existing agent systems, the dominant strategy for incorporating skills is to explicitly enumerate available skills within the context window. However, this strategy fails to […]
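A minimal sketch of the retrieval-augmented alternative described here (retrieve only the most relevant skills instead of enumerating the whole skill library in the context window) is given below; the skill format and the `embed` function are assumptions, not the paper's implementation.

```python
import numpy as np

def retrieve_skills(task_description: str, skills: list[dict], embed, top_k: int = 3) -> list[dict]:
    """Return the top-k skills most similar to the task (illustrative sketch).

    `skills` is assumed to be a list of {"name": ..., "doc": ...} entries and
    `embed` any text-embedding function returning a 1-D numpy vector. Only the
    retrieved skill docs are placed in the agent's context, instead of the
    full skill library.
    """
    query_vec = embed(task_description)
    scored = []
    for skill in skills:
        skill_vec = embed(skill["doc"])
        sim = float(np.dot(query_vec, skill_vec) /
                    (np.linalg.norm(query_vec) * np.linalg.norm(skill_vec)))
        scored.append((sim, skill))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [skill for _, skill in scored[:top_k]]
```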
C-MORAL: Controllable Multi-Objective Molecular Optimization with Reinforcement Alignment for LLMs
arXiv:2604.23061v1 Announce Type: cross Abstract: Large language models (LLMs) show promise for molecular optimization, but aligning them with selective and competing drug-design constraints remains challenging. We propose C-Moral, a reinforcement learning post-training framework for controllable multi-objective molecular optimization. C-Moral combines group-based relative optimization, property score alignment for heterogeneous objectives, and continuous non-linear reward aggregation to […]
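One common way to realize "continuous non-linear reward aggregation" over heterogeneous property scores is a weighted geometric mean of normalized objectives; the sketch below is an illustrative assumption along those lines, not C-Moral's actual reward.

```python
import math

def aggregate_reward(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted geometric mean of per-property scores in [0, 1] (illustrative only).

    A multiplicative aggregation is continuous and non-linear: a molecule must
    satisfy every objective to some degree, since any near-zero property score
    pulls the overall reward toward zero. The scores and weights used here are
    hypothetical, not the paper's formulation.
    """
    eps = 1e-8  # avoid log(0) when an objective is completely unmet
    total_weight = sum(weights.values())
    log_sum = sum(w * math.log(max(scores[name], eps)) for name, w in weights.items())
    return math.exp(log_sum / total_weight)

# Example with hypothetical property scores for one candidate molecule
reward = aggregate_reward({"qed": 0.7, "sa": 0.8, "binding": 0.4},
                          {"qed": 1.0, "sa": 1.0, "binding": 2.0})
```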