arXiv:2601.21105v1 Announce Type: cross Abstract: Natural-language user profiles have recently attracted attention not only for improved interpretability, but also for their potential to make recommender systems more steerable. By enabling direct editing, natural-language profiles allow users to explicitly articulate preferences that may be difficult to infer from past behavior. However, it remains unclear whether current […]
On-the-Fly Adaptation to Quantization: Configuration-Aware LoRA for Efficient Fine-Tuning of Quantized LLMs
arXiv:2509.25214v2 Announce Type: replace-cross Abstract: As increasingly large pre-trained models are released, deploying them on edge devices for privacy-preserving applications requires effective compression. Recent works combine quantization with the fine-tuning of high-precision LoRA adapters, which can substantially reduce model size while mitigating the accuracy loss from quantization. However, edge devices have inherently heterogeneous capabilities, while […]
Output-Space Search: Targeting LLM Generations in a Frozen Encoder-Defined Output Space
arXiv:2601.21169v1 Announce Type: cross Abstract: We introduce Output-Space Search (OS-Search), which turns LLM generation into endpoint search. An outer loop selects a target z* in a frozen encoder-defined 3D output space Z, and a retrieval-grounded policy trained with sequence-level RL generates outputs whose coordinates land near z* under standard autoregressive decoding. This enables parallel sweeps […]
Uncovering Hidden Correctness in LLM Causal Reasoning via Symbolic Verification
arXiv:2601.21210v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly being applied to tasks that involve causal reasoning. However, current benchmarks often rely on string matching or surface-level metrics that do not capture whether the output of a model is formally valid under the semantics of causal reasoning. To address this, we propose DoVerifier, […]
A Sheaf-Theoretic and Topological Perspective on Complex Network Modeling and Attention Mechanisms in Graph Neural Models
arXiv:2601.21207v1 Announce Type: cross Abstract: Combinatorial and topological structures, such as graphs, simplicial complexes, and cell complexes, form the foundation of geometric and topological deep learning (GDL and TDL) architectures. These models aggregate signals over such domains, integrate local features, and generate representations for diverse real-world applications. However, the distribution and diffusion behavior of GDL […]
What the flock knows that the birds do not: exploring the emergence of joint agency in multi-agent active inference
arXiv:2511.10835v2 Announce Type: replace-cross Abstract: Collective behavior pervades biological systems, from flocks of birds to neural assemblies and human societies. Yet, how such collectives acquire functional properties — such as joint agency or knowledge — that transcend those of their individual components remains an open question. Here, we combine active inference and information-theoretic analyses to […]
Conditional Generative Framework with Peak-Aware Attention for Robust Chemical Detection under Interferences
arXiv:2601.21246v1 Announce Type: cross Abstract: Gas chromatography-mass spectrometry (GC-MS) is a widely used analytical method for chemical substance detection, but measurement reliability tends to deteriorate in the presence of interfering substances. In particular, interfering substances cause nonspecific peaks, retention time shifts, and increased background noise, resulting in reduced sensitivity and false alarms. To overcome these […]
PILD: Physics-Informed Learning via Diffusion
arXiv:2601.21284v1 Announce Type: cross Abstract: Diffusion models have emerged as powerful generative tools for modeling complex data distributions, yet their purely data-driven nature limits applicability in practical engineering and scientific problems where physical laws need to be followed. This paper proposes Physics-Informed Learning via Diffusion (PILD), a framework that unifies diffusion modeling and first-principles physical […]
OrthoInsight: Rib Fracture Diagnosis and Report Generation Based on Multi-Modal Large Models
arXiv:2507.13993v4 Announce Type: replace-cross Abstract: The growing volume of medical imaging data has increased the need for automated diagnostic tools, especially for musculoskeletal injuries like rib fractures, commonly detected via CT scans. Manual interpretation is time-consuming and error-prone. We propose OrthoInsight, a multi-modal deep learning framework for rib fracture diagnosis and report generation. It integrates […]
Self-Improving Pretraining: using post-trained models to pretrain better models
arXiv:2601.21343v1 Announce Type: cross Abstract: Ensuring safety, factuality and overall quality in the generations of large language models is a critical challenge, especially as these models are increasingly deployed in real-world applications. The prevailing approach to addressing these issues involves collecting expensive, carefully curated datasets and applying multiple stages of fine-tuning and alignment. However, even […]
How does information access affect LLM monitors’ ability to detect sabotage?
arXiv:2601.21112v1 Announce Type: new Abstract: Frontier language model agents can exhibit misaligned behaviors, including deception, exploiting reward hacks, and pursuing hidden objectives. To control potentially misaligned agents, we can use LLMs themselves to monitor for misbehavior. In this paper, we study how information access affects LLM monitor performance. While one might expect that monitors perform […]
BrainFuse: a unified infrastructure integrating realistic biological modeling and core AI methodology
arXiv:2601.21407v1 Announce Type: cross Abstract: Neuroscience and artificial intelligence represent distinct yet complementary pathways to general intelligence. However, amid the ongoing boom in AI research and applications, the translational synergy between these two fields has grown increasingly elusive, hampered by a widening infrastructural incompatibility: modern AI frameworks lack native support for biophysical realism, while neural simulation […]