arXiv:2604.02034v1 Announce Type: new Abstract: Insurance application processes often rely on lengthy and standardized questionnaires that struggle to capture individual differences. Moreover, insurers must blindly trust users’ responses, increasing the chances of fraud. The ARQuest framework introduces a new approach to underwriting by using Large Language Models (LLMs) and alternative data sources to create personalized […]
Trustworthy AI-Driven Dynamic Hybrid RIS: Joint Optimization and Reward Poisoning-Resilient Control in Cognitive MISO Networks
arXiv:2604.01238v1 Announce Type: cross Abstract: Cognitive radio networks (CRNs) are a key mechanism for alleviating spectrum scarcity by enabling secondary users (SUs) to opportunistically access licensed frequency bands without harmful interference to primary users (PUs). To address unreliable direct SU links and energy constraints common in next-generation wireless networks, this work introduces an adaptive, energy-aware […]
Can Heterogeneous Language Models Be Fused?
arXiv:2604.01674v1 Announce Type: new Abstract: Model merging aims to integrate multiple expert models into a single model that inherits their complementary strengths without incurring the inference-time cost of ensembling. Recent progress has shown that merging can be highly effective when all source models are homogeneous, i.e., derived from the same pretrained backbone and therefore share […]
The AnIML Ontology: Enabling Semantic Interoperability for Large-Scale Experimental Data in Interconnected Scientific Labs
arXiv:2604.01728v1 Announce Type: new Abstract: Achieving semantic interoperability across heterogeneous experimental data systems remains a major barrier to data-driven scientific discovery. The Analytical Information Markup Language (AnIML), a flexible XML-based standard for analytical chemistry and biology, is increasingly used in industrial R&D labs for managing and exchanging experimental data. However, the expressivity of the XML […]
Ontology-Aware Design Patterns for Clinical AI Systems: Translating Reification Theory into Software Architecture
arXiv:2604.01661v1 Announce Type: new Abstract: Clinical AI systems routinely train on health data structurally distorted by documentation workflows, billing incentives, and terminology fragmentation. Prior work has characterised the mechanisms of this distortion: the three-forces model of documentary enactment, the reification feedback loop through which AI may amplify coding artefacts, and terminology governance failures that allow […]
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook
arXiv:2604.02029v1 Announce Type: new Abstract: Latent space is rapidly emerging as a native substrate for language-based models. While modern systems are still commonly understood through explicit token-level generation, an increasing body of work shows that many critical internal processes are more naturally carried out in continuous latent space than in human-readable verbal traces. This shift […]
CRaFT: Circuit-Guided Refusal Feature Selection via Cross-Layer Transcoders
arXiv:2604.01604v1 Announce Type: new Abstract: As safety concerns around large language models (LLMs) grow, understanding the internal mechanisms underlying refusal behavior has become increasingly important. Recent work has studied this behavior by identifying internal features associated with refusal and manipulating them to induce compliance with harmful requests. However, existing refusal feature selection methods rely on […]
UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems
arXiv:2604.00590v2 Announce Type: replace-cross Abstract: In recent years, the scaling laws of recommendation models have attracted increasing attention, which govern the relationship between performance and parameters/FLOPs of recommenders. Currently, there are three mainstream architectures for achieving scaling in recommendation models, namely attention-based, TokenMixer-based, and factorization-machine-based methods, which exhibit fundamental differences in both design philosophy and […]
Abnormal Head Movements in Neurological Conditions: A Knowledge-Based Dataset with Application to Cervical Dystonia
arXiv:2604.01962v1 Announce Type: new Abstract: Abnormal head movements (AHMs) manifest across a broad spectrum of neurological disorders; however, the absence of a multi-condition resource integrating kinematic measurements, clinical severity scores, and patient demographics constitutes a persistent barrier to the development of AI-driven diagnostic tools. To address this gap, this study introduces NeuroPose-AHM, a knowledge-based dataset […]
ImplicitBBQ: Benchmarking Implicit Bias in Large Language Models through Characteristic Based Cues
arXiv:2604.01925v1 Announce Type: cross Abstract: Large Language Models increasingly suppress biased outputs when demographic identity is stated explicitly, yet may still exhibit implicit biases when identity is conveyed indirectly. Existing benchmarks use name-based proxies to detect implicit biases, which carry weak associations with many social demographics and cannot extend to dimensions like age or […]
Woosh: A Sound Effects Foundation Model
arXiv:2604.01929v1 Announce Type: cross Abstract: The audio research community depends on open generative models as foundational tools for building novel approaches and establishing baselines. In this report, we present Woosh, Sony AI’s publicly released sound effect foundation model, detailing its architecture, training process, and an evaluation against other popular open models. Being optimized for sound […]
ORBIT: Scalable and Verifiable Data Generation for Search Agents on a Tight Budget
arXiv:2604.01195v2 Announce Type: replace-cross Abstract: Search agents, which integrate language models (LMs) with web search, are becoming crucial for answering complex user queries. Constructing training datasets for deep research tasks, involving multi-step retrieval and reasoning, remains challenging due to expensive human annotation, or cumbersome prerequisites. In this work, we introduce ORBIT, a training dataset with […]