Self-Routing: Parameter-Free Expert Routing from Hidden States

arXiv:2604.00421v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) layers increase model capacity by activating only a small subset of experts per token, and typically rely on a learned router to map hidden states to expert assignments. In this work, we ask whether a dedicated learned router is strictly necessary in the MoE settings we study. We […]

Degrees, Levels, and Profiles of Contextuality

arXiv:2603.26692v2 Announce Type: replace-cross Abstract: We introduce a new notion, that of a contextuality profile of a system of random variables. Rather than characterizing a system’s contextuality by a single number, its overall degree of contextuality, we show how it can be characterized by a curve relating degree of contextuality to level at which the […]

Streaming Model Cascades for Semantic SQL

arXiv:2604.00660v1 Announce Type: cross Abstract: Modern data warehouses extend SQL with semantic operators that invoke large language models on each qualifying row, but the per-row inference cost is prohibitive at scale. Model cascades reduce this cost by routing most rows through a fast proxy model and delegating uncertain cases to an expensive oracle. Existing frameworks, […]
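The cascade idea described in this abstract (a fast proxy handles most rows, uncertain rows go to an expensive oracle) can be illustrated with a minimal sketch. It is not the paper's streaming method, which the truncated abstract does not describe. The names `run_cascade`, `toy_proxy`, `toy_oracle`, and the fixed confidence `threshold` are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def run_cascade(
    rows: Iterable[str],
    proxy: Callable[[str], Tuple[bool, float]],
    oracle: Callable[[str], bool],
    threshold: float = 0.8,
) -> Tuple[List[bool], int]:
    """Label each row with the proxy; fall back to the oracle when unsure.

    `proxy` returns (label, confidence). The fixed `threshold` is an
    assumption of this sketch, not something taken from the abstract.
    """
    labels: List[bool] = []
    oracle_calls = 0
    for row in rows:
        label, confidence = proxy(row)
        if confidence < threshold:
            # Uncertain case: delegate to the expensive model.
            label = oracle(row)
            oracle_calls += 1
        labels.append(label)
    return labels, oracle_calls


def toy_proxy(row: str) -> Tuple[bool, float]:
    # Hypothetical cheap model: confident only on explicit keywords.
    if "great" in row:
        return True, 0.95
    if "awful" in row:
        return False, 0.95
    return False, 0.5


def toy_oracle(row: str) -> bool:
    # Hypothetical expensive model: also understands "not bad".
    return "great" in row or "not bad" in row


rows = ["great phone", "awful battery", "not bad at all"]
labels, oracle_calls = run_cascade(rows, toy_proxy, toy_oracle)
print(labels, oracle_calls)
```

In this toy run only the third row falls below the threshold, so the oracle is called once while the other two rows are answered by the proxy alone.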

Execution-Verified Reinforcement Learning for Optimization Modeling

arXiv:2604.00442v1 Announce Type: new Abstract: Automating optimization modeling with LLMs is a promising path toward scalable decision intelligence, but existing approaches either rely on agentic pipelines built on closed-source LLMs with high inference latency, or fine-tune smaller LLMs using costly process supervision that often overfits to a single solver API. Inspired by reinforcement learning with […]

AutoEG: Exploiting Known Third-Party Vulnerabilities in Black-Box Web Applications

arXiv:2604.00704v1 Announce Type: cross Abstract: Large-scale web applications are widely deployed with complex third-party components, inheriting security risks arising from component vulnerabilities. Security assessment is therefore required to determine whether such known vulnerabilities remain practically exploitable in real applications. Penetration testing is a widely adopted approach that validates exploitability by launching concrete attacks against known […]

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale

arXiv:2604.00813v1 Announce Type: cross Abstract: End-to-end autonomous driving has evolved from the conventional paradigm based on sparse perception into vision-language-action (VLA) models, which focus on learning language descriptions as an auxiliary task to facilitate planning. In this paper, we propose an alternative Vision-Geometry-Action (VGA) paradigm that advocates dense 3D geometry as the critical cue for […]

Spectral Compact Training: Pre-Training Large Language Models via Permanent Truncated SVD and Stiefel QR Retraction

arXiv:2604.00733v1 Announce Type: cross Abstract: The memory wall remains the primary bottleneck for training large language models on consumer hardware. We introduce Spectral Compact Training (SCT), a method that replaces dense weight matrices with permanent truncated SVD factors W = U diag(s) V^T, where the full dense matrix is never materialized during training or inference. […]
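To make the factorization W = U diag(s) V^T concrete, here is a minimal sketch of applying such a layer to a vector without ever forming the dense matrix. It illustrates only the factored matrix-vector product; the training procedure and the QR retraction named in the title are not shown, since the truncated abstract does not describe them. The function name `factored_matvec` and the shape convention (U is m x r, V is n x r, s has length r) are assumptions made for this example.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def factored_matvec(U: Matrix, s: Vector, V: Matrix, x: Vector) -> Vector:
    """Compute (U diag(s) V^T) x using only the rank-r factors.

    Assumed shapes: U is m x r, V is n x r, s has length r, x has length n.
    The m x n dense matrix is never built; the cost is O((m + n) * r).
    """
    r = len(s)
    n = len(x)
    # z = V^T x: project x onto the r right factors.
    z = [sum(V[j][k] * x[j] for j in range(n)) for k in range(r)]
    # Scale each coordinate by its entry of s.
    z = [s[k] * z[k] for k in range(r)]
    # y = U z: map back to the m-dimensional output space.
    return [sum(row[k] * z[k] for k in range(r)) for row in U]


# Toy rank-2 example with m = 3, n = 2.
U = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
s = [2.0, 3.0]
V = [[1.0, 0.0], [0.0, 1.0]]
y = factored_matvec(U, s, V, [1.0, 1.0])
print(y)
```

Storing the factors takes (m + n + 1) * r numbers instead of m * n, which is where the memory saving comes from when r is much smaller than m and n.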

Towards Reliable Truth-Aligned Uncertainty Estimation in Large Language Models

arXiv:2604.00445v1 Announce Type: new Abstract: Uncertainty estimation (UE) aims to detect hallucinated outputs of large language models (LLMs) to improve their reliability. However, UE metrics often exhibit unstable performance across configurations, which significantly limits their applicability. In this work, we formalise this phenomenon as proxy failure, since most UE metrics originate from model behaviour, rather […]

Non-ignorable fuzziness in granular counts: the case of RNA-seq data

arXiv:2604.00763v1 Announce Type: cross Abstract: RNA-seq count data are often affected by read-to-gene alignment ambiguity, especially in high-dimensional transcriptomics. This type of ambiguity can be conveniently expressed through granular counts, namely fuzzy-valued observations of latent discrete quantities. We study a class of fuzzy-reporting mechanisms and show that, when reporting exploits graded membership, ignorability fails generically, […]

WARP: Guaranteed Inner-Layer Repair of NLP Transformers

arXiv:2604.00938v1 Announce Type: cross Abstract: Transformer-based NLP models remain vulnerable to adversarial perturbations, yet existing repair methods face a fundamental trade-off: gradient-based approaches offer flexibility but lack verifiability and often overfit; methods that do provide repair guarantees are restricted to the final layer or small networks, significantly limiting the parameter search space available for repair. […]

Stiff-FCS: Single-Cell Stiffness Profiling With Integrated Molecular and Functional Analysis

arXiv:2604.00467v1 Announce Type: new Abstract: Cell stiffness is a key determinant of how cells deform, migrate, and adapt to mechanically restrictive environments, yet existing single-cell stiffness assays remain difficult to combine with molecular analysis and downstream functional studies. To address these limitations, we introduce a microfluidic platform, stiffness-based ferrohydrodynamic cell sorting (Stiff-FCS), designed for high-throughput […]

A Bilevel Integer Programming Approach for the Synchronous Attractor Control Problem

arXiv:2604.01018v1 Announce Type: cross Abstract: Boolean networks are dynamical models of disease development in which the activation levels of genes are represented by binary variables. Given a Boolean network, controls represent mutations or medical treatments that fix the activation levels of selected genes so that all states in every attractor (i.e., long-term recurrent states) satisfy […]
