POP: Prefill-Only Pruning for Efficient Large Model Inference

arXiv:2602.03295v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) and Vision-Language Models (VLMs) have demonstrated remarkable capabilities. However, their deployment is hindered by significant computational costs. Existing structured pruning methods, while hardware-efficient, often suffer from significant accuracy degradation. In this paper, we argue that this failure stems from a stage-agnostic pruning approach that overlooks the […]

Three-Phase Transformer

arXiv:2604.14430v1 Announce Type: cross Abstract: We present Three-Phase Transformer (3PT), a residual-stream structural prior for decoder-only Transformers on a standard SwiGLU + RMSNorm + RoPE + GQA backbone. The hidden vector is partitioned into N equally-sized cyclic channels, each maintained by phase-respecting ops: a per-channel RMSNorm, a 2D Givens rotation between attention and FFN that […]

Seeing Through Circuits: Faithful Mechanistic Interpretability for Vision Transformers

arXiv:2604.14477v1 Announce Type: new Abstract: Transparency of neural networks’ internal reasoning is at the heart of interpretability research, adding to trust, safety, and understanding of these models. The field of mechanistic interpretability has recently focused on studying task-specific computational graphs, defined by connections (edges) between model components. Such edge-based circuits have been defined in the […]

Crowdsourcing of Real-world Image Annotation via Visual Properties

arXiv:2604.14449v1 Announce Type: cross Abstract: Recent advances in data-centric artificial intelligence highlight inherent limitations in object recognition datasets. One of the primary issues stems from the semantic gap problem, which results in complex many-to-many mappings between visual data and linguistic descriptions. This bias adversely affects performance in computer vision tasks. This paper proposes an image […]

Exact Structural Abstraction and Tractability Limits

arXiv:2604.07349v5 Announce Type: replace-cross Abstract: Any rigorously specified problem determines an admissible-output relation $R$, and exact correctness depends only on the induced classes $s \sim_R s' \iff \mathrm{Adm}_R(s)=\mathrm{Adm}_R(s')$. Exact relevance certification asks which coordinates recover those classes. Decision, search, approximation, statistical, randomized, horizon, and distributional guarantees all reduce to this same quotient-recovery problem. Tractable cases […]

A Nonasymptotic Theory of Gain-Dependent Error Dynamics in Behavior Cloning

arXiv:2604.14484v1 Announce Type: cross Abstract: Behavior cloning (BC) policies on position-controlled robots inherit the closed-loop response of the underlying PD controller, yet the effect of controller gains on BC failure lacks a nonasymptotic theory. We show that independent sub-Gaussian action errors propagate through the gain-dependent closed-loop dynamics to yield sub-Gaussian position errors whose proxy matrix […]

Synchronized disease and behavioural dynamics in weakly coupled populations

arXiv:2604.14483v1 Announce Type: new Abstract: The spread of infectious disease is strongly influenced by social dynamics. In addition to infection risk, individuals' vaccination decisions depend on prevailing social behavior: high infection levels and widespread vaccination can increase vaccine uptake, which in turn suppresses infection. This feedback can generate sustained oscillations in disease prevalence and vaccination […]

CBCL: Safe Self-Extending Agent Communication

arXiv:2604.14512v1 Announce Type: cross Abstract: Agent communication languages (ACLs) enable heterogeneous agents to share knowledge and coordinate across diverse domains. This diversity demands extensibility, but expressive extension mechanisms can push the input language beyond the complexity classes where full validation is tractable. We present CBCL (Common Business Communication Language), an agent communication language that constrains […]

Generative Augmented Inference

arXiv:2604.14575v1 Announce Type: cross Abstract: Data-driven operations management often relies on parameters estimated from costly human-generated labels. Recent advances in large language models (LLMs) and other AI systems offer inexpensive auxiliary data, but introduce a new challenge: AI outputs are not direct observations of the target outcomes, but could involve high-dimensional representations with complex and […]

Pushing the Limits of On-Device Streaming ASR: A Compact, High-Accuracy English Model for Low-Latency Inference

arXiv:2604.14493v1 Announce Type: new Abstract: Deploying high-quality automatic speech recognition (ASR) on edge devices requires models that jointly optimize accuracy, latency, and memory footprint while operating entirely on CPU without GPU acceleration. We conduct a systematic empirical study of state-of-the-art ASR architectures, encompassing encoder-decoder, transducer, and LLM-based paradigms, evaluated across batch, chunked, and streaming inference […]

Context Over Content: Exposing Evaluation Faking in Automated Judges

arXiv:2604.15224v1 Announce Type: new Abstract: The $\textit{LLM-as-a-judge}$ paradigm has become the operational backbone of automated AI evaluation pipelines, yet rests on an unverified assumption: that judges evaluate text strictly on its semantic content, impervious to surrounding contextual framing. We investigate $\textit{stakes signaling}$, a previously unmeasured vulnerability where informing a judge model of the downstream consequences […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844