TAPIOCA: Why Task- Aware Pruning Improves OOD model Capability

arXiv:2605.14738v2 Announce Type: replace-cross Abstract: Recent work has promoted task-aware layer pruning as a way to improve model performance on particular tasks, as shown by TALE. In this paper, we investigate when such improvements occur and why. We show first that, across controlled polynomial regression tasks and large language models, such pruning yields no benefit […]

VGenST-Bench: A Benchmark for Spatio-Temporal Reasoning via Active Video Synthesis

arXiv:2605.22570v1 Announce Type: cross Abstract: Spatio-temporal reasoning is a core capability for Multimodal Large Language Models (MLLMs) operating in the real world. As such, evaluating it precisely has become an essential challenge. However, existing spatio-temporal reasoning benchmark datasets primarily rely on static image sets or passively curated video data, which limits the evaluation of fine-grained […]

Post-Training is About States, Not Tokens: A State Distribution View of SFT, RL, and On-Policy Distillation

arXiv:2605.22731v1 Announce Type: cross Abstract: Large language model post-training methods such as supervised fine-tuning (SFT), reinforcement learning (RL), and distillation are often analyzed through their loss functions: maximum likelihood, policy gradients, forward KL, reverse KL, or related objective-level variants. We study a complementary factor: the state distribution on which supervision is applied. For an autoregressive […]

UniSD: Towards a Unified Self-Distillation Framework for Large Language Models

arXiv:2605.06597v2 Announce Type: replace-cross Abstract: Self-distillation (SD) offers a promising path for adapting large language models (LLMs) without relying on stronger external teachers. However, SD in autoregressive LLMs remains challenging because self-generated trajectories are free-form, correctness is task-dependent, and plausible rationales can still provide unstable or unreliable supervision. Existing methods mainly examine isolated design choices, […]

Quantifying Rodda and Graham Gait Classification from 3D Makerless Kinematics derived from a Single-view Video in a Heterogeneous Pediatric Clinical Cohort

arXiv:2605.11314v2 Announce Type: replace-cross Abstract: Cerebral Palsy (CP) is a neurological disorder of movement and the most common cause of lifelong physical disability in childhood. Approximately 75% of children with CP are ambulatory, and accurate gait assessment is central to preserving walking function, which deteriorates by mid-adulthood in a quarter to half of adults with […]

Understanding Multimodal Failure in Action-Chunking Behavioral Cloning

arXiv:2605.22493v1 Announce Type: cross Abstract: Behavioral cloning becomes difficult when the same observation admits several valid actions. We study this problem for action-chunking policies and show that different multimodal parameterizations fail in different ways. For latent-variable policies, posterior-prior regularization makes deployment-time sampling more reliable, but excessive regularization removes the action-conditioned information needed to distinguish demonstrated […]

Characterizing the Fault Response of the Intel Neural Compute Stick 2 Under Single-Pulse Electromagnetic Fault Injection

arXiv:2605.22437v1 Announce Type: cross Abstract: Vision processing units and other commercial neural-network inference accelerators are increasingly deployed in safety-relevant edge applications, but their fault response under transient hardware disturbances remains poorly characterized in the open literature. For the Intel Movidius Myriad X, packaged as the Intel Neural Compute Stick 2 (NCS2), only a single feasibility […]

Beyond Temperature: Hyperfitting as a Late-Stage Geometric Expansion

arXiv:2605.22579v1 Announce Type: cross Abstract: Recent work has identified a counterintuitive phenomenon termed “Hyperfitting”, where fine-tuning Large Language Models (LLMs) to near-zero training loss on small datasets surprisingly enhances open-ended generation quality and mitigates repetition in greedy decoding. While effective, the underlying mechanism remains poorly understood, with the extremely low-entropy output distributions suggesting a potential […]

SPECTRA: Spectral Domain-Aware Graph Generation for Imbalanced Molecular Property Regression

arXiv:2511.04838v2 Announce Type: replace-cross Abstract: Molecular property regression struggles with cases in chemically relevant target ranges that are underrepresented in datasets. Standard average error minimization approaches underperform in these highly relevant cases, and oversampling approaches lead to meaningless molecular representations. In this paper, we propose SPECTRA, a spectral, domain-aware graph generation method designed to improve […]

Moral Semantics Survive Machine Translation: Cross-Lingual Evidence from Moral Foundations Corpora

arXiv:2605.22660v1 Announce Type: cross Abstract: Moral language is subtle and culturally variable, making it difficult to translate faithfully across languages. Idiomatic expressions, slang, and cultural references introduce hard-to-avoid translation artifacts. Yet automated moral values classification depends on language-specific annotated corpora that exist almost exclusively in English. We investigate whether LLM-based translation can bridge this gap, […]

Rethinking Forward Processes for Score-Based Nonlinear Data Assimilation in High Dimensions

arXiv:2604.02889v2 Announce Type: replace-cross Abstract: Data assimilation is the process of estimating the state of a dynamical system over time by combining model predictions with measurements. This task becomes challenging when the system is nonlinear and high-dimensional. To address this, score-based Bayesian filters have recently emerged. However, these methods still show unsatisfactory performance in certain […]

The Distillation Game: Adaptive Attacks & Efficient Defenses

arXiv:2605.22737v1 Announce Type: cross Abstract: Distillation attacks create a deployment trade-off for model providers: the same outputs that make a model more useful can also make it easier to imitate. We study this trade-off through a minimax game between a utility-constrained teacher and an adaptive student. Our framework yields tractable one-sided response rules: an adaptive […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844