arXiv:2605.21522v1 Announce Type: new Abstract: Protein-protein interactions (PPIs) govern nearly all cellular processes, yet computational methods for identifying binding partners typically produce ranked predictions without mechanistic justification. This creates a fundamental barrier to adoption because biologists cannot assess whether predictions reflect genuine biochemical insight or spurious correlations. We present Protein Thoughts, a framework that reformulates […]
The Distillation Game: Adaptive Attacks & Efficient Defenses
arXiv:2605.22737v1 Announce Type: cross Abstract: Distillation attacks create a deployment trade-off for model providers: the same outputs that make a model more useful can also make it easier to imitate. We study this trade-off through a minimax game between a utility-constrained teacher and an adaptive student. Our framework yields tractable one-sided response rules: an adaptive […]
Beyond Temperature: Hyperfitting as a Late-Stage Geometric Expansion
arXiv:2605.22579v1 Announce Type: cross Abstract: Recent work has identified a counterintuitive phenomenon termed “Hyperfitting”, where fine-tuning Large Language Models (LLMs) to near-zero training loss on small datasets surprisingly enhances open-ended generation quality and mitigates repetition in greedy decoding. While effective, the underlying mechanism remains poorly understood, with the extremely low-entropy output distributions suggesting a potential […]
A Constant-Time Implementation Methodology for Activation Functions on Microcontrollers
arXiv:2605.22441v1 Announce Type: cross Abstract: Embedded neural-network inference can leak information through timing side channels, including leakage caused by the evaluation of activation functions. This work proposes a constant-time implementation methodology for activation functions on embedded microcontrollers and validates it on ReLU, sigmoid, tanh, GELU, and Swish on an ARM Cortex-M4 platform. The proposed methodology […]
Hölder Policy Optimisation
arXiv:2605.12058v2 Announce Type: replace-cross Abstract: Group Relative Policy Optimisation (GRPO) enhances large language models by estimating advantages across a group of sampled trajectories. However, mapping these trajectory-level advantages to policy updates requires aggregating token-level probabilities within each sequence. Relying on a fixed aggregation mechanism for this step fundamentally limits the algorithm’s adaptability. Empirically, we observe […]
The Augmentation Trap: AI Productivity and the Cost of Cognitive Offloading
arXiv:2604.03501v3 Announce Type: replace-cross Abstract: Experimental evidence confirms that AI tools raise worker productivity, but also that sustained use can erode the expertise on which those gains depend. We develop a dynamic model in which a decision-maker chooses AI usage intensity for a worker over time, trading immediate productivity against the erosion of worker skill. […]
Walking the Tightrope of LLMs for Software Development: A Practitioners’ Perspective
arXiv:2511.06428v2 Announce Type: replace-cross Abstract: Background: Large Language Models emerged with the potential of provoking a revolution in software development (e.g., automating processes, workforce transformation). Although studies have started to investigate the perceived impact of LLMs for software development, there is a need for empirical studies to comprehend how to balance forward and backward effects […]
X-SYNTH: Beyond Retrieval — Enterprise Context Synthesis from Observed Digital Human Attention
arXiv:2605.15505v2 Announce Type: replace Abstract: In enterprise operations, the context required for an AI agent task is scattered across systems of record, static information stores, and communication channels. What is stored is system state, a lossy representation of the work that actually happened. The prevailing approach retrieves by matching request content to what is stored; […]
When Grammar Guides the Attack: Uncovering Control-Plane Vulnerabilities in LLMs with Structured Output
arXiv:2503.24191v3 Announce Type: replace-cross Abstract: Content Warning: This paper may contain unsafe or harmful content generated by LLMs that may be offensive to readers. Large Language Models (LLMs) increasingly serve as tooling platforms through structured output APIs, but the grammar-guided decoding that powers this feature opens a critical control-plane attack surface orthogonal to traditional data-plane […]
Billion-Scale Graph Foundation Models
arXiv:2602.04768v2 Announce Type: replace-cross Abstract: Graph-structured data underpins many critical applications. While foundation models have transformed language and vision via large-scale pretraining and lightweight adaptation, extending this paradigm to general, real-world graphs is challenging. In this work, we present Graph Billion-Foundation-Fusion (GraphBFF): an end-to-end recipe for building billion-parameter Graph Foundation Models (GFMs) for large-scale heterogeneous […]
Trees to Flows and Back: Unifying Decision Trees and Diffusion Models
arXiv:2605.00414v2 Announce Type: replace-cross Abstract: Decision trees and diffusion models are ostensibly disparate model classes, one discrete and hierarchical, the other continuous and dynamic. This work unifies the two by establishing a crisp mathematical correspondence between hierarchical decision trees and diffusion processes in appropriate limiting regimes. Our unification reveals a shared optimization principle: Global Trajectory […]
Lens Privacy Sealing: A New Benchmark and Method for Physical Privacy-Preserving Action Recognition
arXiv:2605.19578v2 Announce Type: replace-cross Abstract: RGB camera-based surveillance systems enable human action recognition for public safety and healthcare, yet raise serious privacy concerns. Existing methods rely on post-capture algorithms, which fail to protect privacy during data acquisition. We propose Lens Privacy Sealing (LPS), a simple hardware solution that physically obscures camera lenses with adjustable laminating […]