arXiv:2605.22874v1 Announce Type: new Abstract: Effectively translating between natural language (NL) and formal logics like Linear Temporal Logic (LTL) requires expertise that limits formal verification’s reach in safety-critical development. Template-based approaches sacrifice expressiveness for reliability; neural methods achieve fluency but provide no correctness guarantees. We present NeuroNL2LTL, a neurosymbolic architecture unifying learned translation with formal […]
A measurement substrate for agentic Kubernetes operations: Methodology and a case study in retrieval-compounding falsification
arXiv:2605.23058v1 Announce Type: cross Abstract: Empirical claims about autonomous Kubernetes operations agents are largely unfalsifiable. Published work reports observational results without controlled comparisons against an agent-disabled baseline, selection bias is endemic, pre-registered decision matrices are absent, and samples are typically too small for the noise level of the underlying scoring system. The cause is the […]
ArcMark: Distortion-Free Multi-Byte LLM Watermark via Optimal Transport
arXiv:2602.07235v2 Announce Type: replace-cross Abstract: Watermarking is an important tool for promoting the responsible use of large language models (LLMs). Existing watermarks insert a signal into generated tokens that either flags LLM-generated text (zero-bit watermarking) or encodes more complex messages (multi-bit watermarking). Though a number of recent approaches insert multiple bits into text without perturbing […]
Philosophical Dispositions as Behavioral Constraints for AI-Assisted Code Review: An Empirical Study
arXiv:2605.23108v1 Announce Type: cross Abstract: AI-assisted code review tools typically operate as generic “expert reviewer” agents, producing homogeneous findings regardless of the analysis type needed. We present a system that constrains AI reviewer behavior through philosophical dispositions — coherent personality lenses grounded in specific epistemological traditions (Pyrrhonist Skepticism, Navya-Ny=aya logic, Diogenes’ Cynicism, Confucian relational ethics) […]
Mediative Fuzzy Logic: From Type-1 Foundations to Type-2, Type-3 and Quantum Extensions
arXiv:2605.22900v1 Announce Type: new Abstract: Mediative Fuzzy Logic was conceived as a practical scheme for reconciling hesitant or conflicting assessments in fuzzy control and decision-making. However, its logical and semantic foundations remain underdeveloped, especially beyond operational type-1 settings. This article develops a unified account of the type-1 core together with interval type-2, granular type-3, and […]
Generative AI and the Reorganization of Labor Demand
arXiv:2605.23159v1 Announce Type: cross Abstract: Generative artificial intelligence (AI) is expected to transform work, but less is known about how firms reorganize labor demand as the technology diffuses. Existing research has largely focused on which occupations are exposed to AI or whether exposed jobs decline. We extend this debate by examining whether firms adjust by […]
Representation over Routing: Overcoming Surrogate Hacking in Multi-Timescale PPO
arXiv:2604.13517v3 Announce Type: replace-cross Abstract: Temporal credit assignment in reinforcement learning has long been a central challenge. Inspired by the multi-timescale encoding of the dopamine system in neurobiology, recent research has sought to introduce multiple discount factors into Actor-Critic architectures, such as Proximal Policy Optimization (PPO), to balance short-term responses with long-term planning. However, this […]
FastKernels: Benchmarking GPU Kernel Generation in Production
arXiv:2605.23215v1 Announce Type: cross Abstract: LLM-based agents for GPU kernel generation are advancing rapidly, yet their progress is fundamentally constrained by the benchmarks they optimize against. Existing benchmarks are poorly aligned with production inference frameworks: they evaluate kernels on a single GPU with synthetic inputs, ignore the surrounding compilation stack, and reward replicating known optimizations […]
EVE-Agent: Evidence-Verifiable Self-Evolving Agents
arXiv:2605.22905v1 Announce Type: new Abstract: Self-evolving agents should not train on examples they cannot justify. Data-free self-evolving search agents offer a scalable route to systems that generate their own questions, answer them, and improve from their own feedback without human annotations. Yet, without verifiable evidence, this loop can reward fluent but unsupported examples, turning the […]
ChainFlow-VLA: Causal Flow Planning with Vision-Language Models
arXiv:2605.23270v1 Announce Type: cross Abstract: Current end-to-end autonomous driving systems are fundamentally limited by a mismatch between temporal causal reasoning and global trajectory consistency. Autoregressive (AR) models capture interaction-aware temporal dependencies via causal factorization, but their step-wise decoding leads to error accumulation and suboptimal global structure. In contrast, diffusion models optimize trajectories globally but lack […]
Sutra: Tensor-Op RNNs as a Compilation Target for Vector Symbolic Architectures
arXiv:2605.20919v2 Announce Type: replace-cross Abstract: Sutra is a typed, purely functional programming language whose compiled forward pass is a PyTorch neural network. The compiler beta-reduces the whole program — primitives, control flow, string I/O — to one fused tensor-op graph over a frozen embedding substrate. Rotation binding, unbind, bundle, polynomial Kleene three-valued logic, and tail-recursive […]
Score-Based One-step MeanFlow Policy Optimization
arXiv:2605.23365v1 Announce Type: cross Abstract: Diffusion and flow matching have emerged as expressive policy classes in reinforcement learning, but their reliance on multi-step denoising imposes substantial computational overhead at inference time, which is particularly problematic in online RL. MeanFlow offers a promising alternative by learning an average velocity field that maps noise to data in […]