arXiv:2605.11034v1 Announce Type: cross Abstract: We present MambaNetBurst, a compact tokenizer-free byte-level sequence classifier for network burst classification based on a Mamba-2 backbone. In contrast to most recent strong traffic-classification and intrusion-detection approaches, our method operates directly on raw packet bytes, avoids tokenization, patching, and heavy engineered multimodal representations, and does not require any self-supervised […]
CVEvolve: Autonomous Algorithm Discovery for Unstructured Scientific Data Processing
arXiv:2605.11359v1 Announce Type: new Abstract: Scientific data processing often requires task-specific algorithms or AI models, creating a barrier for domain scientists who need to analyze their data but may not have extensive computing or image-processing expertise. This barrier is especially pronounced when data are noisy, have a high dynamic range, are sparsely labeled, or are […]
Read, Extract, Classify: A Tool for Smarter Requirements Engineering
arXiv:2605.11045v1 Announce Type: cross Abstract: This paper presents the ReXCL tool, which automates the extraction and classification processes in requirements engineering, enhancing the software development life-cycle. The tool features two main modules: Extraction, which processes raw requirement documents into a predefined schema using heuristics and predictive modeling, and Classification, which assigns class labels to requirements […]
Beyond GRPO and On-Policy Distillation: An Empirical Sparse-to-Dense Reward Principle for Language-Model Post-Training
arXiv:2605.12483v1 Announce Type: cross Abstract: In settings where labeled verifiable training data is the binding constraint, each checked example should be allocated carefully. The standard practice is to use this data directly on the model that will be deployed, for example by running GRPO on the deployment student. We argue that this is often an […]
MCPShield: Content-Aware Attack Detection for LLM Agent Tool-Call Traffic
arXiv:2605.11053v1 Announce Type: cross Abstract: The Model Context Protocol (MCP) has become a widely adopted interface for LLM agents to invoke external tools, yet learned monitoring of MCP tool-call traffic remains underexplored. In this article, MCPShield is presented as an attack detection framework for MCP tool-call traffic that encodes each agent session as a graph […]
Causal Bias Detection in Generative Artificial Intelligence
arXiv:2605.11365v1 Announce Type: new Abstract: Automated systems built on artificial intelligence (AI) are increasingly deployed across high-stakes domains, raising critical concerns about fairness and the perpetuation of demographic disparities that exist in the world. In this context, causal inference provides a principled framework for reasoning about fairness, as it links observed disparities to underlying mechanisms […]
Newton’s Lantern: A Reinforcement Learning Framework for Finetuning AC Power Flow Warm Start Models
arXiv:2605.11102v1 Announce Type: cross Abstract: Neural warm starts can sharply reduce the number of Newton-Raphson iterations required to solve the AC power flow problem, but existing supervised approaches generalize poorly on heavily loaded instances near voltage collapse. We prove a lower bound on the Newton-Raphson iteration count that depends on the direction of the warm […]
Viral population dynamics at the cellular level, considering the replication cycle
arXiv:2510.14481v3 Announce Type: replace Abstract: We develop a stochastic framework for viral population dynamics at the cellular level that explicitly incorporates the replication cycle with random stage durations. The model is formulated as a structured birth-death process coupled with a renewal description of intracellular progression, allowing for general distributions of stage completion times. Within this […]
HEPA: A Self-Supervised Horizon-Conditioned Event Predictive Architecture for Time Series
arXiv:2605.11130v1 Announce Type: cross Abstract: Critical events in multivariate time series, from turbine failures to cardiac arrhythmias, demand accurate prediction, yet labeled data is scarce because such events are rare and costly to annotate. We introduce HEPA (Horizon-conditioned Event Predictive Architecture), built on two key principles. First, a causal Transformer encoder is pretrained via a […]
Causal Algorithmic Recourse: Foundations and Methods
arXiv:2605.11373v1 Announce Type: new Abstract: The trustworthiness of AI decision-making systems is increasingly important. A key feature of such systems is the ability to provide recommendations for how an individual may reverse a negative decision, a problem known as algorithmic recourse. Existing approaches treat recourse outcomes as counterfactuals of a fixed unit, ignoring that real-world […]
Quantifying the Reconstructability of Astrophysical Methods with Large Language Models and Information Theory: A Case Study in Spectral Reconstruction
arXiv:2605.11154v1 Announce Type: cross Abstract: Modern astrophysical studies rely heavily on complex data analysis pipelines; however, published descriptions often lack the detail required for computational reproducibility. In this work, we present an information-theoretic framework to quantify how effectively a method can be reconstructed from its written description. By treating algorithmic reconstruction as a probability distribution […]
VASR: Variance-Aware Systematic Resampling for Reward-Guided Diffusion
arXiv:2604.06779v2 Announce Type: replace Abstract: Sequential Monte Carlo (SMC) samplers for reward-guided diffusion models often suffer from rapid lineage collapse: a few high-reward particles dominate the population within a handful of resampling steps, destroying diversity and degrading sample quality. We propose a variance-decomposition framework for reward-guided diffusion SMC that separates continuation variance $V_t^{\mathrm{cont}}$ from residual […]