arXiv:2603.22327v1 Announce Type: cross Abstract: Systematic literature reviews are essential for synthesizing scientific evidence but are costly, difficult to scale and time-intensive, creating bottlenecks for evidence-based policy. We study whether large language models can automate the complete systematic review workflow, from article retrieval and screening through data extraction to report synthesis. Applied to epidemiological reviews of […]
T-MAP: Red-Teaming LLM Agents with Trajectory-aware Evolutionary Search
arXiv:2603.22341v1 Announce Type: cross Abstract: While prior red-teaming efforts have focused on eliciting harmful text outputs from large language models (LLMs), such approaches fail to capture agent-specific vulnerabilities that emerge through multi-step tool execution, particularly in rapidly growing ecosystems such as the Model Context Protocol (MCP). To address this gap, we propose a trajectory-aware evolutionary […]
Reasoner-Executor-Synthesizer: Scalable Agentic Architecture with Static O(1) Context Window
arXiv:2603.22367v1 Announce Type: cross Abstract: Large Language Models (LLMs) deployed as autonomous agents commonly use Retrieval-Augmented Generation (RAG), feeding retrieved documents into the context window, which creates two problems: the risk of hallucination grows with context length, and token cost scales linearly with dataset size. We propose the Reasoner-Executor-Synthesizer (RES) architecture, a three-layer design that […]
Symbolic Graph Networks for Robust PDE Discovery from Noisy Sparse Data
arXiv:2603.22380v1 Announce Type: cross Abstract: Data-driven discovery of partial differential equations (PDEs) offers a promising paradigm for uncovering governing physical laws from observational data. However, in practical scenarios, measurements are often contaminated by noise and limited by sparse sampling, which poses significant challenges to existing approaches based on numerical differentiation or integral formulations. In this […]
Functional Component Ablation Reveals Specialization Patterns in Hybrid Language Model Architectures
arXiv:2603.22473v1 Announce Type: cross Abstract: Hybrid language models combining attention with state space models (SSMs) or linear attention offer improved efficiency, but whether both components are genuinely utilized remains unclear. We present a functional component ablation framework applied to two sub-1B hybrid models — Qwen3.5-0.8B (sequential: Gated DeltaNet + softmax attention) and Falcon-H1-0.5B (parallel: Mamba-2 […]
STRIATUM-CTF: A Protocol-Driven Agentic Framework for General-Purpose CTF Solving
arXiv:2603.22577v1 Announce Type: cross Abstract: Large Language Models (LLMs) have demonstrated potential in code generation, yet they struggle with the multi-step, stateful reasoning required for offensive cybersecurity operations. Existing research often relies on static benchmarks that fail to capture the dynamic nature of real-world vulnerabilities. In this work, we introduce STRIATUM-CTF (A Search-based Test-time Reasoning […]
LGSE: Lexically Grounded Subword Embedding Initialization for Low-Resource Language Adaptation
arXiv:2603.22629v1 Announce Type: cross Abstract: Adapting pretrained language models to low-resource, morphologically rich languages remains a significant challenge. Existing vocabulary expansion methods typically rely on arbitrarily segmented subword units, resulting in fragmented lexical representations and loss of critical morphological information. To address this limitation, we propose the Lexically Grounded Subword Embedding Initialization (LGSE) framework, which […]
DALDALL: Data Augmentation for Lexical and Semantic Diverse in Legal Domain by leveraging LLM-Persona
arXiv:2603.22765v1 Announce Type: cross Abstract: Data scarcity remains a persistent challenge in low-resource domains. While existing data augmentation methods leverage the generative capabilities of large language models (LLMs) to produce large volumes of synthetic data, these approaches often prioritize quantity over quality and lack domain-specific strategies. In this work, we introduce DALDALL, a persona-based data […]
Agent-Sentry: Bounding LLM Agents via Execution Provenance
arXiv:2603.22868v1 Announce Type: cross Abstract: Agentic computing systems, which autonomously spawn new functionalities based on natural language instructions, are becoming increasingly prevalent. While immensely capable, these systems raise serious security, privacy, and safety concerns. Fundamentally, the full set of functionalities offered by these systems, combined with their probabilistic execution flows, is not known beforehand. Given […]
From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents
arXiv:2603.22386v1 Announce Type: new Abstract: Large language model (LLM)-based systems are becoming increasingly popular for solving tasks by constructing executable workflows that interleave LLM calls, information retrieval, tool use, code execution, memory updates, and verification. This survey reviews recent methods for designing and optimizing such workflows, which we treat as agentic computation graphs (ACGs). We […]
Sketching a Space of Brain States
arXiv:2603.22296v1 Announce Type: new Abstract: Brain functional connectivity alterations, that is, pathological changes in the signal exchange between areas of the brain, occur in several neurological diseases, including neurodegenerative and neuropsychiatric ones. They consist in changes in how brain functional networks operate. By conceptualising a brain space as a space whose points are connectome configurations […]
Not All Tokens Are Created Equal: Query-Efficient Jailbreak Fuzzing for LLMs
arXiv:2603.23269v1 Announce Type: cross Abstract: Large Language Models (LLMs) are widely deployed, yet are vulnerable to jailbreak prompts that elicit policy-violating outputs. Although prior studies have uncovered these risks, they typically treat all tokens as equally important during prompt mutation, overlooking the varying contributions of individual tokens to triggering model refusals. Consequently, these attacks introduce substantial […]