arXiv:2603.29651v1 Announce Type: cross Abstract: Semantic interaction (SI) enables analysts to incorporate their cognitive processes into AI models through direct manipulation of visualizations. While SI frameworks for narrative extraction have been proposed, empirical evaluations of their effectiveness remain limited. This paper presents a user study that evaluates SI for narrative map sensemaking, involving 33 participants […]
A Convex Route to Thermomechanics: Learning Internal Energy and Dissipation
arXiv:2603.28707v2 Announce Type: replace-cross Abstract: We present a physics-based neural network framework for the discovery of constitutive models in fully coupled thermomechanics. In contrast to classical formulations based on the Helmholtz energy, we adopt the internal energy and a dissipation potential as primary constitutive functions, expressed in terms of deformation and entropy. This choice avoids […]
CLAUSE: Agentic Neuro-Symbolic Knowledge Graph Reasoning via Dynamic Learnable Context Engineering
arXiv:2509.21035v2 Announce Type: replace Abstract: Knowledge graphs provide structured context for multi-hop question answering, but deployed systems must balance answer accuracy with strict latency and cost targets while preserving provenance. Static k-hop expansions and “think-longer” prompting often over-retrieve, inflate context, and yield unpredictable runtime. We introduce CLAUSE, an agentic three-agent neuro-symbolic framework that treats context […]
CoMaTrack: Competitive Multi-Agent Game-Theoretic Tracking with Vision-Language-Action Models
arXiv:2603.22846v2 Announce Type: replace Abstract: Embodied Visual Tracking (EVT), a core dynamic task in embodied intelligence, requires an agent to precisely follow a language-specified target. Yet most existing methods rely on single-agent imitation learning, suffering from costly expert data and limited generalization due to static training environments. Inspired by competition-driven capability evolution, we propose CoMaTrack, […]
IMAGAgent: Orchestrating Multi-Turn Image Editing via Constraint-Aware Planning and Reflection
arXiv:2603.29602v1 Announce Type: cross Abstract: Existing multi-turn image editing paradigms are often confined to isolated single-step execution. Due to a lack of context-awareness and closed-loop feedback mechanisms, they are prone to error accumulation and semantic drift during multi-turn interactions, ultimately resulting in severe structural distortion of the generated images. To address this, we propose IMAGAgent, a […]
SecureVibeBench: Evaluating Secure Coding Capabilities of Code Agents with Realistic Vulnerability Scenarios
arXiv:2509.22097v2 Announce Type: replace-cross Abstract: Large language model-powered code agents are rapidly transforming software engineering, yet the security risks of their generated code have become a critical concern. Existing benchmarks have provided valuable insights, but they fail to capture scenarios in which vulnerabilities are actually introduced by human developers, making fair comparisons between humans and […]
Heracles: Bridging Precise Tracking and Generative Synthesis for General Humanoid Control
arXiv:2603.27756v2 Announce Type: replace-cross Abstract: Achieving general-purpose humanoid control requires a delicate balance between the precise execution of commanded motions and the flexible, anthropomorphic adaptability needed to recover from unpredictable environmental perturbations. Current general controllers predominantly formulate motion control as a rigid reference-tracking problem. While effective in nominal conditions, these trackers often exhibit brittle, non-anthropomorphic […]
The Mouth is Not the Brain: Bridging Energy-Based World Models and Language Generation
arXiv:2601.17094v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) generate fluent text, yet whether they truly understand the world or merely produce plausible texts about it remains contested. We propose an architectural principle, the mouth is not the brain, that explicitly separates world models from language models. Our architecture comprises three components: a DBM that […]
Dynamic Cogeneration of Bug Reproduction Test in Agentic Program Repair
arXiv:2601.19066v2 Announce Type: replace-cross Abstract: Bug Reproduction Tests (BRTs) have been used in many Automated Program Repair (APR) systems, primarily for validating promising fixes and aiding fix generation. In practice, when developers submit a patch, they often implement the BRT alongside the fix. Our experience deploying agentic APR reveals that developers similarly desire a BRT […]
Magic Words or Methodical Work? Challenging Conventional Wisdom in LLM-Based Political Text Annotation
arXiv:2603.26898v2 Announce Type: replace-cross Abstract: Political scientists are rapidly adopting large language models (LLMs) for text annotation, yet the sensitivity of annotation results to implementation choices remains poorly understood. Most evaluations test a single model or configuration; how model choice, model size, learning approach, and prompt style interact, and whether popular “best practices” survive controlled […]
Modeling Spatiotemporal Neural Frames for High Resolution Brain Dynamic
arXiv:2603.24176v2 Announce Type: replace-cross Abstract: Capturing dynamic spatiotemporal neural activity is essential for understanding large-scale brain mechanisms. Functional magnetic resonance imaging (fMRI) provides high-resolution cortical representations that form a strong basis for characterizing fine-grained brain activity patterns. However, the high acquisition cost of fMRI limits large-scale applications, making high-quality fMRI reconstruction a crucial task. Electroencephalography […]
TSHA: A Benchmark for Visual Language Models in Trustworthy Safety Hazard Assessment Scenarios
arXiv:2603.29759v1 Announce Type: cross Abstract: Recent advances in vision-language models (VLMs) have accelerated their application to indoor safety hazard assessment. However, existing benchmarks suffer from three fundamental limitations: (1) heavy reliance on synthetic datasets constructed via simulation software, creating a significant domain gap with real-world environments; (2) oversimplified safety tasks with artificial constraints on hazard […]