Reasoning-Guided Grounding: Elevating Video Anomaly Detection through Multimodal Large Language Models

arXiv:2605.02912v1 Announce Type: cross Abstract: Video Anomaly Detection (VAD) has traditionally been framed as binary classification or outlier detection, providing neither interpretable reasoning nor precise spatial localization of anomalous events. While Vision-Language Models (VLMs) offer rich scene understanding, they struggle with reliable spatial grounding – often producing hallucinated or geometrically invalid bounding boxes when asked […]

Benchmarking Document Parsers on Mathematical Formula Extraction from PDFs

arXiv:2512.09874v2 Announce Type: replace-cross Abstract: Correctly parsing mathematical formulas from PDFs is critical for training large language models and building scientific knowledge bases from academic literature, yet existing benchmarks either exclude formulas entirely or lack semantically-aware evaluation metrics. We introduce a benchmarking framework centered on synthetically generated PDFs with precise LaTeX ground truth, enabling systematic […]

Delay, Plateau, or Collapse: Evaluating the Impact of Systematic Verification Error on RLVR

arXiv:2605.02909v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has become a powerful approach for improving the reasoning capabilities of large language models (LLMs). While RLVR is designed for tasks with verifiable ground-truth answers, real-world verifiers (e.g., static code checkers) can introduce errors into the reward signal. Prior analyses have largely treated such […]

DynoSys: A Dynamic Systems Framework for Multimodal Integration of Genetic, Environmental, and Neurobiological Signals

arXiv:2605.02952v1 Announce Type: new Abstract: Understanding the development of adolescent behavioral and mental health outcomes requires integrating genetic predisposition, environmental exposures, and neurobiological processes over time. Here, we present a unified quantitative framework that models the human body as a dynamic system, where genetic factors form the foundational state, environmental exposures act as time-varying inputs, […]

Multi Language Models for On-the-Fly Syntax Highlighting

arXiv:2510.04166v2 Announce Type: replace-cross Abstract: Syntax highlighting is a critical feature in modern software development environments, enhancing code readability and developer productivity. However, delivering accurate highlighting in real time remains challenging for online and web-based development tools due to strict time and memory constraints on backend services. These systems must serve highlights rapidly and frequently, […]

Programmatic Context Augmentation for LLM-based Symbolic Regression

arXiv:2605.03101v1 Announce Type: new Abstract: Symbolic regression (SR), the task of discovering mathematical expressions that best describe a given dataset, remains a fundamental challenge in scientific discovery. Traditional approaches, primarily based on genetic algorithms and related evolutionary methods, have proven useful but suffer from scalability and expressivity limitations. Recently, large language model (LLM)-based evolutionary search […]

Mechanism-Faithful Queueing Simulation Model Translation with Large Language Model Support

arXiv:2601.06543v2 Announce Type: replace-cross Abstract: Queueing simulation studies often require substantial manual effort to translate conceptual system descriptions into executable programs and to verify that the implemented mechanisms match the intended queueing logic. Although large language models (LLMs) may produce executable scripts, executability alone is insufficient when arrival, routing, interruption, or reporting logic is wrong. […]

When Safety Geometry Collapses: Fine-Tuning Vulnerabilities in Agentic Guard Models

arXiv:2605.02914v1 Announce Type: cross Abstract: A guard model fine-tuned on entirely benign data can lose all safety alignment — not through adversarial manipulation, but through standard domain specialization. We demonstrate this failure across three purpose-built safety classifiers — LlamaGuard, WildGuard, and Granite Guardian — deployed as protection layers in agentic AI pipelines, and show that […]

VCBench: Benchmarking LLMs in Venture Capital

arXiv:2509.14448v2 Announce Type: replace Abstract: Benchmarks such as SWE-bench and ARC-AGI demonstrate how shared datasets accelerate progress toward artificial general intelligence (AGI). We introduce VCBench, the first benchmark for predicting founder success in venture capital (VC), a domain where signals are sparse, outcomes are uncertain, and even top investors perform modestly. At inception, the market […]

TRACE: A Metrologically-Grounded Engineering Framework for Trustworthy Agentic AI Systems in Operationally Critical Domains

arXiv:2605.03838v1 Announce Type: cross Abstract: We introduce TRACE, a cross-domain engineering framework for trustworthy agentic AI in operationally critical domains. TRACE combines a four-layer reference architecture with an explicit classical-ML vs. LLM-validator split (L2a/L2b), a stateful orchestration-and-escalation policy (L3), and bounded human supervision (L4); a metrologically grounded trust-metric suite mapped to GUM/VIM/ISO 17025; and a […]

Feature-Augmented Transformers for Robust AI-Text Detection Across Domains and Generators

arXiv:2605.03969v1 Announce Type: cross Abstract: AI-generated text is nowadays produced at scale across domains and heterogeneous generation pipelines, making robustness to distribution shift a central requirement for supervised binary detectors. We train transformer-based detectors on HC3 PLUS and calibrate a single decision threshold by maximising balanced accuracy on held-out validation; this threshold is then kept […]

A second-order method landing on the Stiefel manifold via Newton–Schulz iteration

arXiv:2605.02838v2 Announce Type: replace-cross Abstract: Retraction-free approaches offer attractive low-cost alternatives to Riemannian methods on the Stiefel manifold, but they are often first-order, which may limit the efficiency under high-accuracy requirements. To this end, we propose a second-order method landing on the Stiefel manifold without invoking retractions, which is proved to enjoy local quadratic (or […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844