PaSBench-Video: A Streaming Video Benchmark for Proactive Safety Warning

arXiv:2606.02443v1 Announce Type: cross Abstract: Between the first visible sign of danger and the moment an accident occurs, there is often a window where intervention remains possible. Video-capable multimodal large language models (MLLMs) could serve as always-on safety monitors that issue warnings during this window. Yet current benchmarks do not test this ability: they rely […]

DiffCrossGait: Trajectory-Level Alignment for 2D-3D Cross-Modal Gait Recognition via Latent Diffusion

arXiv:2606.00153v1 Announce Type: cross Abstract: Cross-modal 2D-3D gait recognition is impeded by inherent domain discrepancies between 2D silhouette and 3D LiDAR range-view representations. While prior methods align only final embeddings, we propose DiffCrossGait, which reformulates cross-modal matching as trajectory-level alignment in an identity-relevant latent diffusion space, rather than assuming full equivalence between 2D and 3D […]

Capability Self-Assessment: Teaching LLMs to Know Their Limits

arXiv:2606.00251v1 Announce Type: new Abstract: The ability to recognize one’s own limitations and decide whether to solve a problem or delegate is fundamental for reliable intelligent systems. Yet we show that modern large language models systematically lack this ability: across diverse model families and scales, they overestimate their competence and attempt queries they cannot solve. […]

ChurnNet: A Optimized Modern AI for Churn Prediction

arXiv:2606.00169v1 Announce Type: cross Abstract: Increased competition and the growing similarity of products and services offered by retailers have lowered the barriers for customers to switch to competitors. Accurate churn prediction can be a valuable tool for driving effective personalized marketing campaigns and helping to reduce customer attrition. This study evaluates the performance of traditional […]

ACON: Optimizing Context Compression for Long-horizon LLM Agents

arXiv:2510.00615v3 Announce Type: replace Abstract: Large language models (LLMs) are increasingly deployed as agents in dynamic real-world environments, where success depends on maintaining precise records of actions and observations. However, the resulting unbounded context growth in long-horizon agentic tasks creates two critical bottlenecks: prohibitive inference memory costs and reasoning degradation due to irrelevant information. Existing […]

From Rashomon Theory to PRAXIS: Efficient Decision Tree Rashomon Sets

arXiv:2606.00202v1 Announce Type: cross Abstract: Standard machine learning pipelines often admit many near-optimal models. These “Rashomon sets” pose a range of challenges and opportunities for uncertainty-aware, robust decision making. They allow users to incorporate domain knowledge and preferences that would otherwise be difficult to specify directly in an objective, and they quantify diversity among valid […]

Closed-Loop Neural Activation Control in Vision-Language-Action Models

arXiv:2606.00269v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models can be steered at test time by intervening on semantically meaningful internal directions, but existing methods use a fixed steering coefficient, effectively operating in open loop. This is poorly suited to embodied control, where task state and concept error evolve over time, often causing overcorrection, oscillation, and […]

When Softmax Fails at the Top: Extreme Value Corrections for InfoNCE

arXiv:2606.00262v1 Announce Type: cross Abstract: InfoNCE is the standard contrastive learning objective, but its softmax form is not only a computational convenience: it also encodes a statistical assumption about how the top-scoring example is selected. Using extreme value theory, we show that this assumption is often misaligned with the normalized embedding setting used in modern […]

How Generation Architecture Shapes Code Complexity in Multi-Agent LLM Systems: A Paired Study on HumanEval

arXiv:2606.00308v1 Announce Type: cross Abstract: Large-language-model code generation has shifted from single-shot prompting to multi-agent orchestrations – analyst, coder, tester, and debugger pipelines – and is evaluated almost exclusively on functional correctness. Whether these architectures also affect the structural complexity of the code they produce, and which orchestration layers carry the cost, remains largely unexamined: […]

Robust Shielding for Safe Reinforcement Learning

arXiv:2606.00270v1 Announce Type: new Abstract: Shielding is an effective approach to formally guarantee the safety of reinforcement learning agents in Markov decision processes (MDPs). However, existing shielding techniques typically assume knowledge of the safety-relevant transition dynamics – a requirement that is seldom met in practice. To address this limitation, we introduce a novel shielding framework […]

Agentic Authoring of Interactive Multiview Visualizations in Genomics

arXiv:2606.00370v1 Announce Type: cross Abstract: Diverse genomics data, scientific questions, and analysis tasks typically demand highly specialized visualizations. Therefore, users often must customize or author new ones tailored to their data. Existing tools are usually either limited in customization or require substantial learning or programming, and even expressive tools assume visualization expertise many users lack. […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number: 16808844.