DRBENCHER: Can Your Agent Identify the Entity, Retrieve Its Properties and Do the Math?

arXiv:2604.09251v2 Announce Type: replace Abstract: Deep research agents increasingly interleave web browsing with multi-step computation, yet existing benchmarks evaluate these capabilities in isolation, creating a blind spot in assessing real-world performance. We introduce DRBENCHER, a synthetic benchmark generator for questions that require both browsing and computation. It enforces four criteria: verifiability (gold answers are computed […]

The Economics of p(doom): Scenarios of Existential Risk and Economic Growth in the Age of Transformative AI

arXiv:2503.07341v2 Announce Type: replace-cross Abstract: Recent advances in artificial intelligence (AI) have led to a wide range of predictions about its long-term impact on humanity. A central focus is the potential emergence of transformative AI (TAI), eventually capable of outperforming humans in all economically valuable tasks and fully automating labor. Discussed scenarios range from unprecedented […]

Towards Multimodal Active Learning: Efficient Learning with Limited Paired Data

arXiv:2510.03247v2 Announce Type: replace-cross Abstract: Active learning (AL) is a principled strategy to reduce annotation cost in data-hungry deep learning. However, existing AL algorithms focus almost exclusively on unimodal data, overlooking the substantial annotation burden in multimodal learning. We introduce the first framework for multimodal active learning with unaligned data, where the learner must actively […]

Language-Conditioned Safe Trajectory Generation for Spacecraft Rendezvous

arXiv:2512.09111v3 Announce Type: replace-cross Abstract: Reliable real-time trajectory generation is essential for future autonomous spacecraft. While recent progress in nonconvex guidance and control is paving the way for onboard autonomous trajectory optimization, these methods still rely on extensive expert input (e.g., waypoints, constraints, mission timelines, etc.), which limits operational scalability in complex missions such as […]

Intent Laundering: AI Safety Datasets Are Not What They Seem

arXiv:2602.16729v3 Announce Type: replace-cross Abstract: We systematically evaluate the quality of widely used adversarial safety datasets from two perspectives: in isolation and in practice. In isolation, we examine how well these datasets reflect real-world adversarial attacks based on three defining properties: being driven by ulterior intent, well-crafted, and out-of-distribution. We find that these datasets overrely […]

ELMoE-3D: Leveraging Intrinsic Elasticity of MoE for Hybrid-Bonding-Enabled Self-Speculative Decoding in On-Premises Serving

arXiv:2604.14626v2 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) models have become the dominant architecture for large-scale language models, yet on-premises serving remains fundamentally memory-bound as batching turns sparse per-token compute into dense memory activation. Memory-centric architectures (PIM, NMP) improve bandwidth but leave compute underutilized under MoE’s low arithmetic intensity at high batch sizes. Speculative decoding (SD) […]

MOMO: A framework for seamless physical, verbal, and graphical robot skill learning and adaptation

arXiv:2604.20468v2 Announce Type: replace-cross Abstract: Industrial robot applications require increasingly flexible systems that non-expert users can easily adapt for varying tasks and environments. However, different adaptations benefit from different interaction modalities. We present an interactive framework that enables robot skill adaptation through three complementary modalities: kinesthetic touch for precise spatial corrections, natural language for high-level […]

Pre-trained LLMs Meet Sequential Recommenders: Efficient User-Centric Knowledge Distillation

arXiv:2604.21536v1 Announce Type: cross Abstract: Sequential recommender systems have achieved significant success in modeling temporal user behavior but remain limited in capturing rich user semantics beyond interaction patterns. Large Language Models (LLMs) present opportunities to enhance user understanding with their reasoning capabilities, yet existing integration approaches create prohibitive inference costs in real time. To address […]

Promoting Simple Agents: Ensemble Methods for Event-Log Prediction

arXiv:2604.21629v1 Announce Type: cross Abstract: We compare lightweight automata-based models (n-grams) with neural architectures (LSTM, Transformer) for next-activity prediction in streaming event logs. Experiments on synthetic patterns and five real-world process mining datasets show that n-grams with appropriate context windows achieve comparable accuracy to neural models while requiring substantially fewer resources. Unlike windowed neural architectures, […]

Stealthy Backdoor Attacks against LLMs Based on Natural Style Triggers

arXiv:2604.21700v1 Announce Type: cross Abstract: The growing application of large language models (LLMs) in safety-critical domains has raised urgent concerns about their security. Many recent studies have demonstrated the feasibility of backdoor attacks against LLMs. However, existing methods suffer from three key shortcomings: explicit trigger patterns that compromise naturalness, unreliable injection of attacker-specified payloads in […]

Quotient-Space Diffusion Models

arXiv:2604.21809v1 Announce Type: cross Abstract: Diffusion-based generative models have reformed generative AI, and have enabled new capabilities in the science domain, for example, generating 3D structures of molecules. Due to the intrinsic problem structure of certain tasks, there is often a symmetry in the system, which identifies objects that can be converted by a group […]

TingIS: Real-time Risk Event Discovery from Noisy Customer Incidents at Enterprise Scale

arXiv:2604.21889v1 Announce Type: cross Abstract: Real-time detection and mitigation of technical anomalies are critical for large-scale cloud-native services, where even minutes of downtime can result in massive financial losses and diminished user trust. While customer incidents serve as a vital signal for discovering risks missed by monitoring, extracting actionable intelligence from this data remains challenging […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number 16808844.