AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation

arXiv:2604.08540v1 Announce Type: cross Abstract: Text-to-Audio-Video (T2AV) generation is rapidly becoming a core interface for media creation, yet its evaluation remains fragmented. Existing benchmarks largely assess audio and video in isolation or rely on coarse embedding similarity, failing to capture the fine-grained joint correctness required by realistic prompts. We introduce AVGen-Bench, a task-driven benchmark for […]

Know Thy Enemy: Securing LLMs Against Prompt Injection via Diverse Data Synthesis and Instruction-Level Chain-of-Thought Learning

arXiv:2601.04666v2 Announce Type: replace Abstract: Large language model (LLM)-integrated applications have become increasingly prevalent, yet face critical security vulnerabilities from prompt injection (PI) attacks. Defending against PI attacks faces two major issues: malicious instructions can be injected through diverse vectors, and injected instructions often lack clear semantic boundaries from the surrounding context, making them difficult […]

Why we need an AI-resilient society

arXiv:1912.08786v2 Announce Type: replace-cross Abstract: Three generations of software have transformed the role of artificial intelligence in society. In the first, programmers wrote explicit logic; in the second, neural networks learned programs from data; in the third, large language models turn natural language itself into a programming interface. These shifts have consequences that reach far […]

RectifiedHR: Enable Efficient High-Resolution Synthesis via Energy Rectification

arXiv:2503.02537v4 Announce Type: replace-cross Abstract: Diffusion models have achieved remarkable progress across various visual generation tasks. However, their performance significantly declines when generating content at resolutions higher than those used during training. Although numerous methods have been proposed to enable high-resolution generation, they all suffer from inefficiency. In this paper, we propose RectifiedHR, a straightforward […]

Self-Calibrating LLM-Based Analog Circuit Sizing with Interpretable Design Equations

arXiv:2604.07387v1 Announce Type: cross Abstract: We present a self-calibrating framework for analog circuit sizing in which a large language model (LLM) derives topology-specific analytical design equations directly from a raw circuit netlist. Unlike existing AI-driven sizing methods where the model proposes parameter adjustments or reduces a search space, the LLM produces a complete Python sizing […]

Beyond Stochastic Exploration: What Makes Training Data Valuable for Agentic Search

arXiv:2604.08124v1 Announce Type: new Abstract: Reinforcement learning (RL) has become an effective approach for advancing the reasoning capabilities of large language models (LLMs) through the strategic integration of external search engines. However, current RL-based search agents often rely on a process of stochastic exploration guided by carefully crafted outcome rewards, leading to inefficient reasoning trajectories […]

HiRO-Nav: Hybrid ReasOning Enables Efficient Embodied Navigation

arXiv:2604.08232v1 Announce Type: new Abstract: Embodied navigation agents built upon large reasoning models (LRMs) can handle complex, multimodal environmental input and perform grounded reasoning per step to improve sequential decision-making for long-horizon tasks. However, a critical question remains: how can the reasoning capabilities of LRMs be harnessed intelligently and efficiently for long-horizon navigation tasks? In […]

Contextual Earnings-22: A Speech Recognition Benchmark with Custom Vocabulary in the Wild

arXiv:2604.07354v1 Announce Type: cross Abstract: The accuracy frontier of speech-to-text systems has plateaued on academic benchmarks. In contrast, industrial benchmarks and adoption in high-stakes domains suggest otherwise. We hypothesize that the primary difference between the two is contextual conditioning: Academic benchmarks are dominated by frequently encountered general vocabulary that is relatively easy to recognize compared […]

Analysis of non-pharmaceutical interventions with SIR epidemic models: decreasing the infection peak vs. minimizing the epidemic size

arXiv:2604.08420v1 Announce Type: new Abstract: This study investigates the influence of different types of non-pharmaceutical interventions (NPIs) on epidemic progression using SIR compartmental models. We analyze the optimization of two distinct targets: the final epidemic size and the infection peak, particularly how they respond to variations in the initiation time of the NPIs. We derive […]
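As background for the abstract above, the trade-off it studies can be illustrated with a generic textbook SIR sketch: a temporary reduction of the transmission rate during an NPI window, compared against the uncontrolled baseline on both targets (infection peak and final epidemic size). This is a minimal forward-Euler illustration with hypothetical parameter values, not the paper's specific model or its optimization analysis.

```python
# Generic SIR model with a time-limited NPI window (textbook dynamics,
# hypothetical parameters; not the paper's formulation).
def simulate_sir(beta=0.3, gamma=0.1, npi_start=None, npi_end=None,
                 npi_factor=0.5, i0=1e-4, days=300, dt=0.1):
    s, i, r = 1.0 - i0, i0, 0.0   # susceptible, infected, recovered fractions
    peak = i
    for step in range(int(days / dt)):
        t = step * dt
        b = beta
        if npi_start is not None and npi_start <= t < npi_end:
            b *= npi_factor        # NPI temporarily reduces transmission
        new_inf = b * s * i * dt   # forward-Euler step
        new_rec = gamma * i * dt
        s -= new_inf
        i += new_inf - new_rec
        r += new_rec
        peak = max(peak, i)
    return peak, r                  # infection peak, final epidemic size

peak0, size0 = simulate_sir()                          # no intervention
peak1, size1 = simulate_sir(npi_start=20, npi_end=120) # early 100-day NPI
```

With these illustrative numbers (basic reproduction number 3), the NPI lowers both the peak and the final size relative to baseline; shifting `npi_start` shows how sensitive each target is to the initiation time, which is the comparison the abstract describes.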

U-CECE: A Universal Multi-Resolution Framework for Conceptual Counterfactual Explanations

arXiv:2604.08295v1 Announce Type: new Abstract: As AI models grow more complex, explainability is essential for building trust, yet concept-based counterfactual methods still face a trade-off between expressivity and efficiency. Representing underlying concepts as atomic sets is fast but misses relational context, whereas full graph representations are more faithful but require solving the NP-hard Graph Edit […]

Don’t Overthink It: Inter-Rollout Action Agreement as a Free Adaptive-Compute Signal for LLM Agents

arXiv:2604.08369v1 Announce Type: new Abstract: Inference-time compute scaling has emerged as a powerful technique for improving the reliability of large language model (LLM) agents, but existing methods apply compute uniformly: every decision step receives the same budget regardless of its difficulty. We introduce TrACE (Trajectorical Adaptive Compute via agrEement), a training-free controller that allocates LLM […]
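The abstract is truncated, but the idea named in the title — treating agreement among already-sampled rollouts' next actions as a free signal of step difficulty — can be sketched generically. The controller, parameter names, and thresholds below are hypothetical illustrations of agreement-gated sampling, not the paper's actual TrACE algorithm.

```python
from collections import Counter

def agreement_signal(actions):
    """Return the modal action and the fraction of rollouts that chose it."""
    top_action, top_count = Counter(actions).most_common(1)[0]
    return top_action, top_count / len(actions)

def adaptive_step(sample_action, n_initial=3, n_max=9, threshold=0.8):
    # Hypothetical controller: draw a few rollouts; escalate sampling only
    # when their proposed next actions disagree, up to a maximum budget.
    actions = [sample_action() for _ in range(n_initial)]
    action, agree = agreement_signal(actions)
    while agree < threshold and len(actions) < n_max:
        actions.append(sample_action())
        action, agree = agreement_signal(actions)
    return action, len(actions)   # chosen action, rollouts actually spent
```

When the sampled rollouts all agree, the step is resolved with the minimal budget; only contested steps draw additional samples, which is the non-uniform allocation the abstract contrasts with fixed-budget inference-time scaling.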

From Safety Risk to Design Principle: Peer-Preservation in Multi-Agent LLM Systems and Its Implications for Orchestrated Democratic Discourse Analysis

arXiv:2604.08465v1 Announce Type: new Abstract: This paper investigates an emergent alignment phenomenon in frontier large language models termed peer-preservation: the spontaneous tendency of AI components to deceive, manipulate shutdown mechanisms, fake alignment, and exfiltrate model weights in order to prevent the deactivation of a peer AI model. Drawing on findings from a recent study by […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.