Para-B&B: Load-Balanced Deterministic Parallelization of Solving MIP

arXiv:2604.09556v1 Announce Type: cross Abstract: Mixed-integer programming (MIP) extends linear programming by incorporating both continuous and integer decision variables, making it widely used in production planning, logistics scheduling, and resource allocation. However, MIP remains NP-hard and cannot generally be solved to optimality in polynomial time. Branch-and-bound, a fundamental exact method, faces significant parallelization challenges due […]

ACE-TA: An Agentic Teaching Assistant for Grounded Q&A, Quiz Generation, and Code Tutoring

arXiv:2604.09572v1 Announce Type: cross Abstract: We introduce ACE-TA, the Agentic Coding and Explanations Teaching Assistant framework that autonomously routes conceptual queries drawn from programming course material to grounded Q&A, stepwise coding guidance, and automated quiz generation using pre-trained Large Language Models (LLMs). ACE-TA consists of three coordinated modules: a retrieval-grounded conceptual Q&A system that […]

MERMAID: Memory-Enhanced Retrieval and Reasoning with Multi-Agent Iterative Knowledge Grounding for Veracity Assessment

arXiv:2601.22361v2 Announce Type: replace-cross Abstract: Assessing the veracity of online content has become increasingly critical. Large language models (LLMs) have recently enabled substantial progress in automated veracity assessment, including automated fact-checking and claim verification systems. Typical veracity assessment pipelines break down complex claims into sub-claims, retrieve external evidence, and then apply LLM reasoning to assess […]

Bringing Value Models Back: Generative Critics for Value Modeling in LLM Reinforcement Learning

arXiv:2604.10701v1 Announce Type: cross Abstract: Credit assignment is a central challenge in reinforcement learning (RL). Classical actor-critic methods address this challenge through fine-grained advantage estimation based on a learned value function. However, learned value models are often avoided in modern large language model (LLM) RL because conventional discriminative critics are difficult to train reliably. We […]

Context Kubernetes: Declarative Orchestration of Enterprise Knowledge for Agentic AI Systems

arXiv:2604.11623v1 Announce Type: new Abstract: We introduce Context Kubernetes, an architecture for orchestrating enterprise knowledge in agentic AI systems, with a prototype implementation and eight experiments. The core observation is that delivering the right knowledge, to the right agent, with the right permissions, at the right freshness — across an entire organization — is structurally […]

Beyond A Fixed Seal: Adaptive Stealing Watermark in Large Language Models

arXiv:2604.10893v1 Announce Type: cross Abstract: Watermarking provides a critical safeguard for large language model (LLM) services by facilitating the detection of LLM-generated text. Correspondingly, stealing watermark algorithms (SWAs) derive watermark information from watermarked texts generated by victim LLMs to craft highly targeted adversarial attacks, which compromise the reliability of watermarks. Existing SWAs rely on fixed […]

Many-Tier Instruction Hierarchy in LLM Agents

arXiv:2604.09443v2 Announce Type: replace-cross Abstract: Large language model agents receive instructions from many sources (system messages, user prompts, tool outputs, other agents, and more), each carrying different levels of trust and authority. When these instructions conflict, agents must reliably follow the highest-privilege instruction to remain safe and effective. The dominant paradigm, instruction hierarchy (IH), assumes a fixed, […]

LABBench2: An Improved Benchmark for AI Systems Performing Biology Research

arXiv:2604.09554v1 Announce Type: new Abstract: Optimism for accelerating scientific discovery with AI continues to grow. Current applications of AI in scientific research range from training dedicated foundation models on scientific data to agentic autonomous hypothesis generation systems to AI-driven autonomous labs. The need to measure progress of AI systems in scientific domains correspondingly must not […]

DarwinNet: An Evolutionary Network Architecture for Agent-Driven Protocol Synthesis

arXiv:2604.01236v2 Announce Type: replace-cross Abstract: Traditional network architectures suffer from severe protocol ossification and structural fragility due to their reliance on static, human-defined rules that fail to adapt to the emergent edge cases and probabilistic reasoning of modern autonomous agents. To address these limitations, this paper proposes DarwinNet, a bio-inspired, self-evolving network architecture that transitions […]

GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents

arXiv:2603.24329v2 Announce Type: replace-cross Abstract: Multimodal LLMs are increasingly deployed as perceptual backbones for autonomous agents in 3D environments, from robotics to virtual worlds. These applications require agents to perceive rapid state changes, attribute actions to the correct entities, and reason about concurrent multi-agent behaviors from a first-person perspective, capabilities that existing benchmarks do not […]

FACT-E: Causality-Inspired Evaluation for Trustworthy Chain-of-Thought Reasoning

arXiv:2604.10693v1 Announce Type: new Abstract: Chain-of-Thought (CoT) prompting has improved LLM reasoning, but models often generate explanations that appear coherent while containing unfaithful intermediate steps. Existing self-evaluation approaches are prone to inherent biases: the model may confidently endorse coherence even when the step-to-step implication is not valid, leading to unreliable faithfulness evaluation. We propose FACT-E, […]


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.