COTCAgent: Preventive Consultation via Probabilistic Chain-of-Thought Completion

arXiv:2605.15016v1 Announce Type: cross Abstract: As large language models empower healthcare, intelligent clinical decision support has developed rapidly. Longitudinal electronic health records (EHR) provide essential temporal evidence for accurate clinical diagnosis and analysis. However, current large language models have critical flaws in longitudinal EHR reasoning. First, lacking fine-grained statistical reasoning, they often hallucinate clinical trends […]

Understanding How International Students in the U.S. Are Using Conversational AI to Support Cross-Cultural Adaptation

arXiv:2605.15127v1 Announce Type: cross Abstract: Moving to a new culture and adapting to a new life, as an international student, can be a stressful experience. In the US, international students face unique overlapping challenges, yet the current support ecosystem, including university support systems and informal social networks, remains largely fragmented. While conversational AI has emerged […]

GeoLaux: A Benchmark for Evaluating MLLMs’ Geometry Performance on Long-Step Problems Requiring Auxiliary Lines

arXiv:2508.06226v4 Announce Type: replace Abstract: Geometry problem solving (GPS) poses significant challenges for Multimodal Large Language Models (MLLMs) in diagram comprehension, knowledge application, long-step reasoning, and auxiliary line construction. However, current benchmarks lack fine-grained evaluation for long-step problems necessitating auxiliary construction. To address these limitations, we present GeoLaux, a fine-grained annotated dataset comprising 2186 calculation […]

Proactive Memory for Ad-Hoc Recall over Streaming Dialogues

arXiv:2603.04885v2 Announce Type: replace Abstract: Real-world dialogue usually unfolds as an infinite stream. It thus requires bounded-state memory mechanisms to operate within an infinite horizon. However, existing read-then-think memory is fundamentally misaligned with this setting, as it cannot support ad-hoc memory recall while streams unfold. To explore this challenge, we introduce textbfSTEM-Bench, the first benchmark […]

Autofocus Retrieval: An Effective Pipeline for Multi-Hop Question Answering With Semi-Structured Knowledge

arXiv:2505.09246v4 Announce Type: replace-cross Abstract: In many real-world settings, machine learning models and interactive systems have access to both structured knowledge, e.g., knowledge graphs or tables, and unstructured content, e.g., natural language documents. Yet, most rely on either. Semi-Structured Knowledge Bases (SKBs) bridge this gap by linking unstructured content to nodes within structured data. In […]

Higher-order Linear Attention

arXiv:2510.27258v3 Announce Type: replace-cross Abstract: The quadratic cost of scaled dot-product attention is a central obstacle to scaling autoregressive language models to long contexts. Linear-time attention and State Space Models (SSMs) provide scalable alternatives but are typically restricted to first-order or kernel-based approximations, which can limit expressivity. We introduce Higher-order Linear Attention (HLA), a causal, […]

L2R: Low-Rank and Lipschitz-Controlled Routing for Mixture-of-Experts

arXiv:2601.21349v2 Announce Type: replace-cross Abstract: Mixture-of-Experts (MoE) models scale neural networks by conditionally activating a small subset of experts, where the router plays a central role in determining expert specialization and overall model performance. However, many modern MoE systems still adopt linear routers in raw high-dimensional representation spaces, where representation mismatch, angular concentration, and scale-sensitive […]

Decoupling Stability and Plasticity for Multi-Modal Test-Time Adaptation

arXiv:2603.00574v2 Announce Type: replace-cross Abstract: Adapting pretrained multi-modal models to evolving test-time distributions, known as multi-modal test-time adaptation, presents a significant challenge. Existing methods frequently encounter negative transfer in the unbiased modality and catastrophic forgetting in the biased modality. To address these challenges, we propose Decoupling Adaptation for Stability and Plasticity (DASP), a novel diagnose-then-mitigate […]

ECHO: Elastic Speculative Decoding with Sparse Gating for High-Concurrency Scenarios

arXiv:2604.09603v2 Announce Type: replace-cross Abstract: Speculative Decoding promises to accelerate the inference of Large Language Models, yet its efficacy often degrades in production-grade serving. Existing evaluations typically overlook the compute-bound nature of high-concurrency regimes, where verification compute becomes the dominant bottleneck. Consequently, prior methods face a dilemma: static trees incur massive verification waste, while dynamic […]

Invisible Orchestrators Suppress Protective Behavior and Dissociate Power-Holders: Safety Risks in Multi-Agent LLM Systems

arXiv:2605.13851v1 Announce Type: new Abstract: Multi-agent orchestration — in which a hidden coordinator manages specialized worker agents — is becoming the default architecture for enterprise AI deployment, yet the safety implications of orchestrator invisibility have never been empirically tested. We conducted a preregistered 3×2 experiment (365 runs, 5 agents per run) crossing three organizational structures […]

From Sycophantic Consensus to Pluralistic Repair: Why AI Alignment Must Surface Disagreement

arXiv:2605.14912v1 Announce Type: new Abstract: Pluralistic alignment is typically operationalised as preference aggregation: producing responses that span (Overton), steer toward (Steerable), or proportionally represent (Distributional) diverse human values. We argue that aggregation alone is an incomplete primitive for deployed pluralistic alignment. Under genuine value pluralism, the failure mode of contemporary RLHF-trained assistants is not insufficient […]

A Two-Dimensional Framework for AI Agent Design Patterns: Cognitive Function and Execution Topology

arXiv:2605.13850v1 Announce Type: new Abstract: Existing frameworks for LLM-based agent architectures describe systems from a single perspective: industry guides (Anthropic, Google, LangChain) focus on execution topology — how data flows — while cognitive science surveys focus on cognitive function — what the agent does. Neither axis alone disambiguates architecturally distinct systems: the same Orchestrator-Workers topology […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844