ActorMind: Emulating Human Actor Reasoning for Speech Role-Playing

arXiv:2604.11103v2 Announce Type: replace-cross Abstract: Role-playing has garnered rising attention as it provides a strong foundation for human-machine interaction and facilitates sociological research. However, current work is confined to textual modalities, neglecting speech, which plays a predominant role in daily life, thus limiting genuine role-playing. To bridge this gap, we conceptualize and benchmark speech role-playing […]

Beyond Linearity in Attention Projections: The Case for Nonlinear Queries

arXiv:2603.13381v2 Announce Type: replace-cross Abstract: Recent algebraic analysis shows that in decoder-only and encoder-only transformers, the Query projection $W_Q$ may be set to identity without noticeable performance deterioration. This is possible because attention depends on $X$ only through the products $XW_Q, XW_K, XW_V$, allowing basis transformations to be absorbed by adjacent layers and propagated through […]

Rethinking Retrieval-Augmented Generation as a Cooperative Decision-Making Problem

arXiv:2602.18734v2 Announce Type: replace-cross Abstract: Retrieval-Augmented Generation (RAG) has demonstrated strong effectiveness in knowledge-intensive tasks by grounding language generation in external evidence. Despite its success, many existing RAG systems are built based on a ranking-centric, asymmetric dependency paradigm, where the generation quality of the generator is highly dependent on reranking results of the reranker. To […]

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale

arXiv:2604.00813v3 Announce Type: replace-cross Abstract: End-to-end autonomous driving has evolved from the conventional paradigm based on sparse perception into vision-language-action (VLA) models, which focus on learning language descriptions as an auxiliary task to facilitate planning. In this paper, we propose an alternative Vision-Geometry-Action (VGA) paradigm that advocates dense 3D geometry as the critical cue for […]

Bolzano: Case Studies in LLM-Assisted Mathematical Research

arXiv:2604.16989v2 Announce Type: replace-cross Abstract: We report new results on eight problems in mathematics and theoretical computer science, produced with the assistance of Bolzano, an open-source multi-agent LLM system. Bolzano orchestrates rounds of interaction between parallel prover agents and a verifier agent while maintaining a persistent knowledge base that is carried across rounds. Classified using […]

Universal Transformers Need Memory: Depth-State Trade-offs in Adaptive Recursive Reasoning

arXiv:2604.21999v2 Announce Type: cross Abstract: We study learned memory tokens as computational scratchpad for a single-block Universal Transformer (UT) with Adaptive Computation Time (ACT) on Sudoku-Extreme, a combinatorial reasoning benchmark. We find that memory tokens are empirically necessary: across all configurations tested — 3 seeds, multiple token counts, two initialization schemes, ACT and fixed-depth processing […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844