One Token Is Enough: Improving Diffusion Language Models with a Sink Token

arXiv:2601.19657v1 Announce Type: cross Abstract: Diffusion Language Models (DLMs) have emerged as a compelling alternative to autoregressive approaches, enabling parallel text generation with competitive performance. Despite these advantages, there is a critical instability in DLMs: the moving sink phenomenon. Our analysis indicates that sink tokens exhibit low-norm representations in the Transformer’s value space, and that […]

Toward Learning POMDPs Beyond Full-Rank Actions and State Observability

arXiv:2601.18930v1 Announce Type: cross Abstract: We are interested in enabling autonomous agents to learn and reason about systems with hidden states, such as furniture with hidden locking mechanisms. We cast this problem as learning the parameters of a discrete Partially Observable Markov Decision Process (POMDP). The agent begins with knowledge of the POMDP’s actions and […]

Beyond the Prompt: An Empirical Study of Cursor Rules

arXiv:2512.18925v2 Announce Type: replace-cross Abstract: While Large Language Models (LLMs) have demonstrated remarkable capabilities, research shows that their effectiveness depends not only on explicit prompts but also on the broader context provided. This requirement is especially pronounced in software engineering, where the goals, architecture, and collaborative conventions of an existing project play critical roles in […]

Benchmarks Saturate When The Model Gets Smarter Than The Judge

arXiv:2601.19532v1 Announce Type: new Abstract: Benchmarks are important tools to track progress in the development of Large Language Models (LLMs), yet inaccuracies in datasets and evaluation methods consistently undermine their effectiveness. Here, we present Omni-MATH-2, a manually revised version of the Omni-MATH dataset comprising a clean, exact-answer subset ($n=4181$) and a tagged, non-standard subset ($n=247$). […]

Robustness of Constraint Automata for Description Logics with Concrete Domains

arXiv:2601.19644v1 Announce Type: cross Abstract: Decidability or complexity issues about the consistency problem for description logics with concrete domains have already been analysed with tableaux-based or type elimination methods. Concrete domains in ontologies are essential to consider concrete objects and predefined relations. In this work, we expose an automata-based approach leading to the optimal upper […]

Long-term evolution of regulatory DNA sequences. Part 1: Simulations on global, biophysically-realistic genotype-phenotype maps

arXiv:2601.19681v1 Announce Type: new Abstract: Promoters and enhancers are cis-regulatory elements (CREs), DNA sequences that bind transcription factor (TF) proteins to up- or down-regulate target genes. Decades-long efforts yielded TF-DNA interaction models that predict how strongly an individual TF binds arbitrary DNA sequences and how individual binding events on the CRE combine to affect gene […]

Coupled Variational Reinforcement Learning for Language Model General Reasoning

arXiv:2512.12576v2 Announce Type: replace-cross Abstract: While reinforcement learning has achieved impressive progress in language model reasoning, it is constrained by the requirement for verifiable rewards. Recent verifier-free RL methods address this limitation by utilizing the probabilities that LLMs generate reference answers as reward signals. However, these approaches typically sample reasoning traces conditioned only on the […]

Attention-Aided MMSE for OFDM Channel Estimation: Learning Linear Filters with Attention

arXiv:2506.00452v4 Announce Type: replace-cross Abstract: In orthogonal frequency division multiplexing (OFDM), accurate channel estimation is crucial. Classical signal processing-based approaches, such as linear minimum mean-squared error (LMMSE) estimation, often require second-order statistics that are difficult to obtain in practice. Recent deep neural network (DNN)-based methods have been introduced to address this; yet they often suffer […]

Artificial Neural Network in Cosmic Landscape

arXiv:1707.02800v2 Announce Type: cross Abstract: In this paper we propose that artificial neural network, the basis of machine learning, is useful to generate the inflationary landscape from a cosmological point of view. Traditional numerical simulations of a global cosmic landscape typically need an exponential complexity when the number of fields is large. However, a basic […]

LangForce: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries

arXiv:2601.15197v4 Announce Type: replace Abstract: Vision-Language-Action (VLA) models have shown promise in robot manipulation but often struggle to generalize to new instructions or complex multi-task scenarios. We identify a critical pathology in current training paradigms where goal-driven data collection creates a dataset bias. In such datasets, language instructions are highly predictable from visual observations alone, […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registeration number 16808844