arXiv:2601.19657v1 Announce Type: cross Abstract: Diffusion Language Models (DLMs) have emerged as a compelling alternative to autoregressive approaches, enabling parallel text generation with competitive performance. Despite these advantages, there is a critical instability in DLMs: the moving sink phenomenon. Our analysis indicates that sink tokens exhibit low-norm representations in the Transformer’s value space, and that […]
Toward Learning POMDPs Beyond Full-Rank Actions and State Observability
arXiv:2601.18930v1 Announce Type: cross Abstract: We are interested in enabling autonomous agents to learn and reason about systems with hidden states, such as furniture with hidden locking mechanisms. We cast this problem as learning the parameters of a discrete Partially Observable Markov Decision Process (POMDP). The agent begins with knowledge of the POMDP’s actions and […]
Beyond the Prompt: An Empirical Study of Cursor Rules
arXiv:2512.18925v2 Announce Type: replace-cross Abstract: While Large Language Models (LLMs) have demonstrated remarkable capabilities, research shows that their effectiveness depends not only on explicit prompts but also on the broader context provided. This requirement is especially pronounced in software engineering, where the goals, architecture, and collaborative conventions of an existing project play critical roles in […]
Benchmarks Saturate When The Model Gets Smarter Than The Judge
arXiv:2601.19532v1 Announce Type: new Abstract: Benchmarks are important tools to track progress in the development of Large Language Models (LLMs), yet inaccuracies in datasets and evaluation methods consistently undermine their effectiveness. Here, we present Omni-MATH-2, a manually revised version of the Omni-MATH dataset comprising a clean, exact-answer subset ($n=4181$) and a tagged, non-standard subset ($n=247$). […]
Robustness of Constraint Automata for Description Logics with Concrete Domains
arXiv:2601.19644v1 Announce Type: cross Abstract: Decidability or complexity issues about the consistency problem for description logics with concrete domains have already been analysed with tableaux-based or type elimination methods. Concrete domains in ontologies are essential to consider concrete objects and predefined relations. In this work, we expose an automata-based approach leading to the optimal upper […]
Long-term evolution of regulatory DNA sequences. Part 1: Simulations on global, biophysically-realistic genotype-phenotype maps
arXiv:2601.19681v1 Announce Type: new Abstract: Promoters and enhancers are cis-regulatory elements (CREs), DNA sequences that bind transcription factor (TF) proteins to up- or down-regulate target genes. Decades-long efforts yielded TF-DNA interaction models that predict how strongly an individual TF binds arbitrary DNA sequences and how individual binding events on the CRE combine to affect gene […]
Coupled Variational Reinforcement Learning for Language Model General Reasoning
arXiv:2512.12576v2 Announce Type: replace-cross Abstract: While reinforcement learning has achieved impressive progress in language model reasoning, it is constrained by the requirement for verifiable rewards. Recent verifier-free RL methods address this limitation by utilizing the probabilities that LLMs generate reference answers as reward signals. However, these approaches typically sample reasoning traces conditioned only on the […]
An Interpretable Recommendation Model for Psychometric Data, With an Application to Gerontological Primary Care
arXiv:2601.19824v1 Announce Type: new Abstract: There are challenges that must be overcome to make recommender systems useful in healthcare settings. The reasons are varied: the lack of publicly available clinical data, the difficulty that users may have in understanding the reasons why a recommendation was made, the risks that may be involved in following that […]
Attention-Aided MMSE for OFDM Channel Estimation: Learning Linear Filters with Attention
arXiv:2506.00452v4 Announce Type: replace-cross Abstract: In orthogonal frequency division multiplexing (OFDM), accurate channel estimation is crucial. Classical signal processing-based approaches, such as linear minimum mean-squared error (LMMSE) estimation, often require second-order statistics that are difficult to obtain in practice. Recent deep neural network (DNN)-based methods have been introduced to address this; yet they often suffer […]
Artificial Neural Network in Cosmic Landscape
arXiv:1707.02800v2 Announce Type: cross Abstract: In this paper we propose that artificial neural networks, the basis of machine learning, are useful for generating the inflationary landscape from a cosmological point of view. Traditional numerical simulations of a global cosmic landscape typically have exponential complexity when the number of fields is large. However, a basic […]
LangForce: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries
arXiv:2601.15197v4 Announce Type: replace Abstract: Vision-Language-Action (VLA) models have shown promise in robot manipulation but often struggle to generalize to new instructions or complex multi-task scenarios. We identify a critical pathology in current training paradigms where goal-driven data collection creates a dataset bias. In such datasets, language instructions are highly predictable from visual observations alone, […]
Lossy Image Compression — A Frequent Sequence Mining perspective employing efficient Clustering
arXiv:2601.18821v1 Announce Type: cross Abstract: This work explores the scope of Frequent Sequence Mining in the domain of Lossy Image Compression. The proposed work is based on the idea of clustering pixels and using the cluster identifiers in the compression. The DCT phase in JPEG is replaced with a combination of closed frequent sequence mining […]