arXiv:2605.27365v2 Announce Type: replace-cross Abstract: Vision-language models (VLMs) commonly formulate visual grounding and detection as a coordinate-token generation problem, serializing each 2D box into multiple 1D tokens that are learned and decoded largely independently. This token-by-token decoding mismatches the coupled structure of box geometry and creates a practical inference bottleneck due to strictly sequential generation. […]
Identifying and Understanding Human Values in Text: A Tailorable LLM-based Architecture
arXiv:2605.27373v1 Announce Type: new Abstract: As intelligent systems become more autonomous, the scientific community focuses on creating decision-making mechanisms that include ethical and moral considerations, unlike traditional utility-maximisation models. To achieve this, a key aspect is assessing how well these decisions align with human values. To this end, a promising line of research is centred […]
GradientStabilizer: Fix the Norm, Not the Gradient
arXiv:2502.17055v4 Announce Type: replace-cross Abstract: Training instability in modern deep learning systems is frequently triggered by rare but extreme gradient-norm spikes, which can induce oversized parameter updates, corrupt optimizer state, and lead to slow recovery or divergence. Widely used safeguards such as gradient clipping mitigate these failures but require threshold tuning and indiscriminately truncate large […]
Unified Panoramic Geometry Estimation via Multi-View Foundation Models
arXiv:2605.26368v2 Announce Type: replace-cross Abstract: Geometry estimation from perspective images has greatly advanced, maturing to the point where off-the-shelf foundation models are able to reconstruct 3D scene structure not only from multi-view imagery, but even from a single view. A natural extension is 3D reconstruction from panoramas, with the exciting prospect of recovering a full […]
Snowveil: A Framework for Decentralised Preference Discovery
arXiv:2512.18444v2 Announce Type: replace-cross Abstract: Aggregating subjective preferences in social choice traditionally assumes a trusted central authority. In contrast, this paper formalises Decentralised Preference Discovery (DPD): the reliable identification of a social choice parameter (e.g. the canonical outcome of an aggregation rule applied to the global preference profile) under conditions of partial information, asynchronous interaction, […]
An Evolutionary Approach for Designing Stable and Highly Expressible Low-Immunogenicity Therapeutic mRNA Sequences
arXiv:2605.27986v1 Announce Type: cross Abstract: Messenger RNA (mRNA) sequences as therapeutics require optimized design to ensure efficient translation, structural stability, and minimal immunogenicity. This study presents a two-stage in-silico framework that integrates deep learning and evolutionary computation for rational mRNA optimization instead of existing state-of-the-art models. In the first stage, a pretrained CodonTransformer (BERT-like Large […]
Adapting, Fast and Slow: On Few-Shot Transportability of Compositions
arXiv:2512.22777v2 Announce Type: replace-cross Abstract: Generalization across domains requires stable structure that links the source and target distributions. Building on causal transportability theory, we study a sequential prediction setting in which the target predictor can be represented as a circuit composed of causal mechanisms that are learnable from source data. We introduce two classes of […]
Max-Window Scale Estimation for Near-Lossless HiF8 W8A8 Quantization-Aware Training
arXiv:2605.26189v2 Announce Type: replace-cross Abstract: Quantization-aware training (QAT) with low-bit floating-point formats enables efficient LLM deployment, yet introduces subtle failure modes invisible to standard training metrics. We present a systematic study of HiF8 W8A8 QAT for OpenPangu-Embedded-1B through the lens of Delayed Tensor Scaling (DTS). Across eight controlled experiments, we identify and disentangle two orthogonal […]
Quantifying the Reconstructability of Astrophysical Methods with Large Language Models and Information Theory: A Case Study in Spectral Reconstruction
arXiv:2605.11154v2 Announce Type: replace-cross Abstract: Modern astrophysical studies rely heavily on complex data analysis pipelines; however, published descriptions often lack the detail required for computational reproducibility. In this work, we present an information-theoretic framework to quantify how effectively a method can be reconstructed from its written description. By treating algorithmic reconstruction as a probability distribution […]
KVoiceBench, KOpenAudioBench, and KMMAU: Agent-Driven Korean Speech Benchmarks for Evaluating SpeechLMs
arXiv:2605.27984v1 Announce Type: cross Abstract: Speech language models (SpeechLMs) have achieved substantial progress by extending large language models (LLMs) to the speech modality. However, SpeechLM evaluation remains heavily centered on English, limiting reliable assessment of multilingual speech capabilities. Straightforward benchmark transfer through ASR, translation, normalization, and TTS can corrupt language-specific instructions, answer constraints, and spoken […]
EigeNet: Geometry-Informed Multi-Modal Learning for Few-shot Novel View RIR Prediction
arXiv:2605.28101v1 Announce Type: cross Abstract: Predicting spatially varying Room Impulse Response (RIR) from sparse observations is a critical but highly challenging inverse problem for immersive spatial audio rendering. In this work, we present EIGENET, a geometry-informed multi-modal framework for few-shot novel view RIR prediction. At its core is a Cross-view Alternate-attention Transformer that iteratively refines […]
Do Language Models Need Sleep? Offline Recurrence for Improved Online Inference
arXiv:2605.26099v2 Announce Type: replace-cross Abstract: Transformer-based large language models are increasingly used for long-horizon tasks; however, their attention mechanism scales poorly with context length. To handle this, we study a sleep-like consolidation mechanism in which a model periodically converts recent context into persistent fast weights before clearing its key-value cache. During sleep, the model performs […]