arXiv:2603.08182v2 Announce Type: replace-cross Abstract: Large language models often underperform in many European languages due to the dominance of English and a few high-resource languages in training data. This paper presents TildeOpen LLM, a 30-billion-parameter open-weight foundational model trained for 34 European languages to promote linguistic equity and improve performance for low-resource languages. To address […]
Random Cloud: Finding Minimal Neural Architectures Without Training
arXiv:2604.26830v1 Announce Type: cross Abstract: I propose the Random Cloud method, a training-free approach to neural architecture search that discovers minimal feedforward network topologies through stochastic exploration and progressive structural reduction. Unlike post-training pruning methods that require a full train-prune-retrain cycle, this method evaluates randomly initialized networks without backpropagation, progressively reduces their topology, and only […]
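The abstract outlines the core loop: score randomly initialized networks with forward passes only, then progressively shrink the topology while the score holds up. A minimal sketch of that idea, assuming tanh hidden layers, best-of-k random initializations as the score, and width-halving as the reduction step (all illustrative choices, not the paper's):

```python
import numpy as np

def forward(x, weights):
    # Forward pass through a randomly initialized feedforward net (tanh hidden layers).
    h = x
    for W in weights[:-1]:
        h = np.tanh(h @ W)
    return h @ weights[-1]

def score(widths, X, y, rng, n_samples=8):
    # Score an architecture as the best accuracy over several random
    # initializations -- no backpropagation anywhere.
    dims = [X.shape[1]] + list(widths) + [int(y.max()) + 1]
    best = 0.0
    for _ in range(n_samples):
        weights = [rng.standard_normal((a, b)) / np.sqrt(a)
                   for a, b in zip(dims, dims[1:])]
        acc = float((forward(X, weights).argmax(1) == y).mean())
        best = max(best, acc)
    return best

def random_cloud(X, y, start_widths=(32, 32), tol=0.05, seed=0):
    # Progressively halve hidden widths, keeping a reduction whenever the
    # score stays within `tol` of the current architecture's score.
    rng = np.random.default_rng(seed)
    widths = list(start_widths)
    base = score(widths, X, y, rng)
    improved = True
    while improved:
        improved = False
        for i in range(len(widths)):
            if widths[i] <= 2:
                continue
            trial = widths.copy()
            trial[i] //= 2
            s = score(trial, X, y, rng)
            if s >= base - tol:
                widths, base, improved = trial, s, True
    return widths, base
```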
Training-Free Adaptation of New-Generation LLMs using Legacy Clinical Models
arXiv:2601.03423v3 Announce Type: replace-cross Abstract: Adapting language models to the clinical domain through continued pretraining and instruction tuning requires costly retraining for each new model generation. We propose Cross-Architecture Proxy Tuning (CAPT), a model-ensembling approach that enables training-free adaptation of state-of-the-art general-domain models using existing clinical models. CAPT supports models with disjoint vocabularies, leveraging contrastive […]
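Proxy tuning of the kind CAPT builds on steers a general model at decode time by adding the logit offset between a tuned expert and an untuned anti-expert. A minimal sketch; aligning disjoint vocabularies via a shared-token string intersection is an illustrative assumption here, not the paper's contrastive mechanism:

```python
import numpy as np

def proxy_tuned_logits(base_logits, expert_logits, anti_logits,
                       base_vocab, expert_vocab, alpha=1.0):
    # Shift the general model's next-token logits by the (expert - anti-expert)
    # offset from legacy clinical models. Tokens absent from the expert
    # vocabulary keep the base logits unchanged.
    expert_index = {tok: i for i, tok in enumerate(expert_vocab)}
    adjusted = base_logits.copy()
    for j, tok in enumerate(base_vocab):
        i = expert_index.get(tok)
        if i is not None:
            adjusted[j] += alpha * (expert_logits[i] - anti_logits[i])
    return adjusted
```

Sampling then proceeds from `adjusted` exactly as it would from the base model, so no weights of the new-generation model are touched.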
Causal Disentanglement for Full-Reference Image Quality Assessment
arXiv:2604.21654v2 Announce Type: replace-cross Abstract: Existing deep network-based full-reference image quality assessment (FR-IQA) models typically work by performing pairwise comparisons of deep features from the reference and distorted images. In this paper, we approach this problem from a different perspective and propose a novel FR-IQA paradigm based on causal inference and decoupled representation learning. Unlike […]
Integrating Weather Foundation Model and Satellite to Enable Fine-Grained Solar Irradiance Forecasting
arXiv:2603.14845v3 Announce Type: replace-cross Abstract: Accurate day-ahead solar irradiance forecasting is essential for integrating solar energy into the power grid. However, it remains challenging due to the pronounced diurnal cycle and inherently complex cloud dynamics. Current methods either lack fine-scale resolution (e.g., numerical weather prediction, weather foundation models) or degrade at longer lead times (e.g., […]
ViCrop-Det: Spatial Attention Entropy Guided Cropping for Training-Free Small-Object Detection
arXiv:2604.26806v1 Announce Type: cross Abstract: Transformer-based architectures have established a dominant paradigm in global semantic perception; however, they remain fundamentally constrained by the profound spatial heterogeneity inherent in natural images. Specifically, the imposition of a uniform global receptive field across regions of varying information density inevitably leads to local feature degradation, particularly in dense conflict […]
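The title suggests ranking candidate crops by the entropy of a spatial attention map and re-detecting inside the densest regions. A toy sketch under assumed simplifications (a uniform grid of candidate regions and plain top-k selection, neither claimed by the paper):

```python
import numpy as np

def attention_entropy(att, eps=1e-12):
    # Shannon entropy of a spatial attention map, normalized to a distribution.
    p = att / (att.sum() + eps)
    return float(-(p * np.log(p + eps)).sum())

def entropy_guided_crops(att_map, grid=2, top_k=2):
    # Split the map into a grid of candidate regions and return the regions
    # with the highest attention entropy -- a proxy for dense,
    # information-rich areas worth cropping for a second detection pass.
    H, W = att_map.shape
    hs, ws = H // grid, W // grid
    scored = []
    for r in range(grid):
        for c in range(grid):
            region = att_map[r*hs:(r+1)*hs, c*ws:(c+1)*ws]
            scored.append((attention_entropy(region), (r*hs, c*ws, hs, ws)))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [box for _, box in scored[:top_k]]
```

A region with attention spread evenly over many small objects scores higher than one dominated by a single peak, which is the property the cropping needs.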
Untrained CNNs Match Backpropagation at V1: A Systematic RSA Comparison of Four Learning Rules Against Human fMRI
arXiv:2604.16875v2 Announce Type: replace-cross Abstract: A central question in computational neuroscience is whether the learning rule used to train a neural network determines how well its internal representations align with those of the human visual cortex. We present a systematic comparison of four learning rules (backpropagation (BP), feedback alignment (FA), predictive coding (PC), and spike-timing-dependent […]
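RSA, the comparison tool named in the title, is standard: build a representational dissimilarity matrix (RDM) per system, then correlate their upper triangles. A minimal NumPy-only sketch (1 − Pearson dissimilarity and ordinal ranks are common defaults, assumed rather than taken from the paper):

```python
import numpy as np

def rdm(activations):
    # Representational dissimilarity matrix: 1 - Pearson correlation between
    # activation patterns (rows) for every pair of stimuli.
    return 1.0 - np.corrcoef(activations)

def rank(x):
    # Ordinal rank transform (no tie handling; adequate for this sketch).
    return np.argsort(np.argsort(x)).astype(float)

def rsa_score(rdm_a, rdm_b):
    # Spearman correlation between the upper triangles of two RDMs --
    # the second-order similarity used to compare a model layer to fMRI.
    iu = np.triu_indices_from(rdm_a, k=1)
    ra, rb = rank(rdm_a[iu]), rank(rdm_b[iu])
    return float(np.corrcoef(ra, rb)[0, 1])
```

Comparing each learning rule then reduces to `rsa_score(rdm(model_layer), rdm(fmri_roi))` per region of interest.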
Curiosity-Critic: Cumulative Prediction Error Improvement as a Tractable Intrinsic Reward for World Model Training
arXiv:2604.18701v2 Announce Type: replace-cross Abstract: Local prediction-error-based curiosity rewards focus on the current transition without considering the world model’s cumulative prediction error across all visited transitions. We introduce Curiosity-Critic, which grounds its intrinsic reward in the improvement of this cumulative objective, and show that it admits a tractable per-step surrogate: the difference between the current […]
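The abstract is truncated mid-definition, so the surrogate below is one plausible reading, stated as an assumption: reward the agent with the drop in the world model's prediction error on the current transition caused by a single model update.

```python
import numpy as np

class LinearWorldModel:
    # Tiny linear world model s' ~ W s, updated by one SGD step per transition.
    def __init__(self, dim, lr=0.1):
        self.W = np.zeros((dim, dim))
        self.lr = lr

    def error(self, s, s_next):
        return float(np.sum((s_next - self.W @ s) ** 2))

    def update(self, s, s_next):
        grad = -2.0 * np.outer(s_next - self.W @ s, s)
        self.W -= self.lr * grad

def intrinsic_reward(model, s, s_next):
    # Per-step surrogate for cumulative-error improvement (an assumed reading
    # of the truncated abstract): error before minus error after one update.
    before = model.error(s, s_next)
    model.update(s, s_next)
    after = model.error(s, s_next)
    return before - after
```

The reward is positive while the update still improves the model on that transition and decays toward zero as the transition becomes well predicted, which matches the intended curiosity behavior.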
Towards Unified Multi-task EEG Analysis with Low-Rank Adaptation
arXiv:2604.25131v2 Announce Type: replace-cross Abstract: Recent self-supervised pre-training methods for electroencephalogram (EEG) have shown promising results. However, the pre-trained models typically require full fine-tuning on each downstream task individually to achieve good performance. In practical applications involving multiple tasks, utilizing a separate model for each task is not ideal regarding computational and spatial cost. In […]
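Low-rank adaptation, named in the title, replaces per-task full fine-tuning with a small per-task update on a shared frozen backbone. A generic LoRA-layer sketch (the rank, scale, and layer placement here are illustrative, not the paper's configuration):

```python
import numpy as np

def lora_forward(x, W, A, B, scale=1.0):
    # Low-rank adapted linear layer: the frozen pretrained weight W is shared
    # across all EEG tasks, while each task owns a small low-rank update
    # B @ A (rank r << d). Multi-task deployment stores one backbone plus a
    # tiny (A, B) pair per task instead of a full model per task.
    # Shapes: x (n, d_in), W (d_out, d_in), A (r, d_in), B (d_out, r).
    return x @ (W + scale * (B @ A)).T

# Storage comparison for a 256 x 256 layer at rank r = 4:
# full fine-tune: 65,536 params per task; LoRA: 256*4 + 4*256 = 2,048.
```

Initializing `B` to zeros (a common LoRA convention) makes the adapted layer start out identical to the pretrained one.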
MemOVCD: Training-Free Open-Vocabulary Change Detection via Cross-Temporal Memory Reasoning and Global-Local Adaptive Rectification
arXiv:2604.26774v1 Announce Type: cross Abstract: Open-vocabulary change detection aims to identify semantic changes in bi-temporal remote sensing images without predefined categories. Recent methods combine foundation models such as SAM, DINO and CLIP, but typically process each timestamp independently or interact only at the final comparison stage. Such paradigms suffer from insufficient temporal coupling during semantic […]
SkillForge: Forging Domain-Specific, Self-Evolving Agent Skills in Cloud Technical Support
arXiv:2604.08618v2 Announce Type: replace-cross Abstract: Deploying LLM-powered agents in enterprise scenarios such as cloud technical support demands high-quality, domain-specific skills. However, existing skill creators lack domain grounding, producing skills poorly aligned with real-world task requirements. Moreover, once deployed, there is no systematic mechanism to trace execution failures back to skill deficiencies and drive targeted refinements, […]
Provable Coordination for LLM Agents via Message Sequence Charts
arXiv:2604.17612v2 Announce Type: replace-cross Abstract: Multi-agent systems built on large language models (LLMs) are difficult to reason about. Coordination errors such as deadlocks or type-mismatched messages are often hard to detect through testing. We introduce a domain-specific language for specifying agent coordination based on message sequence charts (MSCs). The language separates message-passing structure from LLM […]
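Separating message-passing structure from LLM behavior means coordination errors like type-mismatched messages can be caught statically, before any model is invoked. A toy static check over an MSC-style chart (the one-accepted-type-per-agent model is a simplifying assumption for this sketch, not the paper's DSL):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Msg:
    # One arrow in a message sequence chart: sender -> receiver with a payload type.
    sender: str
    receiver: str
    payload_type: type

def check_chart(chart, expected):
    # Statically check an MSC against each agent's declared receive type,
    # flagging type-mismatched messages before any LLM call is made.
    # `expected` maps agent name -> the payload type it accepts.
    errors = []
    for i, m in enumerate(chart):
        want = expected.get(m.receiver)
        if want is not None and m.payload_type is not want:
            errors.append(f"step {i}: {m.receiver} expects {want.__name__}, "
                          f"got {m.payload_type.__name__}")
    return errors
```

A real checker would also walk the chart for send/receive cycles to detect deadlocks; the same chart representation supports that analysis.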