A Geometrically-Grounded Drive for MDL-Based Optimization in Deep Learning

arXiv:2603.12304v1 Announce Type: cross Abstract: This paper introduces a novel optimization framework that fundamentally integrates the Minimum Description Length (MDL) principle into the training dynamics of deep neural networks. Moving beyond its conventional role as a model selection criterion, we reformulate MDL as an active, adaptive driving force within the optimization process itself. The core […]
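
As background for the abstract above: the classic two-part MDL principle scores a model by the bits needed to describe it plus the bits needed to describe the data given the model. The sketch below shows that textbook formulation only — the function name, the fixed bits-per-parameter model code, and the Gaussian residual code are illustrative assumptions, not the paper's adaptive training-time variant.

```python
import numpy as np

def two_part_mdl_bits(num_params, bits_per_param, residuals, sigma=1.0):
    """Textbook two-part MDL code length: L(model) + L(data | model).

    Illustrative assumptions: each parameter costs a fixed number of bits,
    and residuals are coded under a Gaussian with std `sigma`.
    """
    model_bits = num_params * bits_per_param
    # Negative log-likelihood of the residuals under the Gaussian, in nats
    r = np.asarray(residuals, dtype=float)
    nll_nats = 0.5 * np.sum((r / sigma) ** 2) \
        + r.size * 0.5 * np.log(2 * np.pi * sigma ** 2)
    data_bits = nll_nats / np.log(2)  # convert nats to bits
    return model_bits + data_bits
```

Under this score, adding parameters or fitting the data worse both increase the total description length, which is the trade-off an MDL-driven optimizer would balance.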

Maximum Entropy Exploration Without the Rollouts

arXiv:2603.12325v1 Announce Type: cross Abstract: Efficient exploration remains a central challenge in reinforcement learning, serving as a useful pretraining objective for data collection, particularly when an external reward function is unavailable. A principled formulation of the exploration problem is to find policies that maximize the entropy of their induced steady-state visitation distribution, thereby encouraging uniform […]
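
The objective named in the abstract — maximizing the entropy of the induced visitation distribution — can be made concrete with a standard Shannon-entropy computation over empirical visit counts. This is a generic illustration of the objective, not the paper's rollout-free method; the function name and toy counts are assumptions.

```python
import numpy as np

def visitation_entropy(visit_counts):
    """Shannon entropy (in nats) of an empirical state-visitation distribution.

    A maximum-entropy exploration objective rewards policies whose induced
    visitation distribution is close to uniform; uniform visitation over n
    states attains the maximum, log(n).
    """
    p = np.asarray(visit_counts, dtype=float)
    p = p / p.sum()
    nz = p[p > 0]  # 0 * log(0) = 0 by convention
    return float(-(nz * np.log(nz)).sum())

# Uniform visitation over 4 states attains the maximum, log(4):
uniform = visitation_entropy([25, 25, 25, 25])
# A policy stuck in one state scores much lower:
skewed = visitation_entropy([97, 1, 1, 1])
```

A policy-optimization loop would use an objective like this (or a differentiable surrogate) as the reward signal when no external reward is available.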

Denoising Diffusion Variational Inference: Diffusion Models as Expressive Variational Posteriors

arXiv:2401.02739v5 Announce Type: replace-cross Abstract: We propose denoising diffusion variational inference (DDVI), a black-box variational inference algorithm for latent variable models which relies on diffusion models as flexible approximate posteriors. Specifically, our method introduces an expressive class of diffusion-based variational posteriors that perform iterative refinement in latent space; we train these posteriors with a novel […]

Efficient Reasoning with Balanced Thinking

arXiv:2603.12372v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) have shown remarkable reasoning capabilities, yet they often suffer from overthinking, expending redundant computational steps on simple problems, or underthinking, failing to explore sufficient reasoning paths despite inherent capabilities. These issues lead to inefficiencies and potential inaccuracies, limiting practical deployment in resource-constrained settings. Existing methods to […]

Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views

arXiv:2510.18632v4 Announce Type: replace-cross Abstract: Though recent advances in vision-language models (VLMs) have achieved remarkable progress across a wide range of multimodal tasks, understanding 3D spatial relationships from limited views remains a significant challenge. Previous reasoning methods typically rely on pure text (e.g., topological cognitive maps) or on 2D visual cues. However, their limited representational […]

A mathematical theory for understanding when abstract representations emerge in neural networks

arXiv:2510.09816v2 Announce Type: replace Abstract: Recent experiments in neuroscience reveal that task-relevant variables are often encoded in approximately orthogonal subspaces of neural population activity. These disentangled, or abstract, representations have been observed in multiple brain areas and across different species. Such representations have been shown to support out-of-distribution generalization and rapid learning of […]


Accelerating Residual Reinforcement Learning with Uncertainty Estimation

arXiv:2506.17564v2 Announce Type: replace-cross Abstract: Residual Reinforcement Learning (RL) is a popular approach for adapting pretrained policies by learning a lightweight residual policy that provides corrective actions. While Residual RL is more sample-efficient than finetuning the entire base policy, existing methods struggle with sparse rewards and are designed for deterministic base policies. We propose two […]
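
The core composition in residual RL, as described in the abstract, is that the executed action is the pretrained base policy's action plus a small learned correction. A minimal sketch of that composition, assuming a fixed scale factor bounding the residual's influence (an illustrative choice, not taken from the paper):

```python
import numpy as np

def residual_action(base_policy, residual_policy, obs, scale=0.1):
    """Compose a pretrained base policy with a learned residual correction.

    The `scale` hyperparameter limiting the residual's magnitude is an
    illustrative assumption; the residual policy alone is what training adapts.
    """
    return base_policy(obs) + scale * residual_policy(obs)

# Toy example: a fixed base controller plus a zero-initialised residual,
# so behaviour at the start of training equals the base policy exactly.
base = lambda obs: np.array([1.0, -0.5])
residual = lambda obs: np.zeros(2)
a = residual_action(base, residual, obs=None)
```

Initialising the residual at zero is the usual trick that makes this sample-efficient: training starts from the base policy's behaviour rather than from scratch.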

Pyramid MoA: A Probabilistic Framework for Cost-Optimized Anytime Inference

arXiv:2602.19509v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) face a persistent trade-off between inference cost and reasoning capability. While “Oracle” models (e.g., Llama-3.3-70B) achieve state-of-the-art accuracy, they are prohibitively expensive for high-volume deployment. Smaller models (e.g., 7-9B parameters) are cost-effective but struggle with complex tasks. We observe that the emerging practice of LLM cascading […]
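
The LLM-cascading practice the abstract builds on routes each query to a cheap model first and escalates only when that model is not confident. The sketch below shows that generic pattern only — the names, the confidence threshold, and the toy models are assumptions, not the paper's probabilistic Pyramid MoA framework.

```python
def cascade(query, small_model, large_model, threshold=0.8):
    """Minimal LLM-cascade sketch: try the cheap model, escalate if unsure.

    Each model returns (answer, confidence); `threshold` is an
    illustrative hyperparameter trading cost against accuracy.
    """
    answer, confidence = small_model(query)
    if confidence >= threshold:
        return answer, "small"
    answer, _ = large_model(query)
    return answer, "large"

# Toy stand-ins: the small model is only confident on short queries.
small = lambda q: ("small-answer", 0.9 if len(q) < 10 else 0.3)
large = lambda q: ("large-answer", 1.0)

easy = cascade("2+2?", small, large)
hard = cascade("a very long multi-step question", small, large)
```

Only the hard query pays the large model's cost, which is the trade-off an anytime, cost-optimized scheme would tune.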

AnatomiX, an Anatomy-Aware Grounded Multimodal Large Language Model for Chest X-Ray Interpretation

arXiv:2601.03191v2 Announce Type: replace-cross Abstract: Multimodal medical large language models have shown substantial progress in chest X-ray interpretation but continue to face challenges in spatial reasoning and anatomical understanding. Although existing grounding techniques improve overall performance, they often fail to establish a true anatomical correspondence, resulting in incorrect anatomical understanding in the medical domain. To […]

From Garbage to Gold: A Data-Architectural Theory of Predictive Robustness

arXiv:2603.12288v1 Announce Type: cross Abstract: Tabular machine learning presents a paradox: modern models achieve state-of-the-art performance using high-dimensional (high-D), collinear, error-prone data, defying the “Garbage In, Garbage Out” mantra. To help resolve this, we synthesize principles from Information Theory, Latent Factor Models, and Psychometrics, clarifying that predictive robustness arises not solely from data cleanliness, but […]

From Experiments to Expertise: Scientific Knowledge Consolidation for AI-Driven Computational Research

arXiv:2603.13191v1 Announce Type: cross Abstract: While large language models (LLMs) have transformed AI agents into proficient executors of computational materials science, performing a hundred simulations does not make a researcher. What distinguishes research from routine execution is the progressive accumulation of knowledge — learning which approaches fail, recognizing patterns across systems, and applying understanding to […]

Synthetic Data Generation for Brain-Computer Interfaces: Overview, Benchmarking, and Future Directions

arXiv:2603.12296v1 Announce Type: cross Abstract: Deep learning has achieved transformative performance across diverse domains, largely driven by large-scale, high-quality training data. In contrast, the development of brain-computer interfaces (BCIs) is fundamentally constrained by limited, heterogeneous, and privacy-sensitive neural recordings. Generating synthetic yet physiologically plausible brain signals has therefore emerged as a compelling way […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.