A Variational Latent Equilibrium for Learning in Neuronal Circuits

arXiv:2603.09600v2 Announce Type: replace Abstract: Brains remain unrivaled in their ability to recognize and generate complex spatiotemporal patterns. While AI is able to reproduce some of these capabilities, deep learning algorithms remain largely at odds with our current understanding of brain circuitry and dynamics. This is prominently the case for backpropagation through time (BPTT), the […]

AgentOS: From Application Silos to a Natural Language-Driven Data Ecosystem

arXiv:2603.08938v2 Announce Type: replace Abstract: The rapid emergence of open-source, locally hosted intelligent agents marks a critical inflection point in human-computer interaction. Systems such as OpenClaw demonstrate that Large Language Model (LLM)-based agents can autonomously operate local computing environments, orchestrate workflows, and integrate external tools. However, within the current paradigm, these agents remain conventional applications […]

Human-Centred LLM Privacy Audits: Findings and Frictions

arXiv:2603.12094v1 Announce Type: cross Abstract: Large language models (LLMs) learn statistical associations from massive training corpora and user interactions, and deployed systems can surface or infer information about individuals. Yet people lack practical ways to inspect what a model associates with their name. We report interim findings from an ongoing study and introduce LMP2, a […]

Partially Recentralization Softmax Loss for Vision-Language Models Robustness

arXiv:2402.03627v3 Announce Type: replace-cross Abstract: As large language models achieve breakthroughs in natural language processing (NLP) tasks, multimodal techniques have become extremely popular. However, multimodal NLP models have been shown to be vulnerable to adversarial attacks, in which a perturbation to the input can dramatically change a model's output. While several defense […]

Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights

arXiv:2603.12228v1 Announce Type: cross Abstract: Pretraining produces a learned parameter vector that is typically treated as a starting point for further iterative adaptation. In this work, we instead view the outcome of pretraining as a distribution over parameter vectors, whose support already contains task-specific experts. We show that in small models such expert solutions occupy […]

HOG-Diff: Higher-Order Guided Diffusion for Graph Generation

arXiv:2502.04308v3 Announce Type: replace-cross Abstract: Graph generation is a critical yet challenging task, as empirical analyses require a deep understanding of complex, non-Euclidean structures. Diffusion models have recently made significant advances in graph generation, but these models are typically adapted from image generation frameworks and overlook inherent higher-order topology, limiting their ability to capture graph […]

Rethinking the Harmonic Loss via Non-Euclidean Distance Layers

arXiv:2603.10225v2 Announce Type: replace-cross Abstract: Cross-entropy loss has long been the standard choice for training deep neural networks, yet it suffers from interpretability limitations, unbounded weight growth, and inefficiencies that can contribute to costly training dynamics. The harmonic loss is a distance-based alternative grounded in Euclidean geometry that improves interpretability and mitigates phenomena such as […]

ReasonMap: Towards Fine-Grained Visual Reasoning from Transit Maps

arXiv:2505.18675v3 Announce Type: replace-cross Abstract: Multimodal large language models (MLLMs) have demonstrated significant progress in semantic scene understanding and text-image alignment, with reasoning variants enhancing performance on more complex tasks involving mathematics and logic. To bridge the gap in evaluating fine-grained visual reasoning, we introduce ReasonMap, a novel benchmark specifically designed to evaluate this capability. ReasonMap encompasses high-resolution transit maps […]

Multimodal Emotion Recognition via Bi-directional Cross-Attention and Temporal Modeling

arXiv:2603.11971v1 Announce Type: cross Abstract: Emotion recognition in in-the-wild video data remains a challenging problem due to large variations in facial appearance, head pose, illumination, background noise, and the inherently dynamic nature of human affect. Relying on a single modality, such as facial expressions or speech, is often insufficient to capture these complex emotional cues. […]

NeuralOS: Towards Simulating Operating Systems via Neural Generative Models

arXiv:2507.08800v2 Announce Type: replace-cross Abstract: We introduce NeuralOS, a neural framework that simulates graphical user interfaces (GUIs) of operating systems by directly predicting screen frames in response to user inputs such as mouse movements, clicks, and keyboard events. NeuralOS combines a recurrent neural network (RNN), which tracks computer state, with a diffusion-based neural renderer that […]

Efficient Cross-View Localization in 6G Space-Air-Ground Integrated Network

arXiv:2603.11398v1 Announce Type: cross Abstract: Recently, visual localization has become an important supplement for improving localization reliability, and cross-view approaches can greatly enhance coverage and adaptability. Meanwhile, future 6G will enable a mobile communication system with global coverage, with a space-air-ground integrated network (SAGIN) serving as a key supporting architecture. Inspired by this, we explore an integration […]

Single molecule localization microscopy challenge: a biologically inspired benchmark for long-sequence modeling

arXiv:2603.11296v1 Announce Type: cross Abstract: State space models (SSMs) have recently achieved strong performance on long-sequence modeling tasks while offering improved memory and computational efficiency compared to transformer-based architectures. However, their evaluation has been largely limited to synthetic benchmarks and application domains such as language and audio, leaving their behavior on sparse and […]


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.