Enhancing Geo-localization for Crowdsourced Flood Imagery via LLM-Guided Attention

arXiv:2512.11811v3 Announce Type: replace-cross Abstract: Crowdsourced social media imagery provides real-time visual evidence of urban flooding but often lacks reliable geographic metadata for emergency response. Existing Visual Place Recognition (VPR) models struggle to geo-localize these images due to cross-source domain shifts and visual distortions. We present VPR-AttLLM, a model-agnostic framework integrating the semantic reasoning and […]

Semantic-Geometric Dual Compression: Training-Free Visual Token Reduction for Ultra-High-Resolution Remote Sensing Understanding

arXiv:2604.11122v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have demonstrated immense potential in Earth observation. However, the massive visual tokens generated when processing Ultra-High-Resolution (UHR) imagery introduce prohibitive computational overhead, severely bottlenecking their inference efficiency. Existing visual token compression methods predominantly adopt static and uniform compression strategies, neglecting the inherent “Semantic-Geometric Duality” in […]

Emulating Non-Differentiable Metrics via Knowledge-Guided Learning: Introducing the Minkowski Image Loss

arXiv:2604.11422v1 Announce Type: cross Abstract: The “differentiability gap” presents a primary bottleneck in Earth system deep learning: since models cannot be trained directly on non-differentiable scientific metrics and must rely on smooth proxies (e.g., MSE), they often fail to capture high-frequency details, yielding “blurry” outputs. We develop a framework that bridges this gap using two […]

NetworkNet: A Deep Neural Network Approach for Random Networks with Sparse Nodal Attributes and Complex Nodal Heterogeneity

arXiv:2604.11673v1 Announce Type: cross Abstract: Heterogeneous network data with rich nodal information are becoming increasingly prevalent across multidisciplinary research, yet accurately modeling complex nodal heterogeneity and simultaneously selecting influential nodal attributes remains an open challenge. This problem is central to many applications in economics and sociology, where both nodal heterogeneity and high-dimensional individual characteristics strongly affect […]

SHE: Stepwise Hybrid Examination Reinforcement Learning Framework for E-commerce Search Relevance

arXiv:2510.07972v3 Announce Type: replace Abstract: Query-product relevance prediction is vital for AI-driven e-commerce, yet current LLM-based approaches face a dilemma: SFT and DPO struggle with long-tail generalization due to coarse supervision, while traditional RLVR suffers from sparse feedback that fails to correct intermediate reasoning errors. We propose Stepwise Hybrid Examination (SHE), an RL framework that […]

The physical basis of information flow in neural matter: a thermocoherent perspective on cognitive dynamics

arXiv:2604.04069v2 Announce Type: replace Abstract: Information flow is central to contemporary accounts of cognition, yet its physical basis in living neural matter remains poorly specified. Here, we develop a multiscale resource-theoretical framework motivated by the thermocoherent effect, where heat flow is reciprocally coupled to a delocalized information flow carried by shared coherence and not reducible […]

Curriculum-based Sample Efficient Reinforcement Learning for Robust Stabilization of a Quadrotor

arXiv:2501.18490v3 Announce Type: replace-cross Abstract: This article introduces a novel sample-efficient curriculum learning (CL) approach for training an end-to-end reinforcement learning (RL) policy for robust stabilization of a Quadrotor. The learning objective is to simultaneously stabilize position and yaw-orientation from random initial conditions through direct control over motor RPMs (end-to-end), while adhering to pre-specified transient […]

AdvDINO: Domain-Adversarial Self-Supervised Representation Learning for Spatial Proteomics

arXiv:2508.04955v2 Announce Type: replace-cross Abstract: Self-supervised learning (SSL) has emerged as a powerful approach for learning visual representations without manual annotations. However, the robustness of standard SSL methods to domain shift — systematic differences across data sources — remains uncertain, posing an especially critical challenge in biomedical imaging where batch effects can obscure true biological […]

DistDF: Time-Series Forecasting Needs Joint-Distribution Wasserstein Alignment

arXiv:2510.24574v2 Announce Type: replace-cross Abstract: Training time-series forecasting models requires aligning the conditional distribution of model forecasts with that of the label sequence. The standard direct forecast (DF) approach resorts to minimizing the conditional negative log-likelihood, typically estimated by the mean squared error. However, this estimation proves biased when the label sequence exhibits autocorrelation. In […]

Controlling Multimodal Conversational Agents with Coverage-Enhanced Latent Actions

arXiv:2601.07516v2 Announce Type: replace-cross Abstract: Vision-language models are increasingly employed as multimodal conversational agents (MCAs) for diverse conversational tasks. Recently, reinforcement learning (RL) has been widely explored for adapting MCAs to various human-AI interaction scenarios. Despite showing great enhancement in generalization performance, fine-tuning MCAs via RL still faces challenges in handling the extremely large text […]

Persistent Identity in AI Agents: A Multi-Anchor Architecture for Resilient Memory and Continuity

arXiv:2604.09588v1 Announce Type: new Abstract: Modern AI agents suffer from a fundamental identity problem: when context windows overflow and conversation histories are summarized, agents experience catastrophic forgetting — losing not just information, but continuity of self. This technical limitation reflects a deeper architectural flaw: AI agent identity is centralized in a single memory store, creating […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844