SPARROW: Learning Spatial Precision and Temporal Referential Consistency in Pixel-Grounded Video MLLMs

arXiv:2603.12382v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) have advanced from image-level reasoning to pixel-level grounding, but extending these capabilities to videos remains challenging as models must achieve spatial precision and temporally consistent reference tracking. Existing video MLLMs often rely on a static segmentation token ([SEG]) for frame-wise grounding, which provides semantics but […]

CLARE: Classification-based Regression for Electron Temperature Prediction

arXiv:2603.12470v1 Announce Type: cross Abstract: Electron temperature (Te) is an important parameter governing space weather in the upper atmosphere, but has historically been underexplored in the space weather machine learning literature. We present CLARE, a machine learning model for predicting electron temperature in the Earth’s plasmasphere trained on AKEBONO (EXOS-D) satellite measurements as well as […]

MovieTeller: Tool-augmented Movie Synopsis with ID Consistent Progressive Abstraction

arXiv:2602.23228v2 Announce Type: replace-cross Abstract: With the explosive growth of digital entertainment, automated video summarization has become indispensable for applications such as content indexing, personalized recommendation, and efficient media archiving. Automatic synopsis generation for long-form videos, such as movies and TV series, presents a significant challenge for existing Vision-Language Models (VLMs). While proficient at single-image […]

Auditing Student-AI Collaboration: A Case Study of Online Graduate CS Students

arXiv:2601.08697v4 Announce Type: replace-cross Abstract: As generative AI becomes embedded in higher education, it increasingly shapes how students complete academic tasks. While these systems offer efficiency and support, concerns persist regarding over-automation, diminished student agency, and the potential for unreliable or hallucinated outputs. This study conducts a mixed-methods audit of student-AI collaboration preferences by examining […]

A Tutorial on Cognitive Biases in Agentic AI-Driven 6G Autonomous Networks

arXiv:2510.19973v3 Announce Type: replace-cross Abstract: The path to higher network autonomy in 6G lies beyond the mere optimization of key performance indicators (KPIs). While KPIs have enabled automation gains under TM Forum Levels 1–3, they remain numerical abstractions that act only as proxies for the real essence of communication networks: seamless connectivity, fairness, adaptability, and […]

From Video to EEG: Adapting Joint Embedding Predictive Architecture to Uncover Spatiotemporal Dynamics in Brain Signal Analysis

arXiv:2507.03633v5 Announce Type: replace-cross Abstract: EEG signals capture brain activity with high temporal and low spatial resolution, supporting applications such as neurological diagnosis, cognitive monitoring, and brain-computer interfaces. However, effective analysis is hindered by limited labeled data, high dimensionality, and the absence of scalable models that fully capture spatiotemporal dependencies. Existing self-supervised learning (SSL) methods […]

Partially Recentralization Softmax Loss for Vision-Language Models Robustness

arXiv:2402.03627v4 Announce Type: replace-cross Abstract: As Large Language Models achieve breakthroughs in natural language processing (NLP) tasks, multimodal techniques have become extremely popular. However, it has been shown that multimodal NLP models are vulnerable to adversarial attacks, where the outputs of a model can be dramatically changed by a perturbation to the input. While several defense […]

Transferable Graph Learning for Transmission Congestion Management via Busbar Splitting

arXiv:2510.20591v2 Announce Type: replace Abstract: Network topology optimization (NTO) via busbar splitting can mitigate transmission grid congestion and reduce redispatch costs. However, solving this mixed-integer nonlinear problem for large-scale systems in near-real-time is currently intractable with existing solvers. Machine learning (ML) approaches have emerged as a promising alternative, but they have limited generalization to unseen […]

Proof-Carrying Materials: Falsifiable Safety Certificates for Machine-Learned Interatomic Potentials

arXiv:2603.12183v2 Announce Type: replace-cross Abstract: Machine-learned interatomic potentials (MLIPs) are deployed for high-throughput materials screening without formal reliability guarantees. We show that a single MLIP used as a stability filter misses 93% of density functional theory (DFT)-stable materials (recall 0.07) on a 25,000-material benchmark. Proof-Carrying Materials (PCM) closes this gap through three stages: adversarial falsification […]

BoSS: A Best-of-Strategies Selector as an Oracle for Deep Active Learning

arXiv:2603.13109v1 Announce Type: cross Abstract: Active learning (AL) aims to reduce annotation costs while maximizing model performance by iteratively selecting valuable instances. While foundation models have made it easier to identify these instances, existing selection strategies still lack robustness across different models, annotation budgets, and datasets. To highlight the potential weaknesses of existing AL strategies […]

Deep Learning for Blood-Brain Barrier Permeability Prediction: From Discriminative Models to Mechanism-Aware Design

arXiv:2507.18557v4 Announce Type: replace Abstract: Predicting whether a molecule can cross the blood-brain barrier (BBB) is a key step in early-stage neuro-pharmaceutical design, directly influencing the efficiency and success rate of drug development. Traditional methods based on physicochemical properties are prone to systematic misjudgements due to their reliance on previous empirical evidence. Early machine learning […]

Building Effective AI Coding Agents for the Terminal: Scaffolding, Harness, Context Engineering, and Lessons Learned

arXiv:2603.05344v3 Announce Type: replace Abstract: The landscape of AI coding assistance is undergoing a fundamental shift from complex IDE plugins to versatile, terminal-native agents. Operating directly where developers manage source control, execute builds, and deploy environments, CLI-based agents offer unprecedented autonomy for long-horizon development tasks. In this paper, we present OPENDEV, an open-source, command-line coding […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number 16808844.