Wideband RF Radiance Field Modeling Using Frequency-embedded 3D Gaussian Splatting

arXiv:2505.20714v2 Announce Type: replace-cross Abstract: Indoor environments typically contain diverse RF signals distributed across multiple frequency bands, including NB-IoT, Wi-Fi, and millimeter-wave. Consequently, wideband RF modeling is essential for practical applications such as joint deployment of heterogeneous RF systems, cross-band communication, and distributed RF sensing. Although 3D Gaussian Splatting (3DGS) techniques effectively reconstruct RF radiance […]

CATCODER: Repository-Level Code Generation with Relevant Code and Type Context

arXiv:2406.03283v2 Announce Type: replace-cross Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, repository-level code generation presents unique challenges, particularly due to the need to utilize information spread across multiple files within a repository. Specifically, successful generation depends on a solid grasp of both general, context-agnostic knowledge and specific, context-dependent […]

LLM-DSE: Searching Accelerator Parameters with LLM Agents

arXiv:2505.12188v3 Announce Type: replace-cross Abstract: Even though high-level synthesis (HLS) tools mitigate the challenges of programming domain-specific accelerators (DSAs) by raising the abstraction level, optimizing hardware directive parameters remains a significant hurdle. Existing heuristic and learning-based methods struggle with adaptability and sample efficiency. We present LLM-DSE, a multi-agent framework designed specifically for optimizing HLS directives. […]

Enhancing Quranic Learning: A Multimodal Deep Learning Approach for Arabic Phoneme Recognition

arXiv:2511.17477v1 Announce Type: cross Abstract: Recent advances in multimodal deep learning have greatly enhanced the capability of systems for speech analysis and pronunciation assessment. Accurate pronunciation detection remains a key challenge in Arabic, particularly in the context of Quranic recitation, where subtle phonetic differences can alter meaning. Addressing this challenge, the present study proposes a […]

Can AI Perceive Physical Danger and Intervene?

arXiv:2509.21651v2 Announce Type: replace Abstract: When AI interacts with the physical world — as a robot or an assistive agent — new safety challenges emerge beyond those of purely “digital AI”. In such interactions, the potential for physical harm is direct and immediate. How well do state-of-the-art foundation models understand common-sense facts about physical safety, […]

Sparse Mixture-of-Experts for Multi-Channel Imaging: Are All Channel Interactions Required?

arXiv:2511.17400v1 Announce Type: cross Abstract: Vision Transformers (ViTs) have become the backbone of vision foundation models, yet their optimization for multi-channel domains – such as cell painting or satellite imagery – remains underexplored. A key challenge in these domains is capturing interactions between channels, as each channel carries different information. While existing works have shown […]

VLA-Pruner: Temporal-Aware Dual-Level Visual Token Pruning for Efficient Vision-Language-Action Inference

arXiv:2511.16449v2 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) models have shown great promise for embodied AI, yet the heavy computational cost of processing continuous visual streams severely limits their real-time deployment. Token pruning (keeping salient visual tokens and dropping redundant ones) has emerged as an effective approach for accelerating Vision-Language Models (VLMs), offering a solution for […]

WER is Unaware: Assessing How ASR Errors Distort Clinical Understanding in Patient Facing Dialogue

arXiv:2511.16544v2 Announce Type: replace-cross Abstract: As Automatic Speech Recognition (ASR) is increasingly deployed in clinical dialogue, standard evaluations still rely heavily on Word Error Rate (WER). This paper challenges that standard, investigating whether WER or other common metrics correlate with the clinical impact of transcription errors. We establish a gold-standard benchmark by having expert clinicians […]

Labeled histories and maximally probable labeled topologies with multifurcation

arXiv:2511.16799v1 Announce Type: new Abstract: In mathematical phylogenetics, labeled histories describe the sequences by which sets of labeled lineages coalesce to a shared ancestral lineage. We study labeled histories for at-most-$r$-furcating trees. Consider a rooted leaf-labeled tree in which internal nodes each have $i$ offspring, and $i$ is permitted to range from 2 to $r$ […]

MF-GCN: A Multi-Frequency Graph Convolutional Network for Tri-Modal Depression Detection Using Eye-Tracking, Facial, and Acoustic Features

arXiv:2511.15675v2 Announce Type: replace-cross Abstract: Depression is a prevalent global mental health disorder, characterised by persistent low mood and anhedonia. However, it remains underdiagnosed because current diagnostic methods depend heavily on subjective clinical assessments. To enable objective detection, we introduce a gold standard dataset of 103 clinically assessed participants collected through a tripartite data approach […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number: 16808844.