Revisiting Multimodal KV Cache Compression: A Frequency-Domain-Guided Outlier-KV-Aware Approach

arXiv:2511.16786v1 Announce Type: cross Abstract: Multimodal large language models suffer from substantial inference overhead since multimodal KV Cache grows proportionally with the visual input length. Existing multimodal KV Cache compression methods mostly rely on attention score to reduce cache size, which makes them are incompatible with established efficient attention kernels (e.g., FlashAttention) and ignores the […]

Fairness Evaluation of Large Language Models in Academic Library Reference Services

arXiv:2507.04224v3 Announce Type: replace-cross Abstract: As libraries explore large language models (LLMs) for use in virtual reference services, a key question arises: Can LLMs serve all users equitably, regardless of demographics or social status? While they offer great potential for scalable support, LLMs may also reproduce societal biases embedded in their training data, risking the […]

Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?

arXiv:2511.13646v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are reshaping almost all industries, including software engineering. In recent years, a number of LLM agents have been proposed to solve real-world software problems. Such software agents are typically equipped with a suite of coding tools and can autonomously decide the next actions to form complete […]

ResearStudio: A Human-Intervenable Framework for Building Controllable Deep-Research Agents

arXiv:2510.12194v2 Announce Type: replace Abstract: Current deep-research agents run in a ”fire-and-forget” mode: once started, they give users no way to fix errors or add expert knowledge during execution. We present ResearStudio, the first open-source framework that places real-time human control at its core. The system follows a Collaborative Workshop design. A hierarchical Planner-Executor writes […]

SRA-CP: Spontaneous Risk-Aware Selective Cooperative Perception

arXiv:2511.17461v1 Announce Type: new Abstract: Cooperative perception (CP) offers significant potential to overcome the limitations of single-vehicle sensing by enabling information sharing among connected vehicles (CVs). However, existing generic CP approaches need to transmit large volumes of perception data that are irrelevant to the driving safety, exceeding available communication bandwidth. Moreover, most CP frameworks rely […]

Monte Carlo Expected Threat (MOCET) Scoring

arXiv:2511.16823v1 Announce Type: cross Abstract: Evaluating and measuring AI Safety Level (ASL) threats are crucial for guiding stakeholders to implement safeguards that keep risks within acceptable limits. ASL-3+ models present a unique risk in their ability to uplift novice non-state actors, especially in the realm of biosecurity. Existing evaluation metrics, such as LAB-Bench, BioLP-bench, and […]

A Robust Federated Learning Approach for Combating Attacks Against IoT Systems Under non-IID Challenges

arXiv:2511.16822v1 Announce Type: cross Abstract: In the context of the growing proliferation of user devices and the concurrent surge in data volumes, the complexities arising from the substantial increase in data have posed formidable challenges to conventional machine learning model training. Particularly, this is evident within resource-constrained and security-sensitive environments such as those encountered in […]

WorldGen: From Text to Traversable and Interactive 3D Worlds

arXiv:2511.16825v1 Announce Type: cross Abstract: We introduce WorldGen, a system that enables the automatic creation of large-scale, interactive 3D worlds directly from text prompts. Our approach transforms natural language descriptions into traversable, fully textured environments that can be immediately explored or edited within standard game engines. By combining LLM-driven scene layout reasoning, procedural generation, diffusion-based […]

MIR: Efficient Exploration in Episodic Multi-Agent Reinforcement Learning via Mutual Intrinsic Reward

arXiv:2511.17165v1 Announce Type: new Abstract: Episodic rewards present a significant challenge in reinforcement learning. While intrinsic reward methods have demonstrated effectiveness in single-agent rein-forcement learning scenarios, their application to multi-agent reinforcement learn-ing (MARL) remains problematic. The primary difficulties stem from two fac-tors: (1) the exponential sparsity of joint action trajectories that lead to rewards as […]

OmniGround: A Comprehensive Spatio-Temporal Grounding Benchmark for Real-World Complex Scenarios

arXiv:2511.16937v1 Announce Type: cross Abstract: Spatio-Temporal Video Grounding (STVG) aims to localize target objects in videos based on natural language descriptions. Despite recent advances in Multimodal Large Language Models, a significant gap remains between current models and real-world demands involving diverse objects and complex queries. We attribute this to limited benchmark scope, causing models to […]

The use of vocal biomarkers in the detection of Parkinson’s disease: a robust statistical performance comparison of classic machine learning models

arXiv:2511.16856v1 Announce Type: cross Abstract: Parkinson’s disease (PD) is a progressive neurodegenerative disorder that, in addition to directly impairing functional mobility, is frequently associated with vocal impairments such as hypophonia and dysarthria, which typically manifest in the early stages. The use of vocal biomarkers to support the early diagnosis of PD presents a non-invasive, low-cost, […]

MedImageInsight for Thoracic Cavity Health Classification from Chest X-rays

arXiv:2511.17043v1 Announce Type: cross Abstract: Chest radiography remains one of the most widely used imaging modalities for thoracic diagnosis, yet increasing imaging volumes and radiologist workload continue to challenge timely interpretation. In this work, we investigate the use of MedImageInsight, a medical imaging foundational model, for automated binary classification of chest X-rays into Normal and […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844