AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding

arXiv:2603.28696v1 Announce Type: cross Abstract: Long video understanding remains challenging for Multi-modal Large Language Models (MLLMs) due to high memory costs and context-length limits. Prior approaches mitigate this by scoring and selecting frames/tokens within short clips, but they lack a principled mechanism to (i) compare relevance across distant video clips and (ii) stop processing once […]

AgentLeak: A Full-Stack Benchmark for Privacy Leakage in Multi-Agent LLM Systems

arXiv:2602.11510v2 Announce Type: replace Abstract: Multi-agent Large Language Model (LLM) systems create privacy risks that current benchmarks cannot measure. When agents coordinate on tasks, sensitive data passes through inter-agent messages, shared memory, and tool arguments, all pathways that output-only audits never inspect. We introduce AgentLeak, to the best of our knowledge the first full-stack benchmark […]

Class-Imbalanced-Aware Adaptive Dataset Distillation for Scalable Pretrained Model on Credit Scoring

arXiv:2501.10677v3 Announce Type: replace-cross Abstract: The advent of artificial intelligence has significantly enhanced credit scoring technologies. Despite the remarkable efficacy of advanced deep learning models, mainstream adoption continues to favor tree-structured models due to their robust predictive performance on tabular data. Although pretrained models have seen considerable development, their application within the financial realm predominantly […]

Generating Findings for Jaw Cysts in Dental Panoramic Radiographs Using a GPT-Based VLM: A Preliminary Study on Building a Two-Stage Self-Correction Loop with Structured Output (SLSO) Framework

arXiv:2510.02001v5 Announce Type: replace-cross Abstract: Vision-language models (VLMs) such as GPT (Generative Pre-Trained Transformer) have shown potential for medical image interpretation; however, challenges remain in generating reliable radiological findings in clinical practice, as exemplified by dental pathologies. This study proposes a Self-correction Loop with Structured Output (SLSO) framework as an integrated processing methodology to enhance […]

Single-Round Scalable Analytic Federated Learning

arXiv:2512.03336v2 Announce Type: replace-cross Abstract: Federated Learning (FL) is plagued by two key challenges: high communication overhead and performance collapse on heterogeneous (non-IID) data. Analytic FL (AFL) provides a single-round, data distribution invariant solution, but is limited to linear models. Subsequent non-linear approaches, like DeepAFL, regain accuracy but sacrifice the single-round benefit. In this work, […]

Symphonym: Universal Phonetic Embeddings for Cross-Script Name Matching

arXiv:2601.06932v4 Announce Type: replace-cross Abstract: Matching place names across writing systems is a persistent obstacle to the integration of multilingual geographic sources, whether modern gazetteers, medieval itineraries, or colonial-era surveys. Existing approaches depend on language-specific phonetic algorithms or romanisation steps that discard phonetic information, and none generalises across script boundaries. This paper presents Symphonym, a […]

Declarative Scenario-based Testing with RoadLogic

arXiv:2603.09455v2 Announce Type: replace-cross Abstract: Scenario-based testing is a key method for cost-effective and safe validation of autonomous vehicles (AVs). Existing approaches rely on imperative scenario definitions, requiring developers to manually enumerate numerous variants to achieve coverage. Declarative languages, such as ASAM OpenSCENARIO DSL (OS2), raise the abstraction level but lack systematic methods for instantiating […]

Vega: Learning to Drive with Natural Language Instructions

arXiv:2603.25741v2 Announce Type: replace-cross Abstract: Vision-language-action models have reshaped autonomous driving to incorporate languages into the decision-making process. However, most existing pipelines only utilize the language modality for scene descriptions or reasoning and lack the flexibility to follow diverse user instructions for personalized driving. To address this, we first construct a large-scale driving dataset (InstructScene) […]

Designing AI for Real Users — Accessibility Gaps in Retail AI Front-End

arXiv:2603.28196v1 Announce Type: cross Abstract: As AI becomes embedded in customer-facing systems, ethical scrutiny has largely focused on models, data, and governance. Far less attention has been paid to how AI is experienced through user-facing design. This commentary argues that many AI front-ends implicitly assume an ‘ideal user body and mind’, and that this becomes […]

Integrating Multimodal Large Language Model Knowledge into Amodal Completion

arXiv:2603.28333v1 Announce Type: cross Abstract: With the widespread adoption of autonomous vehicles and robotics, amodal completion, which reconstructs the occluded parts of people and objects in an image, has become increasingly crucial. Just as humans infer hidden regions based on prior experience and common sense, this task inherently requires physical knowledge about real-world entities. However, […]

Will a time-varying complex system be stable?

arXiv:2603.28464v1 Announce Type: cross Abstract: Randomly-assembled dynamical systems are theoretically predicted to be unstable upon crossing a critical threshold of complexity, as first shown by May. Yet, empirical complex systems exhibit remarkable stability, indicating the presence of additional mechanisms playing a stabilizing role. The relation between complexity and stability is typically assessed by assuming fixed […]

Moving Beyond Review: Applying Language Models to Planning and Translation in Reflection

arXiv:2603.28596v1 Announce Type: cross Abstract: Reflective writing is known to support the development of students’ metacognitive skills, yet learners often struggle to engage in deep reflection, limiting learning gains. Although large language models (LLMs) have been shown to improve writing skills, their use as conversational agents for reflective writing has produced mixed results and has […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844