How Blind and Low-Vision Individuals Prefer Large Vision-Language Model-Generated Scene Descriptions

arXiv:2502.14883v3 Announce Type: replace-cross Abstract: For individuals with blindness or low vision (BLV), navigating complex environments can pose serious risks. Large Vision-Language Models (LVLMs) show promise for generating scene descriptions, but their effectiveness for BLV users remains underexplored. To address this gap, we conducted a user study with eight BLV participants to systematically evaluate preferences […]

RoboClaw: An Agentic Framework for Scalable Long-Horizon Robotic Tasks

arXiv:2603.11558v3 Announce Type: replace-cross Abstract: Vision-Language-Action (VLA) systems have shown strong potential for language-driven robotic manipulation. However, scaling them to long-horizon tasks remains challenging. Existing pipelines typically separate data collection, policy learning, and deployment, resulting in heavy reliance on manual environment resets and brittle multi-policy execution. We present RoboClaw, an agentic robotics framework that unifies […]

Are Large Vision-Language Models Ready to Guide Blind and Low-Vision Individuals?

arXiv:2510.00766v2 Announce Type: replace-cross Abstract: Large Vision-Language Models (LVLMs) demonstrate a promising direction for assisting individuals with blindness or low vision (BLV). Yet, measuring their true utility in real-world scenarios is challenging because evaluating whether their descriptions are BLV-informative requires a fundamentally different approach from assessing standard scene descriptions. While the “VLM-as-a-metric” or “LVLM-as-a-judge” paradigm has […]

Internal APIs Are All You Need: Shadow APIs, Shared Discovery, and the Case Against Browser-First Agent Architectures

arXiv:2604.00694v1 Announce Type: cross Abstract: Autonomous agents increasingly interact with the web, yet most websites remain designed for human browsers — a fundamental mismatch that the emerging “Agentic Web” must resolve. Agents must repeatedly browse pages, inspect DOMs, and reverse-engineer callable routes — a process that is slow, brittle, and redundantly repeated across agents. We […]

How Well Do Large-Scale Chemical Language Models Transfer to Downstream Tasks?

arXiv:2602.11618v3 Announce Type: replace-cross Abstract: Chemical Language Models (CLMs) pre-trained on large-scale molecular data are widely used for molecular property prediction. However, the common belief that increasing training resources such as model size, dataset size, and training compute improves both pretraining loss and downstream task performance has not been systematically validated in the chemical […]

SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration

arXiv:2603.03823v4 Announce Type: replace-cross Abstract: Large language model (LLM)-powered agents have demonstrated strong capabilities in automating software engineering tasks such as static bug fixing. However, in the real world, the development of mature software is typically driven by complex requirement changes and long-term feature iterations — a process that static, one-shot repair paradigms fail to […]

Experiential Reflective Learning for Self-Improving LLM Agents

arXiv:2603.24639v2 Announce Type: replace-cross Abstract: Recent advances in large language models (LLMs) have enabled the development of autonomous agents capable of complex reasoning and multi-step problem solving. However, these agents struggle to adapt to specialized environments and do not leverage past interactions, approaching each new task from scratch regardless of their accumulated experience. We introduce […]
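The abstract is truncated, but the general pattern it motivates — an agent that accumulates reflections from past attempts and reuses them on later tasks — can be sketched in a few lines. This is a generic illustration in the style of reflection-based self-improvement, not the paper's actual method; the names `ReflectiveAgent`, `solve`, and `reflect` are assumptions.

```python
class ReflectiveAgent:
    """Minimal sketch: accumulate lessons from failed episodes and
    feed them back as hints on subsequent attempts (illustrative only)."""

    def __init__(self, solve, reflect):
        self.solve = solve      # callable: (task, hints) -> (answer, success)
        self.reflect = reflect  # callable: (task, answer) -> lesson string
        self.memory = []        # lessons accumulated across episodes

    def run(self, task):
        # Attempt the task with all previously stored lessons as hints
        answer, success = self.solve(task, self.memory)
        if not success:
            # Store a reflection so later attempts are not from scratch
            self.memory.append(self.reflect(task, answer))
        return answer, success
```

The key design point the abstract hints at is that `memory` persists across tasks, so each new episode starts from accumulated experience rather than a blank slate.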

Procela: Epistemic Governance in Mechanistic Simulations Under Structural Uncertainty

arXiv:2604.00675v1 Announce Type: cross Abstract: Mechanistic simulations typically assume fixed ontologies: variables, causal relationships, and resolution policies are static. This assumption fails when the true causal structure is contested or unidentifiable, as in antimicrobial resistance (AMR) spread, where contact, environmental, and selection ontologies compete. We introduce Procela, a Python framework where variables act as epistemic authorities […]

Chat-Based Support Alone May Not Be Enough: Comparing Conversational and Embedded LLM Feedback for Mathematical Proof Learning

arXiv:2602.18807v2 Announce Type: replace-cross Abstract: We evaluate GPTutor, an LLM-powered tutoring system for an undergraduate discrete mathematics course. It integrates two LLM-supported tools: a structured proof-review tool that provides embedded feedback on students’ written proof attempts, and a chatbot for math questions. In a staggered-access study with 148 students, earlier access was associated with higher […]

Representation Selection via Cross-Model Agreement using Canonical Correlation Analysis

arXiv:2604.00921v1 Announce Type: cross Abstract: Modern vision pipelines increasingly rely on pretrained image encoders whose representations are reused across tasks and models, yet these representations are often overcomplete and model-specific. We propose a simple, training-free method to improve the efficiency of image representations via a post-hoc canonical correlation analysis (CCA) operator. By leveraging the shared […]
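The abstract describes a training-free, post-hoc CCA operator over representations shared across models. A standard way to compute canonical directions between two encoders' features is via SVD-based whitening; the sketch below is a generic CCA construction, not the paper's implementation, and `cca_directions` with its top-k truncation is an assumed interface.

```python
import numpy as np

def cca_directions(X, Y, k, eps=1e-8):
    """Generic post-hoc CCA: given features X (n x dx) and Y (n x dy)
    from two encoders on the same n inputs, return projections onto the
    top-k maximally correlated directions and their correlations."""
    # Center both feature matrices
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # Orthonormal bases of each feature space via thin SVD (whitening)
    Ux, Sx, Vxt = np.linalg.svd(X, full_matrices=False)
    Uy, Sy, Vyt = np.linalg.svd(Y, full_matrices=False)
    Sx = np.maximum(Sx, eps)
    Sy = np.maximum(Sy, eps)
    # Canonical correlations = singular values of the basis overlap
    U, rho, Vt = np.linalg.svd(Ux.T @ Uy)
    # Map back to the original feature coordinates, keeping top-k
    Wx = Vxt.T @ (U[:, :k] / Sx[:, None])
    Wy = Vyt.T @ (Vt.T[:, :k] / Sy[:, None])
    return Wx, Wy, rho[:k]
```

Applying `X @ Wx` and `Y @ Wy` yields k-dimensional variates ordered by cross-model correlation, which is one plausible way to realize "representation selection via cross-model agreement": directions both encoders agree on come first.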

Streaming Model Cascades for Semantic SQL

arXiv:2604.00660v1 Announce Type: cross Abstract: Modern data warehouses extend SQL with semantic operators that invoke large language models on each qualifying row, but the per-row inference cost is prohibitive at scale. Model cascades reduce this cost by routing most rows through a fast proxy model and delegating uncertain cases to an expensive oracle. Existing frameworks, […]
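The routing pattern the abstract describes — a fast proxy handles most rows and defers uncertain ones to an expensive oracle — can be sketched directly. This is a minimal illustration of a confidence-thresholded cascade, not the paper's framework; `proxy`, `oracle`, and the threshold `tau` are assumed names.

```python
def cascade(rows, proxy, oracle, tau=0.9):
    """Route each row through a cheap proxy model; delegate rows whose
    proxy confidence falls below tau to the expensive oracle model."""
    results = []
    for row in rows:
        label, confidence = proxy(row)   # fast, approximate prediction
        if confidence >= tau:
            results.append(label)        # proxy is confident enough
        else:
            results.append(oracle(row))  # expensive, authoritative fallback
    return results
```

In the semantic-SQL setting the abstract sketches, `rows` would be the qualifying tuples of a query, the proxy a small classifier or distilled model, and the oracle the full LLM call; total cost then scales with the fraction of low-confidence rows rather than the table size.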


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.