Translating AI research into reality: summary of the 2025 Voice AI Symposium and Hackathon

The 2025 Voice AI Symposium represented a transition from conceptual research to clinical implementation in vocal biomarker science. Hosted by the NIH-funded Bridge2AI-Voice consortium, the

Decoding perceived risks in online healthcare services: a safety–trust model based on grounded theory

Introduction: The rapid rise of online healthcare services (OHSs) in China has improved access to medical information and services while creating new uncertainties related to quality,

Anonymization, accountability, and access: legal dimensions of health data sharing in federated networks. Perspectives from empirical study

This paper explores the perspectives of stakeholders involved in federated networks for health data sharing, focusing on the legal and practical dimensions of data protection

AI-enabled cardiovascular devices: a lifecycle playbook for evidence, change control, and post-market assurance

AI-enabled cardiovascular devices are increasingly used in imaging, physiological signal analysis, and clinical decision support systems. Despite growing clinical adoption, requirements for evidence generation, software

From bedside to bytes: the digital transformation of the healthcare workforce

Digital transformation is reshaping healthcare work, yet research on its workforce implications remains fragmented across disciplines. Effects like burnout, resistance, and workflow disruption are often framed

RobotArena $\infty$: Scalable Robot Benchmarking via Real-to-Sim Translation

March 16, 2026

arXiv:2510.23571v2
Abstract: The pursuit of robot generalists, agents capable of performing diverse tasks across diverse environments, demands rigorous and scalable evaluation. Yet real-world testing of robot policies remains fundamentally constrained: it is labor-intensive, slow, unsafe at scale, and difficult to reproduce. As policies expand in scope and complexity, these barriers only intensify, since defining “success” in robotics often hinges on nuanced human judgments of execution quality. We introduce RobotArena Infinity, a new benchmarking framework that overcomes these challenges by shifting vision-language-action (VLA) evaluation into large-scale simulated environments augmented with online human feedback. Leveraging advances in vision-language models, 2D-to-3D generative modeling, and differentiable rendering, our approach automatically converts video demonstrations from widely used robot datasets into simulated counterparts. Within these digital twins, we assess VLA policies using both automated vision-language-model-guided scoring and scalable human preference judgments collected from crowdworkers, transforming human involvement from tedious scene setup, resetting, and safety supervision into lightweight preference comparisons. To measure robustness, we systematically perturb simulated environments along multiple axes, including textures and object placements, stress-testing policy generalization under controlled variation. The result is a continuously evolving, reproducible, and scalable benchmark for real-world-trained robot manipulation policies, addressing a critical missing capability in today’s robotics landscape.


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.