MobileBench-OL: A Comprehensive Chinese Benchmark for Evaluating Mobile GUI Agents in Real-World Environment

Meet the Vitalists: the hardcore longevity enthusiasts who believe death is “wrong”

“Who here believes involuntary death is a good thing?” Nathan Cheng has been delivering similar versions of this speech over the last couple of years,

Audio Deepfake Detection in the Age of Advanced Text-to-Speech models

arXiv:2601.20510v1 (announce type: cross). Abstract: Recent advances in Text-to-Speech (TTS) systems have substantially increased the realism of synthetic speech, raising new challenges for audio deepfake

On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents

arXiv:2601.20404v1 (announce type: cross). Abstract: AI coding agents such as Codex and Claude Code are increasingly used to autonomously contribute to software repositories. However, little

How AI Impacts Skill Formation

arXiv:2601.20245v1 (announce type: cross). Abstract: AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development

Truthfulness Despite Weak Supervision: Evaluating and Training LLMs Using Peer Prediction

arXiv:2601.20299v1 (announce type: cross). Abstract: The evaluation and post-training of large language models (LLMs) rely on supervision, but strong supervision for difficult tasks is often