Audio Deepfake Detection in the Age of Advanced Text-to-Speech models

MobileBench-OL: A Comprehensive Chinese Benchmark for Evaluating Mobile GUI Agents in Real-World Environment

arXiv:2601.20335v1 Announce Type: cross Abstract: Recent advances in mobile Graphical User Interface (GUI) agents highlight the growing need for comprehensive evaluation benchmarks. While new online ...

On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents

arXiv:2601.20404v1 Announce Type: cross Abstract: AI coding agents such as Codex and Claude Code are increasingly used to autonomously contribute to software repositories. However, little ...

How AI Impacts Skill Formation

arXiv:2601.20245v1 Announce Type: cross Abstract: AI assistance produces significant productivity gains across professional domains, particularly for novice workers. Yet how this assistance affects the development ...

Truthfulness Despite Weak Supervision: Evaluating and Training LLMs Using Peer Prediction

arXiv:2601.20299v1 Announce Type: cross Abstract: The evaluation and post-training of large language models (LLMs) rely on supervision, but strong supervision for difficult tasks is often ...

How Much Progress Has There Been in NVIDIA Datacenter GPUs?

arXiv:2601.20115v1 Announce Type: cross Abstract: Graphics Processing Units (GPUs) are the state-of-the-art architecture for essential tasks, ranging from rendering 2D/3D graphics to accelerating workloads in ...