arXiv:2508.20262v2 Announce Type: replace
Abstract: Large language models can now generate intermediate reasoning steps before producing answers, improving performance on difficult problems by iteratively developing solutions. This study uses a content moderation task to examine parallels between human decision times and model reasoning effort, measured by the length of the chain-of-thought (CoT). Across three frontier models, CoT length consistently predicts human decision time. Moreover, humans took longer and models produced longer CoTs on the same items, even when important variables were held constant, suggesting a similar sensitivity to task difficulty. Analyses of the CoT content show that models reference various contextual factors more frequently when making these harder decisions. These findings show parallels between human and AI reasoning on practical tasks and underscore the potential of reasoning traces for enhancing interpretability and decision-making.
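To make the central analysis concrete, the following is a minimal Python sketch of regressing human decision time on CoT length. All data, variable names, and effect sizes here are hypothetical assumptions for illustration; this is not the authors' code or dataset.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical per-item measurements (assumed, not the paper's data):
# cot_tokens: length of a model's chain-of-thought on each item, in tokens
# decision_s: human decision time on the same item, in seconds
n_items = 200
cot_tokens = rng.gamma(shape=4.0, scale=120.0, size=n_items)
decision_s = 2.0 + 0.01 * cot_tokens + rng.normal(0.0, 1.5, size=n_items)

# Simple linear regression: does CoT length predict human decision time?
fit = stats.linregress(cot_tokens, decision_s)
print(f"slope = {fit.slope:.4f} s/token, r = {fit.rvalue:.3f}, p = {fit.pvalue:.1e}")

A per-item regression like this is the simplest version of the claim; the paper's analysis may additionally control for item-level covariates (the "important variables" noted above).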