arXiv:2603.29890v1 Announce Type: cross
Abstract: Large language models (LLMs) have shown strong performance on standardized social science instruments, but their value for product discovery remains unclear. We investigate whether interview-informed generative agents can simulate user responses in concept testing scenarios. Using in-depth workflow interviews with knowledge workers, we created personalized agents and compared their evaluations of novel AI concepts against the same participants’ responses. Our results show that agents are distribution-calibrated but identity-imprecise: they fail to replicate the specific individual they are grounded in, yet approximate population-level response distributions. These findings highlight both the potential and the limits of LLM simulation in design research. While unsuitable as a substitute for individual-level insights, simulation may provide value for early-stage concept screening and iteration, where distributional accuracy suffices. We discuss implications for integrating simulation responsibly into product development workflows.
TR-EduVSum: A Turkish-Focused Dataset and Consensus Framework for Educational Video Summarization
arXiv:2604.07553v1 Announce Type: cross
Abstract: This study presents a framework for generating the gold-standard summary fully automatically and reproducibly based on multiple human summaries of


