Faithfulness-QA: A Counterfactual Entity Substitution Dataset for Training Context-Faithful RAG Models

Behavior change beyond intervention: an activity-theoretical perspective on human-centered design of personal health technology

IntroductionModern personal technologies, such as smartphone apps with artificial intelligence (AI) capabilities, have a significant potential for helping people make necessary changes in their behavior

Single-dose rabies vaccine shows strong protection in trial

Oxford researchers report a single-dose shot shows a strong immune response against a deadly viral infection in adults and children, raising hopes for simpler global

ChinaTravel: An Open-Ended Travel Planning Benchmark with Compositional Constraint Validation for Language Agents

arXiv:2412.13682v5 Announce Type: replace Abstract: Travel planning stands out among real-world applications of emphLanguage Agents because it couples significant practical demand with a rigorous constraint-satisfaction

CheXthought: A global multimodal dataset of clinical chain-of-thought reasoning and visual attention for chest X-ray interpretation

arXiv:2604.26288v1 Announce Type: cross Abstract: Chest X-ray interpretation is one of the most frequently performed diagnostic tasks in medicine and a primary target for AI

Auto-ARGUE: LLM-Based Report Generation Evaluation

arXiv:2509.26184v5 Announce Type: replace-cross Abstract: Generation of citation-backed reports is a primary use case for retrieval-augmented generation (RAG) systems. While open-source evaluation tools exist for