arXiv:2604.03533v1 Announce Type: new
Abstract: We present an automated crosswalk framework that compares an AI safety policy document pair under a shared taxonomy of activities. Using the activity categories defined in Activity Map on AI Safety as fixed aspects, the system extracts and maps relevant activities, then produces for each aspect a short summary for each document, a brief comparison, and a similarity score. We assess the stability and validity of LLM-based crosswalk analysis across public policy documents. Using five large language models, we perform crosswalks on ten publicly available documents and visualize mean similarity scores with a heatmap. The results show that model choice substantially affects the crosswalk outcomes, and that some document pairs yield high disagreements across models. A human evaluation by three experts on two document pairs shows high inter-annotator agreement, while model scores still differ from human judgments. These findings support comparative inspection of policy documents.
TR-EduVSum: A Turkish-Focused Dataset and Consensus Framework for Educational Video Summarization
arXiv:2604.07553v1 Announce Type: cross Abstract: This study presents a framework for generating the gold-standard summary fully automatically and reproducibly based on multiple human summaries of

