• Home
  • Uncategorized
  • EduIllustrate: Towards Scalable Automated Generation Of Multimodal Educational Content

arXiv:2604.05005v2 Announce Type: replace-cross
Abstract: Large language models are increasingly used as educational assistants, yet evaluation of their educational capabilities remains concentrated on question-answering and tutoring tasks. A critical gap exists for multimedia instructional content generation — the ability to produce coherent, diagram-rich explanations that combine geometrically accurate visuals with step-by-step reasoning. We present EduIllustrate, a benchmark for evaluating LLMs on interleaved text-diagram explanation generation for K-12 STEM problems. The benchmark comprises 230 problems spanning five subjects and three grade levels, a standardized generation protocol with sequential anchoring to enforce cross-diagram visual consistency, and an 8-dimension evaluation rubric grounded in multimedia learning theory covering both text and visual quality. Evaluation of ten LLMs reveals a wide performance spread: Gemini 3.0 Pro Preview leads at 87.8%, while Kimi-K2.5 achieves the best cost-efficiency (80.8% at \$0.12/problem). Workflow ablation confirms sequential anchoring improves Visual Consistency by 13% at 94% lower cost. Human evaluation with 20 expert raters validates LLM-as-judge reliability for objective dimensions ($rho geq 0.83$) while revealing limitations on subjective visual assessment.

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844