Screen, Cache, and Match: A Training-Free Causality-Consistent Reference Frame Framework for Human Animation

arXiv:2601.22160v2 Announce Type: replace-cross
Abstract: Human animation aims to generate temporally coherent and visually consistent videos over long sequences, yet modeling long-range dependencies while preserving frame quality remains challenging. Inspired by the human ability to leverage past observations for interpreting ongoing actions, we propose FrameCache, a training-free, causality-consistent reference frame framework. FrameCache explicitly converts historical generation results into causal guidance through two complementary mechanisms. First, at the reference level, a novel Screen-Cache-Match (SCM) strategy constructs a dynamic, high-quality reference memory that provides motion-consistent appearance guidance and reduces identity drift. Second, at the generative level, a Trajectory-Aware Autoregressive Generation (TAAG) mechanism aligns denoising trajectories across adjacent video chunks. It does so through overlap-aware latent propagation and a dual-domain fusion strategy that blends low-frequency structural layouts with high-frequency textural details. Extensive experiments on standard benchmarks demonstrate that FrameCache consistently improves temporal coherence and visual stability while integrating with diverse diffusion baselines. Code will be made publicly available.
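The abstract does not say how the dual-domain fusion is implemented. As a rough illustration of the general idea of blending low-frequency structure with high-frequency detail, the sketch below swaps frequency bands between two arrays using a 2D FFT. Everything in it is an assumption made for illustration, not FrameCache's actual method: the function names `lowpass_mask` and `frequency_blend`, the hard circular cutoff, the `cutoff` value, and the choice of which input supplies which band.

```python
import numpy as np


def lowpass_mask(height, width, cutoff):
    """Boolean mask that is True where the normalized frequency radius <= cutoff.

    Frequencies are in cycles per sample, so each axis spans [-0.5, 0.5).
    """
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    radius = np.sqrt(fy**2 + fx**2)
    return radius <= cutoff


def frequency_blend(low_src, high_src, cutoff=0.1):
    """Take low frequencies from low_src and high frequencies from high_src.

    Both inputs are real arrays of identical shape (..., H, W); the FFT is
    applied over the last two axes only. The hard cutoff is a simplification
    chosen for this sketch (hypothetical, not taken from the paper).
    """
    if low_src.shape != high_src.shape:
        raise ValueError("inputs must have the same shape")
    height, width = low_src.shape[-2:]
    mask = lowpass_mask(height, width, cutoff)

    low_f = np.fft.fft2(low_src, axes=(-2, -1))
    high_f = np.fft.fft2(high_src, axes=(-2, -1))

    # Inside the mask keep low_src's spectrum, outside keep high_src's.
    fused_f = np.where(mask, low_f, high_f)

    # The mask is symmetric under frequency negation, so for real inputs the
    # fused spectrum stays Hermitian and the inverse is real up to rounding.
    return np.fft.ifft2(fused_f, axes=(-2, -1)).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    structure = rng.standard_normal((4, 32, 32))  # stand-in for one latent
    texture = rng.standard_normal((4, 32, 32))    # stand-in for another latent
    fused = frequency_blend(structure, texture, cutoff=0.1)
    print(fused.shape)
```

In this sketch a larger `cutoff` hands more of the spectrum to `low_src`, and a smooth (for example Gaussian) mask could replace the hard threshold to reduce ringing at the band boundary.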
