arXiv:2504.17069v2 Announce Type: replace-cross
Abstract: Autoregressive (AR) image generators are becoming increasingly popular due to their ability to produce high-quality images and their scalability. Typical AR models are locked into a specific generation order, often a raster scan from top-left to bottom-right; this prohibits multi-task flexibility (inpainting, editing, outpainting) without retraining. Any-order AR models address this by learning to generate under arbitrary patch orderings, but at the cost of increased complexity and lower performance. In this paper, we present Ordered Autoregressive (OAR) generation, a self-distillation pipeline that first trains an any-order AR model, then extracts specialized generation orders from the model's own confidence scores, and finally fine-tunes on these orders. This achieves two goals: 1) improved generation quality, by redirecting capacity from learning all $N!$ orderings to a single specialized path, and 2) preserved flexibility of any-order models. On ImageNet $256\times 256$, OAR improves FID from 2.39 to 2.17 over the any-order baseline, with consistent gains on Fashion Products and CelebA-HQ. OAR supports zero-shot inpainting and outpainting without retraining, and human evaluation shows a 64% preference over the baseline. The pipeline requires only lightweight fine-tuning of a pretrained any-order model, with no architectural changes or additional annotations.
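The order-extraction step described in the abstract lends itself to a short sketch. Below is a minimal, illustrative PyTorch version of one plausible realization: at each step the any-order model scores every unfilled patch, and the most confident position is committed next. The `predict_logits` interface on the model, the greedy top-1 confidence heuristic, and all names here are assumptions for illustration, not the paper's exact procedure.

```python
import torch

@torch.no_grad()
def extract_confidence_order(model, tokens, num_patches):
    """Greedy confidence-guided order extraction (sketch).

    Assumes a hypothetical `model.predict_logits(tokens, filled)` that
    returns logits of shape (num_patches, vocab_size), one row per patch
    position, conditioned on the already-filled patches.
    """
    filled = torch.zeros(num_patches, dtype=torch.bool)  # which patches are committed
    order = []
    for _ in range(num_patches):
        logits = model.predict_logits(tokens, filled)        # (P, V)
        conf = logits.softmax(dim=-1).max(dim=-1).values     # top-1 probability per patch
        conf[filled] = -1.0                                  # never revisit a committed patch
        pos = int(conf.argmax())                             # most confident unfilled position
        order.append(pos)
        filled[pos] = True
        tokens[pos] = int(logits[pos].argmax())              # commit the greedy token there
    return order

# Hypothetical usage for a 16x16 grid of patches:
# order = extract_confidence_order(model, torch.zeros(256, dtype=torch.long), 256)
```

In this reading of the pipeline, the extracted orders (presumably aggregated over many samples) would then serve as the single specialized path for the fine-tuning stage, with the model trained by teacher forcing along that fixed order.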
Disclosure in the era of generative artificial intelligence
Generative artificial intelligence (AI) has rapidly become embedded in academic writing, assisting with tasks ranging from language editing to drafting text and producing evidence.

