Time is Not Compute: Scaling Laws for Wall-Clock Constrained Training on Consumer GPUs

arXiv:2603.28823v1 Announce Type: cross
Abstract: Scaling laws relate model quality to compute budget (FLOPs), but practitioners face wall-clock time constraints, not compute budgets. We study optimal model sizing under fixed time budgets from 5 minutes to 24 hours on consumer GPUs (RTX 4090). Across 70+ runs spanning 50M–1031M parameters, we find: (1) at each time budget a U-shaped curve emerges where too-small models overfit and too-large models undertrain; (2) optimal model size follows $N^* \propto t^{0.60}$, growing *faster* than Chinchilla's $N^* \propto C^{0.50}$, with $\alpha = 0.60 \pm 0.07$ robustly exceeding the compute-optimal exponent across all sensitivity analyses; (3) a *dual U-shape* mechanism: short-budget U-curves arise from compute bottlenecks, while long-budget U-curves emerge from data bottlenecks (overfitting), with an intermediate regime where the U-curve temporarily disappears. These findings have immediate implications for researchers training on consumer hardware, where wall-clock time, not FLOPs, is the binding constraint. We release all code, logs, and 70+ experimental configurations.
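The power law $N^* \propto t^{0.60}$ can be turned into a rough sizing rule once it is anchored at a known good configuration. The sketch below is illustrative only (it is not the paper's released code), and the reference point — 125M parameters at a 1-hour budget — is a hypothetical assumption chosen for the example, not a result from the paper:

```python
def optimal_params(t_hours, alpha=0.60, n_ref=125e6, t_ref=1.0):
    """Predict a time-optimal model size N* for a wall-clock budget.

    Assumes the abstract's power law N* = n_ref * (t / t_ref)**alpha,
    anchored at a HYPOTHETICAL reference point: n_ref parameters are
    optimal at t_ref hours on the same hardware. alpha = 0.60 is the
    fitted exponent reported in the abstract.
    """
    return n_ref * (t_hours / t_ref) ** alpha
```

Under this anchoring, doubling the time budget multiplies the optimal model size by $2^{0.60} \approx 1.52$, noticeably more than the $2^{0.50} \approx 1.41$ a Chinchilla-style compute-optimal exponent would give.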

