Hurwitz Quaternion Multiplicative Quantization for KV Cache Compression

arXiv:2605.27646v1 Announce Type: cross
Abstract: We propose Hurwitz Quaternion Multiplicative Quantization (HQMQ), a calibration-free method for KV cache compression of large language models. HQMQ treats each 4-element chunk of K or V as a quaternion and quantizes its unit direction to the product $q_p \cdot q_s$, where $q_p$ ranges over the 24-element Hurwitz group $2T$ (the 24 vertices of the 24-cell on $S^3$, pairwise angle $60^\circ$) and $q_s$ ranges over a per-(layer, head) secondary codebook of $S$ random unit quaternions. The multiplicative composition yields $24S$ effective codewords at $S$ stored parameters; random initialization suffices because left-multiplication is an $S^3$ isometry, so seeded codebooks vary in end-task ppl by $<1.5\%$. A per-batch median-multiplier outlier extraction step ($C=3$, no calibration) handles modern outlier-heavy architectures. We evaluate on five modern open models: Mistral-7B (dense MHA), Llama-3-8B, Qwen2.5-7B, and Qwen3-8B (dense GQA), and gpt-oss-20b (sparse MoE). On Mistral-7B and Qwen3-8B, HQMQ matches fp16 within $0.02$–$0.03$ ppl points at $\sim$5 bits. On Qwen2.5-7B and Qwen3-8B, where naive int4 collapses to $10^4+$ ppl, HQMQ + Med3$\times$ recovers fp16 quality within $0.02$–$0.10$ ppl points at $\sim$5 bits. HQMQ Pareto-dominates naive int by $3$–$1900\times$ at matched bits across all five models, and downstream zero-shot accuracy matches fp16 at $3.79$ bits on Mistral. Against the strongest calibrated KV-quantization baseline, HQMQ at $3.79$ bits matches KIVI-4 ($\sim 4.5$ bits) within $\sim 1$ pt on CoQA, $0.6$ pts on TruthfulQA, and $2.3$ pts on GSM8K, at $16\%$ fewer bits and without a calibration pass. At the storage level, HQMQ delivers up to $5.05\times$ KV compression, shrinking a Llama-3-70B 128k-context cache from 43 GB to 8.5 GB.
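
The multiplicative codebook described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`qmul`, `hurwitz_units`, `build_codebook`, `quantize`, `dequantize`), the `(w, x, y, z)` component order, the maximum-inner-product search, and the choice to keep each chunk's norm in full precision are all assumptions made for illustration, and the median-multiplier outlier extraction step is omitted entirely.

```python
import itertools
import numpy as np


def qmul(a, b):
    """Hamilton product of quaternions stored as (..., 4) arrays, (w, x, y, z) order."""
    aw, ax, ay, az = np.moveaxis(a, -1, 0)
    bw, bx, by, bz = np.moveaxis(b, -1, 0)
    return np.stack([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ], axis=-1)


def hurwitz_units():
    """The 24 unit Hurwitz quaternions: 8 of the form +-1, +-i, +-j, +-k
    and 16 of the form (+-1 +-i +-j +-k) / 2."""
    units = []
    for axis in range(4):
        for sign in (1.0, -1.0):
            e = np.zeros(4)
            e[axis] = sign
            units.append(e)
    for signs in itertools.product((0.5, -0.5), repeat=4):
        units.append(np.array(signs))
    return np.stack(units)  # shape (24, 4)


def build_codebook(num_secondary, seed=0):
    """All products q_p * q_s: 24 * S unit codewords from S random quaternions."""
    rng = np.random.default_rng(seed)
    q_s = rng.standard_normal((num_secondary, 4))
    q_s /= np.linalg.norm(q_s, axis=-1, keepdims=True)  # project onto S^3
    q_p = hurwitz_units()
    # Broadcast (24, 1, 4) against (1, S, 4); flat index is p * S + s.
    return qmul(q_p[:, None, :], q_s[None, :, :]).reshape(-1, 4)


def quantize(chunks, codebook):
    """Map each 4-element chunk to the codeword closest in direction.

    Assumption: the norm is returned unquantized; how the magnitude is
    actually stored is not specified in the abstract.
    """
    norms = np.linalg.norm(chunks, axis=-1, keepdims=True)
    directions = chunks / np.maximum(norms, 1e-12)
    idx = np.argmax(directions @ codebook.T, axis=-1)
    return idx, norms[..., 0]


def dequantize(idx, norms, codebook):
    """Rebuild chunks from codeword indices and stored norms."""
    return codebook[idx] * norms[..., None]
```

For example, `build_codebook(8)` returns 192 unit codewords from only 8 stored random quaternions, so each chunk index costs $\log_2(24S)$ bits before any norm or outlier overhead. Because multiplying by a fixed unit quaternion is an isometry of $S^3$, every slice of the codebook that shares one $q_s$ is a rotated copy of the 24-cell, which is why the quality of the random draw matters so little.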

