arXiv:2601.21049v1 Announce Type: new
Abstract: User queries in real-world retrieval are often non-faithful (noisy, incomplete, or distorted), causing retrievers to fail when key semantics are missing. We formalize this as retrieval under recall noise, where the observed query is drawn from a noisy recall process of a latent target item. To address this, we propose QUARK, a simple yet effective training-free framework for robust retrieval under non-faithful queries. QUARK explicitly models query uncertainty through recovery hypotheses, i.e., multiple plausible interpretations of the latent intent given the observed query, and introduces query-anchored aggregation to combine their signals robustly. The original query serves as a semantic anchor, while recovery hypotheses provide controlled auxiliary evidence, preventing semantic drift and hypothesis hijacking. This design enables QUARK to improve recall and ranking quality without sacrificing robustness, even when some hypotheses are noisy or uninformative. Across controlled simulations and BEIR benchmarks (FIQA, SciFact, NFCorpus) with both sparse and dense retrievers, QUARK improves Recall, MRR, and nDCG over the base retriever. Ablations show QUARK is robust to the number of recovery hypotheses and that anchored aggregation outperforms unanchored max/mean/median pooling. These results demonstrate that modeling query uncertainty through recovery hypotheses, coupled with principled anchored aggregation, is essential for robust retrieval under non-faithful queries.
How the sometimes-weird world of lifespan extension is gaining influence
For the last couple of years, I’ve been following the progress of a group of individuals who believe death is humanity’s “core problem.” Put simply,


