Learning to lead in a hybrid human-AI enterprise

As adoption of AI agents looks set to surge by as much as 300% in the next two years, leadership teams are carefully considering the implications

David Sinclair plans to test whole-body rejuvenation drugs in the XPrize competition

The outspoken longevity scientist David Sinclair has been predicting that one day, you’ll go to the doctor and get a prescription that will make you

Five things you need to know about AI

At SXSW London last week I gave a talk called “Five things you need to know about AI,” in which I shared what I think

ScaleSweep: Accurate NVFP4 Post-Training Quantization of LLMs via Block Scale Initialization

arXiv:2606.07618v1 Announce Type: cross Abstract: NVFP4 is a recently introduced hardware-supported FP4 format that improves the fidelity of 4-bit quantization through fine-grained block scales. However,

ViMax: Agentic Video Generation

arXiv:2606.07649v1 Announce Type: cross Abstract: Long-form video generation requires systematic narrative planning and visual consistency that current short-clip methods cannot provide. Existing methods generate isolated

Hacking Generative Perplexity: Why Unconditional Text Evaluation Needs Distributional Metrics

June 9, 2026

arXiv:2606.08417v1 Announce Type: cross
Abstract: Diffusion and continuous flow-based language models have emerged as the leading non-autoregressive alternatives to language modeling. Progress in both paradigms is overwhelmingly tracked by generative perplexity (gen-PPL): the per-token negative log-likelihood of samples under a frozen autoregressive (AR) scorer such as gpt2-large, typically paired with an empirical-entropy guardrail to rule out low-entropy collapse. We argue that this metric is unsound. By construction, gen-PPL measures only predictability under the scoring AR, not grammaticality or semantic coherence — and the set of predictable but still low-quality sequences is combinatorially large. To make this concrete, we construct a suite of zero-parameter, deliberately naive samplers that achieve state-of-the-art gen-PPL on LM1B and OpenWebText at non-degenerate entropy, surpassing recently published diffusion and continuous-flow models while producing text that is incoherent by construction. We recommend evaluation suites that directly quantify the distributional divergence between generated and reference text, and use such a suite to re-benchmark recent non-autoregressive models, recovering a more faithful picture of the current state of the art.
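To make the metric concrete, here is a minimal sketch of how gen-PPL and the empirical-entropy guardrail might be computed. It is not the authors' code: the unigram table `TOY_SCORER` is a hypothetical stand-in for a frozen AR scorer such as gpt2-large, and the two token sequences are invented for illustration. The sketch takes gen-PPL to be the exponential of the mean per-token negative log-likelihood, and the entropy to be the Shannon entropy of the sample's token frequencies.

```python
import math
from collections import Counter

# Hypothetical stand-in for a frozen AR scorer: a context-free unigram table.
# A real evaluation would use conditional log-probs from a model like gpt2-large.
TOY_SCORER = {"the": 0.4, "of": 0.3, "cat": 0.2, "sat": 0.1}


def token_logprobs(tokens):
    """Natural-log probability of each token under the toy scorer."""
    return [math.log(TOY_SCORER[t]) for t in tokens]


def generative_perplexity(tokens):
    """exp(mean per-token negative log-likelihood) under the scorer."""
    logprobs = token_logprobs(tokens)
    return math.exp(-sum(logprobs) / len(logprobs))


def empirical_entropy(tokens):
    """Shannon entropy (in nats) of the sample's token frequencies."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# A sensible sequence versus one that only chases high-probability tokens.
coherent = ["the", "cat", "sat"]
naive = ["the", "of", "the", "of"]

ppl_coherent = generative_perplexity(coherent)
ppl_naive = generative_perplexity(naive)
entropy_naive = empirical_entropy(naive)

print(f"coherent gen-PPL: {ppl_coherent:.3f}")
print(f"naive gen-PPL:    {ppl_naive:.3f}")
print(f"naive entropy:    {entropy_naive:.3f} nats")
```

In this toy setting the incoherent sequence scores a lower (better) gen-PPL than the coherent one while keeping non-zero entropy, which is the kind of gap the abstract describes. The samplers in the paper are not specified here and may work quite differently.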

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844