Optimizer-Model Consistency: Full Finetuning with the Same Optimizer as Pretraining Forgets Less

arXiv:2605.06654v1 Announce Type: cross Abstract: Optimizers play an important role in both pretraining and finetuning stages when training large language models (LLMs). In this paper, we present an observation that full finetuning with the same optimizer as in pretraining achieves a better learning-forgetting tradeoff, i.e., forgetting less while achieving the same or better performance on […]

Latent Generative Solvers for Generalizable Long-Term Physics Simulation

arXiv:2602.11229v2 Announce Type: replace Abstract: Reliable physics simulation demands two capabilities that today’s neural PDE solvers do not deliver together: generalization across heterogeneous PDE families, and stability under long autoregressive rollouts. Deterministic operators accumulate error geometrically, while existing probabilistic solvers are confined to a single PDE family or short horizons. We close this gap with […]

A minimal compact description of the diversity index polytope

arXiv:2409.15641v2 Announce Type: replace-cross Abstract: A phylogenetic tree is an edge-weighted binary tree, with leaves labelled by a collection of species, that represents the evolutionary relationships between those species. For such a tree, a phylogenetic diversity index is a function that apportions the biodiversity of the collection across its constituent species. The diversity index polytope […]

A Practitioner’s Guide to Kolmogorov-Arnold Networks

arXiv:2510.25781v5 Announce Type: replace-cross Abstract: Kolmogorov-Arnold Networks (KANs), whose design is inspired, rather than dictated, by the Kolmogorov superposition theorem, have emerged as a structured alternative to MLPs. This review provides a systematic and comprehensive overview of the rapidly expanding KAN literature. The review is organized around three core themes: (i) clarifying the relationships between KANs and […]

MEMSAD: Gradient-Coupled Anomaly Detection for Memory Poisoning in Retrieval-Augmented Agents

arXiv:2605.03482v2 Announce Type: replace-cross Abstract: Persistent external memory enables LLM agents to maintain context across sessions, yet its security properties remain formally uncharacterized. We formalize memory poisoning attacks on retrieval-augmented agents as a Stackelberg game with a unified evaluation framework spanning three attack classes with escalating access assumptions. Correcting an evaluation protocol inconsistency in the […]

Taming the Entropy Cliff: Variable Codebook Size Quantization for Autoregressive Visual Generation

arXiv:2605.06207v1 Announce Type: cross Abstract: Most discrete visual tokenizers rely on a default design: every position in the sequence shares the same codebook. Researchers try to scale the codebook size $K$ to get better reconstruction performance. Such a constant-codebook design hits a fundamental information-theoretic limit. We observe that the per-position conditional entropy of the training […]

A Topological Sorting Criterion for Random Causal Directed Acyclic Graphs

arXiv:2605.06288v1 Announce Type: cross Abstract: Random directed acyclic graphs (DAGs) based on imposing an order on Erdős-Rényi and scale-free random graphs are widely used for evaluating causal discovery algorithms. We show that in such DAGs, the set of nodes reachable via open paths, termed relatives, increases monotonically along the causal order. We assess the […]

Flow Matching with Arbitrary Auxiliary Paths

arXiv:2605.06364v1 Announce Type: cross Abstract: We introduce a new generative modeling framework, Flow Matching with Arbitrary Auxiliary Paths (AuxPath-FM), which generalizes conditional flow matching by incorporating an auxiliary variable drawn from an arbitrary distribution into the probability path. Unlike prior methods that restrict auxiliary components to Gaussian noise, AuxPath-FM allows the variable $\eta$ to follow […]

PACZero: PAC-Private Fine-Tuning of Language Models via Sign Quantization

arXiv:2605.06505v1 Announce Type: cross Abstract: We introduce PACZero, a family of PAC-private zeroth-order mechanisms for fine-tuning large language models that delivers usable utility at $I(S^*; Y_{1:T})=0$. This privacy regime bounds the membership-inference attack (MIA) posterior success rate at the prior, an MIA-resistance level the DP framework matches only at $\varepsilon=0$ and infinite noise. All DP-ZO […]

From History to State: Constant-Context Skill Learning for LLM Agents

arXiv:2605.05413v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly used to operate browsers, files, code and tools, making personal assistants a natural deployment target. Yet personal agents face a privacy-cost-capability tension: cloud models execute multi-step workflows well but expose sensitive intermediate context to external APIs, while local models preserve privacy but remain […]

Probabilistic Dating of Historical Manuscripts via Evidential Deep Regression on Visual Script Features

arXiv:2605.06475v1 Announce Type: new Abstract: We introduce a probabilistic approach for dating historical manuscript pages from visual features alone. Instead of aggregating centuries into classes as is standard in the previous literature, we pose dating as an evidential deep regression problem over a continuous year axis, allowing our neural network to output a full predictive […]

AI Co-Mathematician: Accelerating Mathematicians with Agentic AI

arXiv:2605.06651v1 Announce Type: new Abstract: We introduce the AI co-mathematician, a workbench for mathematicians to interactively leverage AI agents to pursue open-ended research. The AI co-mathematician is optimized to provide holistic support for the exploratory and iterative reality of mathematical workflows, including ideation, literature search, computational exploration, theorem proving and theory building. By providing an […]


Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844