arXiv:2512.10821v2 Announce Type: replace Abstract: From content moderation to content curation, applications requiring vision classifiers for visual concepts are rapidly expanding. Existing human-in-the-loop approaches typically assume users begin with a clear, stable understanding of the target concept and can therefore provide high-quality supervision. In reality, users often start with a vague idea and must iteratively refine it […]
Enhancing Foundation VLM Robustness to Missing Modality: Scalable Diffusion for Bi-directional Feature Restoration
arXiv:2602.03151v2 Announce Type: replace Abstract: Vision Language Models (VLMs) typically assume complete modality input during inference. However, their effectiveness drops sharply when certain modalities are unavailable or incomplete. Current research on missing modality primarily faces two dilemmas: Prompt-based methods struggle to restore missing yet indispensable features and degrade the generalizability of VLMs. Imputation-based approaches, lacking […]
Sequential learning theory for Markov genealogy processes
arXiv:2603.09033v2 Announce Type: replace Abstract: We introduce a filtration-based framework for studying when and why adding taxa improves phylodynamic inference, by constructing a natural ordering of observed tips and applying sequential Bayesian analysis to the resulting filtration. We decompose the expected variance reduction from adding taxa into learning, mismatch, and covariance components, classify estimands into […]
Metriplector: From Field Theory to Neural Architecture
arXiv:2603.29496v2 Announce Type: replace Abstract: We present Metriplector, a neural architecture primitive in which the input configures an abstract physical system — fields, sources, and operators — and the dynamics of that system is the computation. Multiple fields evolve via coupled metriplectic dynamics, and the stress-energy tensor T^{μν}, derived from Noether’s theorem, provides the readout. […]
MegaFake: A Theory-Driven Dataset of Fake News Generated by Large Language Models
arXiv:2408.11871v3 Announce Type: replace-cross Abstract: Fake news significantly influences decision-making processes by misleading individuals, organizations, and even governments. Large language models (LLMs), as part of generative AI, can amplify this problem by generating highly convincing fake news at scale, posing a significant threat to online information integrity. Therefore, understanding the motivations and mechanisms behind fake […]
Implicit Bias-Like Patterns in Reasoning Models
arXiv:2503.11572v4 Announce Type: replace-cross Abstract: Implicit biases refer to automatic mental processes that shape perceptions, judgments, and behaviors. Previous research on “implicit bias” in LLMs focused primarily on outputs rather than the processes underlying the outputs. We present the Reasoning Model Implicit Association Test (RM-IAT) to study implicit bias-like processing in reasoning models, LLMs that […]
LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations
arXiv:2602.09924v3 Announce Type: replace-cross Abstract: Running LLMs with extended reasoning on every problem is expensive, but determining which inputs actually require additional compute remains challenging. We investigate whether their own likelihood of success is recoverable from their internal representations before generation, and if this signal can guide more efficient inference. We train linear probes on […]
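The probing setup this abstract describes can be sketched as follows. This is a minimal illustration, not the paper's procedure: the "activations" are synthetic vectors with a planted linear success signal standing in for real pre-generation hidden states, and the probe is a simple least-squares fit with a sign readout.

```python
import numpy as np

# Illustrative stand-in for pre-generation activations: one hidden-state
# vector per problem (e.g. at the last prompt token), plus a binary label
# recording whether the model's later generation succeeded. A weak linear
# success signal is planted so the probe has something to recover.
rng = np.random.default_rng(0)
d, n = 64, 500
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))                      # "activations"
y = (X @ w_true + 0.5 * rng.normal(size=n) > 0)  # "succeeded" labels

# Linear probe: least-squares fit on a train split, sign readout on a
# held-out split. Accuracy above chance means success is linearly
# recoverable from the representations before any tokens are generated.
X_tr, y_tr, X_te, y_te = X[:400], y[:400], X[400:], y[400:]
w_probe, *_ = np.linalg.lstsq(X_tr, 2.0 * y_tr - 1.0, rcond=None)
acc = float(np.mean((X_te @ w_probe > 0) == y_te))
```

A probe like this could then gate inference-time compute: route only the inputs the probe flags as likely failures to extended reasoning.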
Safe Decentralized Operation of EV Virtual Power Plant with Limited Network Visibility via Multi-Agent Reinforcement Learning
arXiv:2604.03278v1 Announce Type: cross Abstract: As power systems advance toward net-zero targets, behind-the-meter renewables are driving rapid growth in distributed energy resources (DERs). Virtual power plants (VPPs) increasingly coordinate these resources to support power distribution network (PDN) operation, with EV charging stations (EVCSs) emerging as a key asset due to their strong impact on local […]
Scaling DPPs for RAG: Density Meets Diversity
arXiv:2604.03240v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by grounding generation in external knowledge, yielding relevant responses aligned with factual evidence and evolving corpora. Standard RAG pipelines construct context through relevance ranking, performing point-wise scoring between the user query and each corpus chunk. This formulation, however, ignores interactions […]
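The density-meets-diversity idea can be sketched with a standard greedy MAP heuristic for a determinantal point process (DPP) whose kernel combines per-chunk relevance scores with chunk-chunk similarity. The `greedy_dpp` function and the toy scores below are illustrative assumptions, not the paper's method or data.

```python
import numpy as np

def greedy_dpp(q, S, k):
    """Greedy MAP selection for a DPP with kernel L = diag(q) @ S @ diag(q).

    q: per-chunk relevance ("density") scores; S: chunk-chunk similarity
    matrix; k: context budget. At each step, add the chunk that most
    increases log det(L restricted to the selected set), which trades
    relevance against redundancy with already-selected chunks.
    """
    L = np.outer(q, q) * S
    selected = []
    for _ in range(k):
        best, best_gain = -1, -np.inf
        for i in range(len(q)):
            if i in selected:
                continue
            idx = selected + [i]
            sub = L[np.ix_(idx, idx)]
            _, logdet = np.linalg.slogdet(sub + 1e-9 * np.eye(len(idx)))
            if logdet > best_gain:
                best, best_gain = i, logdet
        selected.append(best)
    return selected
```

With two near-duplicate high-relevance chunks and one distinct lower-relevance chunk, point-wise ranking would keep both duplicates, while the DPP picks one duplicate plus the distinct chunk.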
Why Attend to Everything? Focus is the Key
arXiv:2604.03260v1 Announce Type: cross Abstract: We introduce Focus, a method that learns which token pairs matter rather than approximating all of them. Learnable centroids assign tokens to groups; distant attention is restricted to same-group pairs while local attention operates at full resolution. Because all model weights stay frozen, Focus is purely additive: centroid-only training (as […]
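The masking scheme this abstract describes can be sketched as follows. The `focus_mask` helper is a hypothetical name, and for simplicity the centroids are fixed inputs rather than learned; in the method they would be trained while the model weights stay frozen.

```python
import numpy as np

def focus_mask(x, centroids, window):
    """Boolean attention mask in the spirit of Focus: token i may attend
    to token j if |i - j| <= window (full-resolution local attention) or
    if both tokens map to the same centroid (same-group distant attention).

    x: (n, d) token features; centroids: (c, d); returns (mask, groups).
    """
    n = x.shape[0]
    # Assign each token to its nearest centroid (squared Euclidean).
    dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    groups = np.argmin(dists, axis=1)
    pos = np.arange(n)
    local = np.abs(pos[:, None] - pos[None, :]) <= window
    same_group = groups[:, None] == groups[None, :]
    return local | same_group, groups
```

Attention scores for disallowed pairs would then be set to -inf before the softmax, so only local and same-group interactions are computed at full strength.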
Same World, Differently Given: History-Dependent Perceptual Reorganization in Artificial Agents
arXiv:2604.04637v1 Announce Type: new Abstract: What kind of internal organization would allow an artificial agent not only to adapt its behavior, but to sustain a history-sensitive perspective on its world? I present a minimal architecture in which a slow perspective latent $g$ feeds back into perception and is itself updated through perceptual processing. This allows […]
MemMachine: A Ground-Truth-Preserving Memory System for Personalized AI Agents
arXiv:2604.04853v1 Announce Type: new Abstract: Large Language Model (LLM) agents require persistent memory to maintain personalization, factual continuity, and long-horizon reasoning, yet standard context-window and retrieval-augmented generation (RAG) pipelines degrade over multi-session interactions. We present MemMachine, an open-source memory system that integrates short-term, long-term episodic, and profile memory within a ground-truth-preserving architecture that stores entire […]