arXiv:2510.15511v4 Announce Type: replace-cross Abstract: Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs could map to the same output and prevent exact recovery of the input from a model’s representations. In this paper, we challenge this view. First, we prove mathematically that transformer language models mapping discrete input […]
AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents
arXiv:2603.12564v1 Announce Type: cross Abstract: Tool-augmented LLM agents increasingly serve as multi-turn advisors in high-stakes domains, yet their evaluation relies on ranking-quality metrics that measure what is recommended but not whether it is safe for the user. We introduce a paired-trajectory protocol that replays real financial dialogues under clean and contaminated tool-output conditions across seven […]
Development of Ontological Knowledge Bases by Leveraging Large Language Models
arXiv:2601.10436v2 Announce Type: replace-cross Abstract: Ontological Knowledge Bases (OKBs) play a vital role in structuring domain-specific knowledge and serve as a foundation for effective knowledge management systems. However, their traditional manual development poses significant challenges related to scalability, consistency, and adaptability. Recent advancements in Generative AI, particularly Large Language Models (LLMs), offer promising solutions for […]
Stake the Points: Structure-Faithful Instance Unlearning
arXiv:2603.12915v1 Announce Type: cross Abstract: Machine unlearning (MU) addresses privacy risks in pretrained models. The main goal of MU is to remove the influence of designated data while preserving the utility of retained knowledge. Achieving this goal requires preserving semantic relations among retained instances, which existing studies often overlook. We observe that without such preservation, […]
Learnable Koopman-Enhanced Transformer-Based Time Series Forecasting with Spectral Control
arXiv:2602.02592v2 Announce Type: replace-cross Abstract: This paper proposes a unified family of learnable Koopman operator parameterizations that integrate linear dynamical systems theory with modern deep learning forecasting architectures. We introduce four learnable Koopman variants-scalar-gated, per-mode gated, MLP-shaped spectral mapping, and low-rank Koopman operators which generalize and interpolate between strictly stable Koopman operators and unconstrained linear […]
FedBPrompt: Federated Domain Generalization Person Re-Identification via Body Distribution Aware Visual Prompts
arXiv:2603.12912v1 Announce Type: cross Abstract: Federated Domain Generalization for Person Re-Identification (FedDG-ReID) learns domain-invariant representations from decentralized data. While Vision Transformer (ViT) is widely adopted, its global attention often fails to distinguish pedestrians from high similarity backgrounds or diverse viewpoints — a challenge amplified by cross-client distribution shifts in FedDG-ReID. To address this, we propose […]
Aligning Language Models from User Interactions
arXiv:2603.12273v1 Announce Type: cross Abstract: Multi-turn user interactions are among the most abundant data produced by language models, yet we lack effective methods to learn from them. While typically discarded, these interactions often contain useful information: follow-up user messages may indicate that a response was incorrect, failed to follow an instruction, or did not align […]
Semantic Invariance in Agentic AI
arXiv:2603.13173v1 Announce Type: new Abstract: Large Language Models (LLMs) increasingly serve as autonomous reasoning agents in decision support, scientific problem-solving, and multi-agent coordination systems. However, deploying LLM agents in consequential applications requires assurance that their reasoning remains stable under semantically equivalent input variations, a property we term semantic invariance.Standard benchmark evaluations, which assess accuracy on […]
Task-Specific Knowledge Distillation via Intermediate Probes
arXiv:2603.12270v1 Announce Type: cross Abstract: Knowledge distillation from large language models (LLMs) assumes that the teacher’s output distribution is a high-quality training signal. On reasoning tasks, this assumption is frequently violated. A model’s intermediate representations may encode the correct answer, yet this information is lost or distorted through the vocabulary projection, where prompt formatting and […]
Probabilistic Joint and Individual Variation Explained (ProJIVE) for Data Integration
arXiv:2603.12351v1 Announce Type: cross Abstract: Collecting multiple types of data on the same set of subjects is common in modern scientific applications including, genomics, metabolomics, and neuroimaging. Joint and Individual Variance Explained (JIVE) seeks a low-rank approximation of the joint variation between two or more sets of features captured on common subjects and isolates this […]
Unmasking Biases and Reliability Concerns in Convolutional Neural Networks Analysis of Cancer Pathology Images
arXiv:2603.12445v1 Announce Type: cross Abstract: Convolutional Neural Networks have shown promising effectiveness in identifying different types of cancer from radiographs. However, the opaque nature of CNNs makes it difficult to fully understand the way they operate, limiting their assessment to empirical evaluation. Here we study the soundness of the standard practices by which CNNs are […]
A Geometrically-Grounded Drive for MDL-Based Optimization in Deep Learning
arXiv:2603.12304v1 Announce Type: cross Abstract: This paper introduces a novel optimization framework that fundamentally integrates the Minimum Description Length (MDL) principle into the training dynamics of deep neural networks. Moving beyond its conventional role as a model selection criterion, we reformulate MDL as an active, adaptive driving force within the optimization process itself. The core […]