arXiv:2601.20878v1 Announce Type: new Abstract: Image restoration of biological structures in microscopy poses unique challenges for preserving fine textures and sharp edges. While recent GAN-based image restoration formulations have introduced frequency-domain losses for natural images, microscopy images pose distinct challenges: large dynamic ranges and sparse but critical structures with spatially variable contrast. Inspired by the […]
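A minimal sketch of what such a frequency-domain restoration loss can look like in PyTorch; the function name freq_loss, the radial cutoff, and the high-frequency up-weighting are illustrative assumptions, not the paper's formulation.

```python
import torch

def freq_loss(restored, target, hf_weight=2.0, cutoff=0.125):
    """Illustrative frequency-domain loss: compare 2-D FFT magnitudes
    and up-weight high frequencies, where fine textures and sharp
    edges live. All constants are assumptions for this sketch."""
    R = torch.fft.fft2(restored).abs()
    T = torch.fft.fft2(target).abs()
    h, w = restored.shape[-2:]
    fy = torch.fft.fftfreq(h, device=restored.device).abs()
    fx = torch.fft.fftfreq(w, device=restored.device).abs()
    high = (fy[:, None] > cutoff) | (fx[None, :] > cutoff)
    weight = 1.0 + (hf_weight - 1.0) * high.float()
    return (weight * (R - T).abs()).mean()
```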
Enhancing Language Models for Robust Greenwashing Detection
arXiv:2601.21722v1 Announce Type: cross Abstract: Sustainability reports are critical for ESG assessment, yet greenwashing and vague claims often undermine their reliability. Existing NLP models lack robustness to these practices, typically relying on surface-level patterns that generalize poorly. We propose a parameter-efficient framework that structures LLM latent spaces by combining contrastive learning with an ordinal ranking […]
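As a rough illustration of combining the two objectives, the sketch below mixes a supervised contrastive term with a pairwise margin-based ordinal term over claim embeddings; the names, the separate score input, and the equal loss weighting are assumptions, not the paper's framework.

```python
import torch
import torch.nn.functional as F

def contrastive_ordinal_loss(emb, scores, labels, temperature=0.1, margin=0.2):
    """Illustrative combined objective: same-label embeddings attract
    (supervised contrastive), while a scalar score per example must
    respect the ordinal label order (e.g., 0=substantiated claim ..
    2=greenwashing). Details are assumptions for this sketch."""
    emb = F.normalize(emb, dim=-1)
    sim = emb @ emb.T / temperature
    eye = torch.eye(len(emb), dtype=torch.bool, device=emb.device)
    log_p = sim.masked_fill(eye, float('-inf'))
    log_p = log_p - torch.logsumexp(log_p, dim=1, keepdim=True)
    pos = (labels[:, None] == labels[None, :]) & ~eye
    contrastive = -log_p[pos].mean() if pos.any() else sim.new_zeros(())
    # Ordinal ranking: higher-label examples should score higher by a margin
    diff = (labels[:, None] - labels[None, :]).float()
    gap = (scores[:, None] - scores[None, :]) * diff.sign()
    pairs = diff != 0
    ordinal = F.relu(margin - gap)[pairs].mean() if pairs.any() else sim.new_zeros(())
    return contrastive + ordinal
```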
Assessing the Business Process Modeling Competences of Large Language Models
arXiv:2601.21787v1 Announce Type: cross Abstract: The creation of Business Process Model and Notation (BPMN) models is a complex and time-consuming task requiring both domain knowledge and proficiency in modeling conventions. Recent advances in large language models (LLMs) have significantly expanded the possibilities for generating BPMN models directly from natural language, building upon earlier text-to-process methods […]
MoHETS: Long-term Time Series Forecasting with Mixture-of-Heterogeneous-Experts
arXiv:2601.21866v1 Announce Type: cross Abstract: Real-world multivariate time series can exhibit intricate multi-scale structures, including global trends, local periodicities, and non-stationary regimes, which makes long-horizon forecasting challenging. Although sparse Mixture-of-Experts (MoE) approaches improve scalability and specialization, they typically rely on homogeneous MLP experts that poorly capture the diverse temporal dynamics of time series data. We […]
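A toy version of the heterogeneous-experts idea, assuming a soft router over a linear trend expert and a nonlinear MLP expert; the two-expert setup and module names are illustrative, not the MoHETS architecture.

```python
import torch
import torch.nn as nn

class HeteroMoE(nn.Module):
    """Toy heterogeneous mixture-of-experts over a length-L window:
    a linear expert suits smooth global trends, an MLP expert handles
    local nonlinear dynamics; a learned router mixes them per sample."""
    def __init__(self, L, hidden=256):
        super().__init__()
        self.experts = nn.ModuleList([
            nn.Linear(L, L),                              # trend expert
            nn.Sequential(nn.Linear(L, hidden), nn.GELU(),
                          nn.Linear(hidden, L)),          # nonlinear expert
        ])
        self.router = nn.Linear(L, len(self.experts))

    def forward(self, x):                                    # x: (batch, L)
        gate = self.router(x).softmax(dim=-1)                # (batch, 2)
        outs = torch.stack([e(x) for e in self.experts], 1)  # (batch, 2, L)
        return (gate.unsqueeze(-1) * outs).sum(dim=1)
```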
A Unified Theory of Sparse Dictionary Learning in Mechanistic Interpretability: Piecewise Biconvexity and Spurious Minima
arXiv:2512.05534v3 Announce Type: replace-cross Abstract: As AI models achieve remarkable capabilities across diverse domains, understanding what representations they learn and how they encode concepts has become increasingly important for both scientific progress and trustworthy deployment. Recent works in mechanistic interpretability have widely reported that neural networks represent meaningful concepts as linear directions in their representation […]
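For context, sparse dictionary learning in mechanistic interpretability is typically instantiated as a sparse autoencoder over model activations; a standard minimal version (not this paper's specific analysis object, which may differ in detail) looks like:

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal sparse dictionary learner: activations x are encoded
    into nonnegative sparse codes and reconstructed from a learned
    dictionary D. Loss = ||x - D f(x)||^2 + l1 * ||f(x)||_1."""
    def __init__(self, d_model, d_dict):
        super().__init__()
        self.enc = nn.Linear(d_model, d_dict)
        self.dec = nn.Linear(d_dict, d_model, bias=False)  # dictionary D

    def forward(self, x):
        code = torch.relu(self.enc(x))   # sparse code f(x)
        return self.dec(code), code

def sae_loss(x, recon, code, l1=1e-3):
    return ((x - recon) ** 2).mean() + l1 * code.abs().mean()
```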
Stackelberg Coupling of Online Representation Learning and Reinforcement Learning
arXiv:2508.07452v3 Announce Type: replace-cross Abstract: Deep Q-learning jointly learns representations and values within monolithic networks, promising beneficial co-adaptation between features and value estimates. Although this architecture has attained substantial success, the coupling between representation and value learning creates instability as representations must constantly adapt to non-stationary value targets, while value estimates depend on these shifting […]
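One simple way to realize a Stackelberg (leader-follower) asymmetry between the two learners, assumed here purely for illustration, is a two-timescale update in which the value head best-responds to a frozen representation before the encoder takes its step; batch.obs, td_loss, and the inner-step count are hypothetical.

```python
import torch

def stackelberg_step(encoder, value_head, batch, td_loss,
                     opt_enc, opt_val, k_inner=5):
    """Illustrative leader-follower update, not the paper's algorithm."""
    # Follower: best-respond with the representation held fixed
    with torch.no_grad():
        feats = encoder(batch.obs)
    for _ in range(k_inner):
        opt_val.zero_grad()
        td_loss(value_head(feats), batch).backward()
        opt_val.step()
    # Leader: one step through the (near) best-responding follower
    opt_enc.zero_grad()
    td_loss(value_head(encoder(batch.obs)), batch).backward()
    opt_enc.step()
```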
CMOOD: Concept-based Multi-label OOD Detection
arXiv:2411.13578v3 Announce Type: replace-cross Abstract: How can models effectively detect out-of-distribution (OOD) samples in complex, multi-label settings without extensive retraining? Existing OOD detection methods struggle to capture the intricate semantic relationships and label co-occurrences inherent in multi-label settings, often requiring large amounts of training data and failing to generalize to unseen label combinations. While large […]
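For orientation, a common baseline score in multi-label OOD detection (a JointEnergy-style aggregation, used here only as an illustration since the snippet is truncated before the method) sums per-label free energies of independent sigmoid heads:

```python
import torch.nn.functional as F

def joint_energy_score(logits):
    """JointEnergy-style multi-label OOD score: sum of per-label
    free energies log(1 + exp(f_j)) over sigmoid label heads.
    Higher scores suggest in-distribution; lower suggest OOD."""
    return F.softplus(logits).sum(dim=-1)
```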
scDataset: Scalable Data Loading for Deep Learning on Large-Scale Single-Cell Omics
arXiv:2506.01883v2 Announce Type: replace-cross Abstract: Training deep learning models on single-cell datasets with hundreds of millions of cells requires loading data from disk, as these datasets exceed available memory. While random sampling provides the data diversity needed for effective training, it is prohibitively slow due to the overhead of random access patterns, whereas sequential streaming achieves […]
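A common middle ground between the two access patterns, shuffling contiguous blocks of rows and then shuffling within each block, is sketched below; this illustrates the general trade-off and is not necessarily scDataset's exact scheme.

```python
import numpy as np

def block_shuffled_indices(n_cells, block_size=1024, seed=0):
    """Illustrative compromise: shuffled contiguous blocks give cheap,
    mostly sequential reads; within-block shuffling restores per-batch
    diversity. Block size and scheme are assumptions for this sketch."""
    rng = np.random.default_rng(seed)
    starts = np.arange(0, n_cells, block_size)
    rng.shuffle(starts)
    for s in starts:
        block = np.arange(s, min(s + block_size, n_cells))
        rng.shuffle(block)
        yield from block
```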
An explainable vision transformer with transfer learning based efficient drought stress identification
arXiv:2407.21666v3 Announce Type: replace-cross Abstract: Early detection of drought stress is critical for taking timely measures to reduce crop loss before the impact of drought becomes irreversible. The subtle phenotypic and physiological changes in response to drought stress can be captured by non-invasive imaging techniques, and these imaging data serve as a valuable resource for machine learning methods […]
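A minimal transfer-learning setup in the spirit of the abstract, assuming a torchvision ImageNet-pretrained ViT with a frozen backbone and a new two-class head; the paper's actual backbone, fine-tuning schedule, and explainability component are not reproduced here.

```python
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights

# Start from an ImageNet-pretrained ViT, freeze the backbone, and
# train only a new two-class head (stressed vs. control) -- an
# assumed recipe for illustration, not the paper's exact one.
model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False
model.heads = nn.Linear(model.hidden_dim, 2)  # new trainable head
```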
Navigating the Rabbit Hole: Emergent Biases in LLM-Generated Attack Narratives Targeting Mental Health Groups
arXiv:2504.06160v4 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have been shown to exhibit imbalanced biases against certain groups. However, the study of unprovoked targeted attacks by LLMs towards at-risk populations remains underexplored. Our paper presents three novel contributions: (1) the explicit evaluation of LLM-generated attacks on highly vulnerable mental health groups; (2) a network-based […]
Refine-POI: Reinforcement Fine-Tuned Large Language Models for Next Point-of-Interest Recommendation
arXiv:2506.21599v3 Announce Type: replace-cross Abstract: Advancing large language models (LLMs) for the next point-of-interest (POI) recommendation task faces two fundamental challenges: (i) although existing methods produce semantic IDs that incorporate semantic information, their topology-blind indexing fails to preserve semantic continuity, meaning that proximity in ID values does not mirror the coherence of the underlying semantics; […]
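To make the topology point concrete, one hypothetical way to assign semantically continuous IDs (not Refine-POI's method) is to cluster POI embeddings and number POIs along an ordering of the cluster centroids, so that nearby ID values correspond to nearby semantics:

```python
import numpy as np
from sklearn.cluster import KMeans

def semantic_ids(poi_embeddings, n_clusters=64, seed=0):
    """Illustrative topology-aware ID assignment: cluster embeddings,
    order clusters along their first principal direction, and number
    POIs cluster by cluster. All choices here are assumptions."""
    km = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
    labels = km.fit_predict(poi_embeddings)
    # Order cluster centroids along the top principal component
    centered = km.cluster_centers_ - km.cluster_centers_.mean(axis=0)
    pc1 = np.linalg.svd(centered, full_matrices=False)[2][0]
    order = np.argsort(centered @ pc1)
    rank = np.empty(n_clusters, dtype=int)
    rank[order] = np.arange(n_clusters)
    # Double argsort: each POI's position when sorted by cluster rank
    return np.argsort(rank[labels], kind='stable').argsort()
```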
Robust Filter Attention: Self-Attention as a Parallel State Estimator
arXiv:2509.04154v4 Announce Type: replace-cross Abstract: We introduce Robust Filter Attention (RFA), an attention mechanism that reformulates self-attention as parallel robust filtering under a latent stochastic differential equation (SDE) prior, where analytically propagated uncertainty defines a time-dependent precision prior over attention weights. This formulation integrates key advantages of existing positional encodings: it preserves RoPE-style rotational structure […]
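A stripped-down illustration of precision-weighted attention under a mean-reverting (OU-style) latent SDE, where variance propagated over the time gap down-weights stale keys; the parameter names and exact weighting are assumptions, not RFA's formulation (which also preserves RoPE structure, omitted here).

```python
import torch

def precision_weighted_attention(q, k, v, pos, sigma2=0.1, lam=1.0, r=0.05):
    """Toy precision-weighted attention: under an OU-style latent SDE,
    variance over a gap |t_i - t_j| is r + sigma2/(2*lam) *
    (1 - exp(-2*lam*gap)); its log-precision enters as a logit bias,
    so uncertain (stale) keys receive less attention."""
    d = q.shape[-1]
    logits = q @ k.transpose(-2, -1) / d ** 0.5     # (..., T, T)
    gap = (pos[:, None] - pos[None, :]).abs()       # (T, T) time gaps
    var = r + sigma2 / (2 * lam) * (1 - torch.exp(-2 * lam * gap))
    weights = torch.softmax(logits - torch.log(var), dim=-1)
    return weights @ v
```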