arXiv:2511.17393v1 Announce Type: cross Abstract: Face verification is a significant component of identity authentication in various applications including online banking and secure access to personal devices. Most existing face image datasets suffer from notable biases related to race, gender, and other demographic characteristics, limiting the effectiveness and fairness of face verification […]
The promise and limits of LLMs in constructing proofs and hints for logic problems in intelligent tutoring systems
arXiv:2505.04736v2 Announce Type: replace Abstract: Intelligent tutoring systems have demonstrated effectiveness in teaching formal propositional logic proofs, but their reliance on template-based explanations limits their ability to provide personalized student feedback. While large language models (LLMs) offer promising capabilities for dynamic feedback generation, they risk producing hallucinations or pedagogically unsound explanations. We evaluated the stepwise […]
SCALEX: Scalable Concept and Latent Exploration for Diffusion Models
arXiv:2511.13750v2 Announce Type: replace-cross Abstract: Image generation models frequently encode social biases, including stereotypes tied to gender, race, and profession. Existing methods for analyzing these biases in diffusion models either focus narrowly on predefined categories or depend on manual interpretation of latent directions. These constraints limit scalability and hinder the discovery of subtle or unanticipated […]
You Only Forward Once: An Efficient Compositional Judging Paradigm
arXiv:2511.16600v2 Announce Type: replace Abstract: Multimodal large language models (MLLMs) show strong potential as judges. However, existing approaches face a fundamental trade-off: adapting MLLMs to output a single score misaligns with the generative nature of MLLMs and limits fine-grained requirement understanding, whereas autoregressively generating judging analyses is prohibitively slow in high-throughput settings. Observing that judgment […]
Quantum Masked Autoencoders for Vision Learning
arXiv:2511.17372v1 Announce Type: cross Abstract: Classical autoencoders are widely used to learn features of input data. To improve the feature learning, classical masked autoencoders extend classical autoencoders to learn the features of the original input sample in the presence of masked-out data. While quantum autoencoders exist, there is no design and implementation of quantum masked […]
Sometimes Painful but Certainly Promising: Feasibility and Trade-offs of Language Model Inference at the Edge
arXiv:2503.09114v2 Announce Type: replace-cross Abstract: The rapid rise of Language Models (LMs) has expanded the capabilities of natural language processing, powering applications from text generation to complex decision-making. While state-of-the-art LMs often boast hundreds of billions of parameters and are primarily deployed in data centers, recent trends show a growing focus on compact models, typically under […]
Genomic Next-Token Predictors are In-Context Learners
arXiv:2511.12797v2 Announce Type: replace-cross Abstract: In-context learning (ICL) — the capacity of a model to infer and apply abstract patterns from examples provided within its input — has been extensively studied in large language models trained for next-token prediction on human text. In fact, prior work often attributes this emergent behavior to distinctive statistical properties […]
Comprehensive Evaluation of Prototype Neural Networks
arXiv:2507.06819v3 Announce Type: replace-cross Abstract: Prototype models are an important method for explainable artificial intelligence (XAI) and interpretable machine learning. In this paper, we perform an in-depth analysis of a set of prominent prototype models including ProtoPNet, ProtoPool and PIPNet. For their assessment, we apply a comprehensive set of metrics. In addition to applying standard […]
Is Phase Really Needed for Weakly-Supervised Dereverberation?
arXiv:2511.17346v1 Announce Type: cross Abstract: In unsupervised or weakly-supervised approaches for speech dereverberation, the target clean (dry) signals are considered to be unknown during training. In that context, evaluating to what extent information can be retrieved from the sole knowledge of reverberant (wet) speech becomes critical. This work investigates the role of the reverberant (wet) […]
CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation
arXiv:2503.09878v4 Announce Type: replace-cross Abstract: Vision foundation models (VFMs) such as DINO have led to a paradigm shift in 2D camera-based perception towards extracting generalized features to support many downstream tasks. Recent works introduce self-supervised cross-modal knowledge distillation (KD) as a way to transfer these powerful generalization capabilities into 3D LiDAR-based models. However, they either […]