arXiv:2510.19176v1 Announce Type: new Abstract: Reasoning models have demonstrated exceptional performance in tasks such as mathematics and logical reasoning, primarily due to their ability to engage in step-by-step thinking during the reasoning process. However, this often leads to overthinking, resulting in unnecessary computational overhead. To address this issue, Mode Selection aims to automatically decide between […]
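As a rough illustration of the mode-selection idea described above, the sketch below routes a query either to a cheap direct-answer mode or to a step-by-step "thinking" mode based on an estimated difficulty score. The difficulty heuristic, threshold, and function names are hypothetical placeholders, not the paper's method.

```python
# Minimal sketch of mode selection between a fast "direct" mode and a slower
# step-by-step "thinking" mode. The difficulty estimator and threshold are
# illustrative stand-ins, not the approach proposed in the paper.

def estimate_difficulty(question: str) -> float:
    """Toy proxy for difficulty: longer, more symbol-heavy questions score higher."""
    symbols = sum(ch in "+-*/=^%" for ch in question)
    return min(1.0, 0.01 * len(question) + 0.1 * symbols)

def select_mode(question: str, threshold: float = 0.5) -> str:
    """Route easy questions to the cheap mode, hard ones to step-by-step reasoning."""
    return "thinking" if estimate_difficulty(question) >= threshold else "direct"

if __name__ == "__main__":
    for q in ["What is 2 + 2?",
              "Prove that the sum of the first n odd numbers equals n^2."]:
        print(f"{select_mode(q):>8}  <-  {q}")
```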
FnRGNN: Distribution-aware Fairness in Graph Neural Network
arXiv:2510.19257v1 Announce Type: cross Abstract: Graph Neural Networks (GNNs) excel at learning from structured data, yet fairness in regression tasks remains underexplored. Existing approaches mainly target classification and representation-level debiasing, which cannot fully address the continuous nature of node-level regression. We propose FnRGNN, a fairness-aware in-processing framework for GNN-based node regression that applies interventions at […]
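To make the in-processing idea concrete, here is a minimal sketch of a regression loss with an added penalty that pulls the prediction distributions of two sensitive groups together. The moment-matching penalty and the weighting are generic stand-ins, not FnRGNN's actual intervention.

```python
# Sketch of an in-processing fairness term for node regression: standard MSE plus
# a penalty matching the prediction distributions of two groups. Illustrative only.
import numpy as np

def fairness_penalty(preds: np.ndarray, groups: np.ndarray) -> float:
    """Match first and second moments of predictions across groups 0 and 1."""
    p0, p1 = preds[groups == 0], preds[groups == 1]
    return abs(p0.mean() - p1.mean()) + abs(p0.std() - p1.std())

def fair_regression_loss(preds, targets, groups, lam=0.5):
    mse = float(np.mean((preds - targets) ** 2))
    return mse + lam * fairness_penalty(preds, groups)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    groups = rng.integers(0, 2, size=200)
    targets = rng.normal(size=200)
    biased_preds = targets + 0.8 * groups          # systematically higher for group 1
    print("loss with group bias   :", round(fair_regression_loss(biased_preds, targets, groups), 3))
    print("loss without group bias:", round(fair_regression_loss(targets, targets, groups), 3))
```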
The MUSE Benchmark: Probing Music Perception and Auditory Relational Reasoning in Audio LLMs
arXiv:2510.19055v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have demonstrated capabilities in audio understanding, but current evaluations may obscure fundamental weaknesses in relational reasoning. We introduce the Music Understanding and Structural Evaluation (MUSE) Benchmark, an open-source resource with 10 tasks designed to probe fundamental music perception skills. We evaluate four SOTA models (Gemini […]
Balancing Rewards in Text Summarization: Multi-Objective Reinforcement Learning via HyperVolume Optimization
arXiv:2510.19325v1 Announce Type: cross Abstract: Text summarization is a crucial task that requires the simultaneous optimization of multiple objectives, including consistency, coherence, relevance, and fluency, which presents considerable challenges. Although large language models (LLMs) have demonstrated remarkable performance, enhanced by reinforcement learning (RL), few studies have focused on optimizing the multi-objective problem of summarization through […]
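For intuition about hypervolume-based scalarization, the sketch below turns several summary-quality scores into a single reward as the hypervolume dominated relative to a reference point; for a single score vector this reduces to the product of per-objective gains. This is the textbook single-point case, not necessarily the paper's exact formulation.

```python
# Sketch of a hypervolume-style scalar reward over multiple summarization objectives.
# Because the hypervolume of a single point is the product of per-objective gains,
# the reward favors summaries that are balanced across all objectives.

def hypervolume_reward(scores: dict, reference: dict) -> float:
    """Product of clipped gains over the reference point across objectives."""
    reward = 1.0
    for name, ref in reference.items():
        reward *= max(scores[name] - ref, 0.0)
    return reward

if __name__ == "__main__":
    reference = {"consistency": 0.0, "coherence": 0.0, "relevance": 0.0, "fluency": 0.0}
    balanced = {"consistency": 0.8, "coherence": 0.8, "relevance": 0.8, "fluency": 0.8}
    lopsided = {"consistency": 1.0, "coherence": 1.0, "relevance": 1.0, "fluency": 0.1}
    print("balanced summary reward:", hypervolume_reward(balanced, reference))  # ~0.41
    print("lopsided summary reward:", hypervolume_reward(lopsided, reference))  # 0.10
```

The product form is what makes the reward multi-objective in spirit: a summary that sacrifices one objective entirely is penalized more than a linear average would suggest.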
Rectifying Shortcut Behaviors in Preference-based Reward Learning
arXiv:2510.19050v1 Announce Type: new Abstract: In reinforcement learning from human feedback, preference-based reward models play a central role in aligning large language models with human-preferred behavior. However, recent studies show that these models are prone to reward hacking and often fail to generalize well due to over-optimization. They achieve high reward scores by exploiting shortcuts, […]
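The sketch below shows one generic way such shortcuts are often mitigated: a standard Bradley-Terry pairwise loss plus a penalty on the correlation between the reward margin and a known nuisance feature such as response length. This is a common baseline mitigation, not the specific rectification proposed in the paper; the variable names and the length feature are assumptions.

```python
# Sketch of a pairwise reward-model loss with a length-shortcut penalty.
import numpy as np

def pairwise_loss(r_chosen: np.ndarray, r_rejected: np.ndarray) -> float:
    """Standard Bradley-Terry negative log-likelihood on reward margins."""
    margin = r_chosen - r_rejected
    return float(np.mean(np.log1p(np.exp(-margin))))

def length_shortcut_penalty(margin: np.ndarray, length_gap: np.ndarray) -> float:
    """Penalize correlation between the reward margin and the length difference."""
    return float(abs(np.corrcoef(margin, length_gap)[0, 1]))

def total_loss(r_chosen, r_rejected, len_chosen, len_rejected, lam=0.1):
    margin = r_chosen - r_rejected
    return pairwise_loss(r_chosen, r_rejected) + lam * length_shortcut_penalty(
        margin, len_chosen - len_rejected)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r_c, r_r = rng.normal(1.0, 1.0, 100), rng.normal(0.0, 1.0, 100)
    len_c, len_r = rng.integers(50, 400, 100), rng.integers(50, 400, 100)
    print(round(total_loss(r_c, r_r, len_c, len_r), 3))
```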
Learning To Defer To A Population With Limited Demonstrations
arXiv:2510.19351v1 Announce Type: cross Abstract: This paper addresses the critical data scarcity that hinders the practical deployment of learning-to-defer (L2D) systems to a population of experts. We introduce a context-aware, semi-supervised framework that uses meta-learning to generate expert-specific embeddings from only a few demonstrations. We demonstrate the efficacy of a dual-purpose mechanism, where these embeddings […]
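As a very rough illustration of few-shot expert embeddings for deferral, the sketch below summarizes an expert by the mean feature vector of a handful of demonstrations and defers when a query's similarity to that prototype exceeds the model's own confidence. The prototype-and-threshold rule is a hypothetical stand-in for the paper's meta-learned embeddings.

```python
# Sketch of context-aware deferral using a few-shot expert prototype. Illustrative only.
import numpy as np

def expert_embedding(demo_features: np.ndarray) -> np.ndarray:
    """Mean of the (few) demonstration feature vectors, L2-normalized."""
    proto = demo_features.mean(axis=0)
    return proto / (np.linalg.norm(proto) + 1e-8)

def defer(x: np.ndarray, model_confidence: float, proto: np.ndarray) -> bool:
    """Defer to the expert when the query looks like their demonstrated strengths."""
    affinity = float(x @ proto / (np.linalg.norm(x) + 1e-8))
    return affinity > model_confidence

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    demos = rng.normal(loc=[2.0, 0.0], scale=0.3, size=(5, 2))  # 5 demonstrations
    proto = expert_embedding(demos)
    print(defer(np.array([2.1, 0.1]), model_confidence=0.5, proto=proto))   # likely True
    print(defer(np.array([-2.0, 0.0]), model_confidence=0.5, proto=proto))  # likely False
```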
Timely Clinical Diagnosis through Active Test Selection
arXiv:2510.18988v1 Announce Type: new Abstract: There is growing interest in using machine learning (ML) to support clinical diagnosis, but most approaches rely on static, fully observed datasets and fail to reflect the sequential, resource-aware reasoning clinicians use in practice. Diagnosis remains complex and error-prone, especially in high-pressure or resource-limited settings, underscoring the need […]
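A standard baseline for this kind of sequential decision is to order the next test greedily by expected information gain, i.e. the test that most reduces posterior entropy over candidate diagnoses. The toy two-disease tables below are made up for the demo; the greedy entropy criterion is a common baseline, not necessarily the paper's policy.

```python
# Sketch of active test selection: pick the test minimizing expected posterior entropy.
import math

def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0)

def posterior(prior, likelihood_pos, positive):
    like = likelihood_pos if positive else [1 - l for l in likelihood_pos]
    joint = [pr * l for pr, l in zip(prior, like)]
    z = sum(joint)
    return [j / z for j in joint]

def expected_entropy(prior, likelihood_pos):
    p_pos = sum(pr * l for pr, l in zip(prior, likelihood_pos))
    h_pos = entropy(posterior(prior, likelihood_pos, True))
    h_neg = entropy(posterior(prior, likelihood_pos, False))
    return p_pos * h_pos + (1 - p_pos) * h_neg

if __name__ == "__main__":
    prior = [0.5, 0.5]                                # two candidate diagnoses
    tests = {"blood_panel": [0.9, 0.2],               # P(positive | diagnosis)
             "imaging":     [0.6, 0.5]}
    best = min(tests, key=lambda t: expected_entropy(prior, tests[t]))
    print("next test to order:", best)                # the more discriminative test
```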
FairNet: Dynamic Fairness Correction without Performance Loss via Contrastive Conditional LoRA
arXiv:2510.19421v1 Announce Type: cross Abstract: Ensuring fairness in machine learning models is a critical challenge. Existing debiasing methods often compromise performance, rely on static correction strategies, and struggle with data sparsity, particularly within minority groups. Furthermore, their utilization of sensitive attributes is often suboptimal, either depending excessively on complete attribute labeling or disregarding these attributes […]
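To illustrate the flavor of a conditional low-rank correction, the sketch below adds a LoRA-style low-rank delta to a frozen linear layer's output only when a bias detector flags the input. The detector and gating rule are placeholders, not FairNet's actual components.

```python
# Sketch of a conditional LoRA-style correction applied on top of a frozen layer.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 4, 2

W = rng.normal(size=(d_out, d_in))            # frozen base weight
A = rng.normal(size=(rank, d_in)) * 0.1       # trainable low-rank factor (down-projection)
B = rng.normal(size=(d_out, rank)) * 0.1      # trainable low-rank factor (up-projection)

def bias_detector(x: np.ndarray) -> bool:
    """Placeholder: flag inputs whose first feature (a proxy attribute) is large."""
    return abs(x[0]) > 1.0

def forward(x: np.ndarray) -> np.ndarray:
    y = W @ x
    if bias_detector(x):                      # apply the correction only when flagged
        y = y + B @ (A @ x)
    return y

print(forward(rng.normal(size=d_in)))
```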
ACT: Agentic Classification Tree
arXiv:2509.26433v2 Announce Type: replace-cross Abstract: When used in high-stakes settings, AI systems are expected to produce decisions that are transparent, interpretable, and auditable, a requirement increasingly imposed by regulation. Decision trees such as CART provide clear and verifiable rules, but they are restricted to structured tabular data and cannot operate directly on unstructured inputs such […]
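The sketch below conveys the basic structure such an agentic tree might take: each internal node asks a natural-language yes/no question about an unstructured input and branches on the answer. The keyword-based "answerer" stands in for an LLM call, and the tree and questions are invented for the demo; this shows the structure, not the paper's tree-construction algorithm.

```python
# Sketch of a decision tree whose splits are natural-language yes/no questions.

def answer(question: str, text: str) -> bool:
    """Stub for an LLM yes/no judgment; here, a trivial keyword check."""
    keyword = question.split("'")[1]          # question pattern: "... mention 'X'?"
    return keyword in text.lower()

TREE = {
    "question": "Does the message mention 'refund'?",
    "yes": {"label": "billing"},
    "no": {
        "question": "Does the message mention 'password'?",
        "yes": {"label": "account"},
        "no": {"label": "general"},
    },
}

def classify(node: dict, text: str) -> str:
    if "label" in node:
        return node["label"]
    branch = "yes" if answer(node["question"], text) else "no"
    return classify(node[branch], text)

print(classify(TREE, "I was charged twice and want a refund"))   # -> billing
```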
Using Non-Expert Data to Robustify Imitation Learning via Offline Reinforcement Learning
arXiv:2510.19495v1 Announce Type: cross Abstract: Imitation learning has proven effective for training robots to perform complex tasks from expert human demonstrations. However, it remains limited by its reliance on high-quality, task-specific data, restricting adaptability to the diverse range of real-world object configurations and scenarios. In contrast, non-expert data — such as play data, suboptimal demonstrations, […]
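One standard way to fold non-expert data into policy learning is advantage-weighted behavior cloning, where each transition's cloning weight grows with its estimated advantage, so play data and suboptimal demonstrations contribute in proportion to their value rather than being imitated uniformly. The sketch below computes such weights from made-up advantages; it names a generic offline-RL-style reweighting, not the paper's specific algorithm.

```python
# Sketch of advantage-weighted cloning weights over a mixed-quality dataset.
import numpy as np

def awr_weights(advantages: np.ndarray, temperature: float = 1.0) -> np.ndarray:
    """Exponentiated, normalized advantages: better transitions get larger BC weight."""
    w = np.exp(advantages / temperature)
    return w / w.sum()

if __name__ == "__main__":
    # Mixed dataset: two expert transitions, two mediocre play transitions, one bad one.
    advantages = np.array([2.0, 1.8, 0.2, 0.0, -1.5])
    for a, w in zip(advantages, awr_weights(advantages)):
        print(f"advantage {a:+.1f}  ->  cloning weight {w:.2f}")
```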