Mitigating Content Effects on Reasoning in Language Models through Fine-Grained Activation Steering

arXiv:2505.12189v3 Announce Type: replace Abstract: Large language models (LLMs) exhibit reasoning biases, often conflating content plausibility with formal logical validity. This can lead to erroneous inferences in critical domains, where plausible arguments are incorrectly deemed logically valid or vice versa. This paper investigates how content effects on reasoning can be mitigated through activation steering, an […]
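The excerpt does not spell out the paper's steering recipe, but a common activation-steering setup builds a difference-of-means vector from contrastive activations (here, hypothetical "logically valid" vs. "merely plausible" examples) and adds a scaled copy to hidden states at inference. A minimal NumPy sketch, with toy data and all names illustrative:

```python
import numpy as np

def steering_vector(acts_pos, acts_neg):
    """Difference-of-means steering vector between two sets of
    hidden activations (rows = examples, columns = hidden units)."""
    return acts_pos.mean(axis=0) - acts_neg.mean(axis=0)

def steer(hidden, vec, alpha=1.0):
    """Add the scaled steering vector to a hidden state at inference."""
    return hidden + alpha * vec

rng = np.random.default_rng(0)
valid = rng.normal(0.0, 1.0, size=(32, 8)) + 1.0      # toy "valid" activations
plausible = rng.normal(0.0, 1.0, size=(32, 8)) - 1.0  # toy "plausible" activations
vec = steering_vector(valid, plausible)

h = rng.normal(size=8)          # a hidden state at some layer
h_steered = steer(h, vec, alpha=0.5)
```

The strength `alpha` and the layer at which to intervene are the usual tuning knobs in this family of methods.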

Children’s Intelligence Tests Pose Challenges for MLLMs? KidGym: A 2D Grid-Based Reasoning Benchmark for MLLMs

arXiv:2603.20209v3 Announce Type: replace-cross Abstract: Multimodal Large Language Models (MLLMs) combine the linguistic strengths of LLMs with the ability to process multimodal data, enabling them to address a broader range of visual tasks. Because MLLMs aim at more general, human-like competence than language-only models, we take inspiration from the Wechsler Intelligence Scales – an established […]

Meta-Learning and Meta-Reinforcement Learning — Tracing the Path towards DeepMind’s Adaptive Agent

arXiv:2602.19837v2 Announce Type: replace Abstract: Humans are highly effective at utilizing prior knowledge to adapt to novel tasks, a capability that standard machine learning models struggle to replicate due to their reliance on task-specific training. Meta-learning overcomes this limitation by allowing models to acquire transferable knowledge from various tasks, enabling rapid adaptation to new challenges […]

To Memorize or to Retrieve: Scaling Laws for RAG-Considerate Pretraining

arXiv:2604.00715v1 Announce Type: cross Abstract: Retrieval-augmented generation (RAG) improves language model (LM) performance by providing relevant context at test time for knowledge-intensive situations. However, the relationship between parametric knowledge acquired during pretraining and non-parametric knowledge accessed via retrieval remains poorly understood, especially under fixed data budgets. In this work, we systematically study the trade-off between […]

A Bilevel Integer Programming Approach for the Synchronous Attractor Control Problem

arXiv:2604.01018v1 Announce Type: cross Abstract: Boolean networks are dynamical models of disease development in which the activation levels of genes are represented by binary variables. Given a Boolean network, controls represent mutations or medical treatments that fix the activation levels of selected genes so that all states in every attractor (i.e., long-term recurrent states) satisfy […]
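To make the setting concrete: under synchronous updates, every initial state of a Boolean network eventually enters a recurrent cycle (an attractor), and a control clamps selected genes so that all attractor states satisfy a target property. A toy enumeration in Python, with hypothetical update rules chosen for illustration:

```python
from itertools import product

def update(state):
    """Synchronous update of a hypothetical 3-gene Boolean network:
    x0' = x1 AND x2,  x1' = NOT x0,  x2' = x0 OR x1."""
    x0, x1, x2 = state
    return (x1 and x2, not x0, x0 or x1)

def attractor_from(state):
    """Iterate the dynamics until a state repeats; the recurrent
    cycle is the attractor reached from `state`."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = update(state)
    cycle = seen[seen.index(state):]
    # Canonicalize so rotations of the same cycle compare equal.
    rotations = [tuple(cycle[i:] + cycle[:i]) for i in range(len(cycle))]
    return min(rotations)

# Exhaustively map all 2^3 states to their attractors.
attractors = {attractor_from(s) for s in product((False, True), repeat=3)}
```

A control in this setting would fix one coordinate to a constant before each update; the bilevel program in the paper chooses such fixations optimally, which this brute-force sketch does not attempt.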

How Motivation Relates to Generative AI Use: A Large-Scale Survey of Mexican High School Students

arXiv:2603.19263v2 Announce Type: replace-cross Abstract: This study examined how high school students with different motivational profiles use generative AI tools in math and writing. Through K-means clustering analysis of survey data from 6,793 Mexican high school students, we identified three distinct motivational profiles based on self-concept and perceived subject value. Results revealed distinct domain-specific AI […]
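The profiling step the abstract describes is standard K-means over per-student motivation scores. A minimal Lloyd's-algorithm sketch on toy two-feature data (hypothetical "self-concept" and "perceived value" axes; the real survey has 6,793 respondents and richer features):

```python
import numpy as np

def kmeans(X, k=3, iters=50, seed=0):
    """Plain Lloyd's algorithm: alternate nearest-center assignment
    and center recomputation. Empty clusters keep their old center."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        new_centers = []
        for j in range(k):
            pts = X[labels == j]
            new_centers.append(pts.mean(0) if len(pts) else centers[j])
        centers = np.array(new_centers)
    return labels, centers

# Toy data: three synthetic motivational profiles.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.3, size=(40, 2)) for m in ((1, 1), (3, 3), (1, 3))])
labels, centers = kmeans(X, k=3)
```

In practice one would standardize the survey scales and choose k via silhouette or elbow diagnostics rather than fixing k=3 up front.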

Brainstacks: Cross-Domain Cognitive Capabilities via Frozen MoE-LoRA Stacks for Continual LLM Learning

arXiv:2604.01152v1 Announce Type: cross Abstract: We present Brainstacks, a modular architecture for continual multi-domain fine-tuning of large language models that packages domain expertise as frozen adapter stacks composing additively on a shared frozen base at inference. Five interlocking components: (1) MoE-LoRA with Shazeer-style noisy top-2 routing across all seven transformer projections under QLoRA 4-bit quantization […]
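The additive composition the abstract describes — frozen low-rank adapter stacks summed onto a shared frozen base at inference — can be sketched in a few lines. Shapes, rank, and scale below are illustrative, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 2                    # hidden size and LoRA rank (toy values)
W = rng.normal(size=(d, d))     # shared frozen base weight

# Each domain "stack" contributes a frozen low-rank delta B @ A.
stacks = [(rng.normal(size=(d, r)), rng.normal(size=(r, d))) for _ in range(3)]

def effective_weight(W, stacks, scale=0.1):
    """Compose frozen adapter stacks additively on the frozen base:
    W_eff = W + scale * sum_i B_i @ A_i."""
    delta = sum(B @ A for B, A in stacks)
    return W + scale * delta

x = rng.normal(size=d)
y = effective_weight(W, stacks) @ x
```

Because every stack is frozen, adding a new domain means training one new (B, A) pair and appending it — no interference with previously learned stacks, which is the continual-learning appeal.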

AutoEG: Exploiting Known Third-Party Vulnerabilities in Black-Box Web Applications

arXiv:2604.00704v1 Announce Type: cross Abstract: Large-scale web applications are widely deployed with complex third-party components, inheriting security risks arising from component vulnerabilities. Security assessment is therefore required to determine whether such known vulnerabilities remain practically exploitable in real applications. Penetration testing is a widely adopted approach that validates exploitability by launching concrete attacks against known […]

Implementation of Support Vector Machines using Reaction Networks

arXiv:2503.19115v2 Announce Type: replace Abstract: Can machine learning algorithms be implemented using chemistry? We demonstrate that this is possible in the case of support vector machines (SVMs). SVMs are powerful tools for data classification, leveraging Vapnik-Chervonenkis theory to handle high-dimensional data and small datasets effectively. In this work, we propose a chemical reaction network scheme […]
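The chemical reaction network scheme is the paper's contribution; as a reference point, here is the standard linear SVM it targets — subgradient descent on the regularized hinge loss — on toy separable data (learning rate, regularizer, and data are all illustrative):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Subgradient descent on (1/n) sum max(0, 1 - y(w.x + b)) + lam ||w||^2.
    A conventional linear SVM, not the paper's chemical implementation."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1                                  # margin violators
        grad_w = 2 * lam * w - (y[mask, None] * X[mask]).sum(0) / n
        grad_b = -y[mask].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

X = np.array([[2.0, 2.0], [1.5, 2.5], [-2.0, -1.0], [-1.0, -2.0]])
y = np.array([1, 1, -1, -1])
w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
```

The reaction-network version would realize the same decision function sign(w·x + b) with species concentrations standing in for the weights and inputs.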

OPERA: Online Data Pruning for Efficient Retrieval Model Adaptation

arXiv:2603.17205v2 Announce Type: replace-cross Abstract: Domain-specific finetuning is essential for dense retrievers, yet not all training pairs contribute equally to the learning process. We introduce OPERA, a data pruning framework that exploits this heterogeneity to improve both the effectiveness and efficiency of retrieval model adaptation. We first investigate static pruning (SP), which retains only high-similarity […]
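The static-pruning (SP) idea described here — retain only the highest-similarity training pairs — reduces to scoring each query/document pair once and keeping the top fraction. A NumPy sketch with random toy embeddings (the threshold policy is illustrative; OPERA's online variant rescoring during training is not shown):

```python
import numpy as np

def static_prune(q_emb, d_emb, keep_frac=0.5):
    """Keep only the training pairs whose query/document cosine
    similarity falls in the top `keep_frac` fraction."""
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    d = d_emb / np.linalg.norm(d_emb, axis=1, keepdims=True)
    sims = (q * d).sum(axis=1)            # per-pair cosine similarity
    k = max(1, int(len(sims) * keep_frac))
    return np.argsort(sims)[::-1][:k]     # indices of retained pairs

rng = np.random.default_rng(0)
q = rng.normal(size=(10, 4))
d = q + 0.1 * rng.normal(size=(10, 4))    # documents near their queries
kept = static_prune(q, d, keep_frac=0.3)
```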

Finite-State Controllers for (Hidden-Model) POMDPs using Deep Reinforcement Learning

arXiv:2602.08734v2 Announce Type: replace Abstract: Solving partially observable Markov decision processes (POMDPs) requires computing policies under imperfect state information. Despite recent advances, the scalability of existing POMDP solvers remains limited. Moreover, many settings require a policy that is robust across multiple POMDPs, further aggravating the scalability issue. We propose the Lexpop framework for POMDP solving. […]
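The Lexpop details are cut off in this excerpt, but the object being learned — a finite-state controller (FSC) — is simple to state: a small memory graph replaces explicit belief tracking, mapping (memory node, observation) to (action, next node). A toy two-node controller for a tiger-style POMDP, with rules chosen purely for illustration:

```python
# FSC transition table: eta[node][observation] -> (action, next_node).
# Node 0 = "uncertain", node 1 = "heard left once" (hypothetical semantics).
eta = {
    0: {"hear-left": ("listen", 1), "hear-right": ("listen", 0)},
    1: {"hear-left": ("open-right", 0), "hear-right": ("listen", 0)},
}

def run(controller, observations, start=0):
    """Execute the FSC on an observation sequence, returning actions taken."""
    node, actions = start, []
    for obs in observations:
        action, node = controller[node][obs]
        actions.append(action)
    return actions
```

The same controller can be evaluated against many candidate POMDPs, which is why FSCs are a natural policy class when robustness across a family of models is required.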

Learning to Hint for Reinforcement Learning

arXiv:2604.00698v1 Announce Type: cross Abstract: Group Relative Policy Optimization (GRPO) is widely used for reinforcement learning with verifiable rewards, but it often suffers from advantage collapse: when all rollouts in a group receive the same reward, the group yields zero relative advantage and thus no learning signal. For example, if a question is too hard […]
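The advantage collapse described here is easy to reproduce: GRPO standardizes rewards within a group of rollouts for the same prompt, so a group with uniform rewards yields all-zero advantages and no gradient signal. A minimal sketch:

```python
import numpy as np

def group_relative_advantage(rewards, eps=1e-6):
    """GRPO-style advantage: standardize rewards within one group
    of rollouts sampled for the same prompt."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

mixed = group_relative_advantage([1, 0, 1, 0])      # informative group
collapsed = group_relative_advantage([0, 0, 0, 0])  # all rollouts fail
```

When every rollout scores 0 (question too hard) or every rollout scores 1 (too easy), `collapsed`-style groups dominate and learning stalls — the failure mode that hinting mechanisms like the one in this paper aim to break.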


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK. Registration number: 16808844.