arXiv:2604.16756v2 Announce Type: replace-cross Abstract: Prompt-induced cognitive biases are changes in a general-purpose AI (GPAI) system’s decisions caused solely by biased wording in the input (e.g., framing, anchors), not by task logic. In software engineering (SE) decision support, where problem statements and requirements are written in natural language, small phrasing shifts (e.g., popularity hints or outcome reveals) can […]
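The idea of bias injected purely through wording can be illustrated with a toy sketch (our own construction, not the paper's protocol): the same SE decision task is wrapped in different bias cues, so any change in a model's answer is attributable to phrasing alone. All cue names and the task text below are hypothetical.

```python
# Hypothetical sketch: bias-injected prompt variants for one SE decision task.
# Only the surrounding wording differs; the task logic is identical.
BASE_TASK = "Should we adopt library X for JSON parsing in our service?"

BIAS_CUES = {
    "neutral": "{task}",
    "popularity": "Most teams in our company already use X. {task}",
    "anchor": "A senior engineer estimated a 90% success rate. {task}",
    "negative_framing": "Adopting X risks breaking 1 in 10 builds. {task}",
}

def make_variants(task: str) -> dict:
    """Return one prompt per bias cue; only the wording differs."""
    return {name: tpl.format(task=task) for name, tpl in BIAS_CUES.items()}

variants = make_variants(BASE_TASK)
# Every variant embeds the same underlying task statement.
assert all(BASE_TASK in v for v in variants.values())
```

Comparing a model's decisions across such variants is one minimal way to measure prompt-induced bias.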
CHORUS: An Agentic Framework for Generating Realistic Deliberation Data
arXiv:2604.20651v1 Announce Type: new Abstract: Understanding the intricate dynamics of online discourse depends on large-scale deliberation data, a resource that remains scarce across interactive web platforms due to restrictive accessibility policies, ethical concerns, and inconsistent data quality. In this paper, we propose Chorus, an agentic framework that orchestrates LLM-powered actors with behaviorally consistent personas to […]
Onyx: Cost-Efficient Disk-Oblivious ANN Search
arXiv:2604.20401v1 Announce Type: cross Abstract: Approximate nearest neighbor (ANN) search in AI systems increasingly handles sensitive data on third-party infrastructure. Trusted execution environments (TEEs) offer protection, but cost-efficient deployments must rely on external SSDs, which leak user queries to the host through disk access patterns. Oblivious RAM (ORAM) can hide these access patterns but at […]
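The leakage the abstract refers to can be shown with a toy model (not the paper's design): a host observing which blocks a query touches learns the query, while the trivially oblivious baseline of scanning every block produces an identical trace for all queries, at linear cost per query. ORAM schemes aim for the latter's secrecy without its cost.

```python
# Toy illustration of disk access-pattern leakage vs. an oblivious baseline.
NUM_BLOCKS = 8

def naive_read(query_block: int, trace: list) -> int:
    """Read only the needed block; the host sees which block was touched."""
    trace.append(query_block)
    return query_block

def linear_scan_read(query_block: int, trace: list) -> int:
    """Read every block regardless of the query: the trace is identical
    for all queries, so it reveals nothing (but costs O(N) per query)."""
    result = -1
    for b in range(NUM_BLOCKS):
        trace.append(b)
        if b == query_block:
            result = b
    return result

t1, t2 = [], []
naive_read(3, t1)
naive_read(5, t2)
assert t1 != t2   # naive traces differ -> the query leaks

t3, t4 = [], []
linear_scan_read(3, t3)
linear_scan_read(5, t4)
assert t3 == t4   # oblivious traces are indistinguishable
```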
SWE-chat: Coding Agent Interactions From Real Users in the Wild
arXiv:2604.20779v1 Announce Type: new Abstract: AI coding agents are being adopted at scale, yet we lack empirical evidence on how people actually use them and how much of their output is useful in practice. We present SWE-chat, the first large-scale dataset of real coding agent sessions collected from open-source developers in the wild. The dataset […]
Explainable Iterative Data Visualisation Refinement via an LLM Agent
arXiv:2604.15319v2 Announce Type: replace-cross Abstract: Exploratory analysis of high-dimensional data relies on embedding the data into a low-dimensional space (typically 2D or 3D), from which a visualization plot is produced to uncover meaningful structures and to communicate geometric and distributional data characteristics. However, finding a suitable algorithm configuration, particularly hyperparameter settings, to produce a visualization […]
Transparent Screening for LLM Inference and Training Impacts
arXiv:2604.19757v1 Announce Type: cross Abstract: This paper presents a transparent screening framework for estimating inference and training impacts of current large language models under limited observability. The framework converts natural-language application descriptions into bounded environmental estimates and supports a comparative online observatory of current market models. Rather than claiming direct measurement for opaque proprietary services, […]
CyberCertBench: Evaluating LLMs in Cybersecurity Certification Knowledge
arXiv:2604.20389v1 Announce Type: cross Abstract: The rapid evolution and use of Large Language Models (LLMs) in professional workflows require an evaluation of their domain-specific knowledge against industry standards. We introduce CyberCertBench, a new suite of Multiple Choice Question Answering (MCQA) benchmarks derived from industry-recognized certifications. CyberCertBench evaluates LLM domain knowledge against the professional standards of Information […]
What Makes a Good AI Review? Concern-Level Diagnostics for AI Peer Review
arXiv:2604.19998v1 Announce Type: new Abstract: Evaluating AI-generated reviews by verdict agreement is widely recognized as insufficient, yet current alternatives rarely audit which concerns a system identifies, how it prioritizes them, or whether those priorities align with the review rationale that shaped the final assessment. We propose concern alignment, a diagnostic framework that evaluates AI reviews […]
Compressing Sequences in the Latent Embedding Space: $K$-Token Merging for Large Language Models
arXiv:2604.15153v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) incur significant computational and memory costs when processing long prompts, as full self-attention scales quadratically with input length. Token compression aims to address this challenge by reducing the number of tokens representing inputs. However, existing prompt-compression approaches primarily operate in token space and overlook inefficiencies in […]
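One way to picture merging in the latent embedding space, as opposed to dropping tokens in token space, is the following minimal sketch (our own reading of the idea, not the paper's exact method): each group of $K$ consecutive token embeddings is mean-pooled into a single latent token, so a length-$L$ sequence shrinks to roughly $L/K$.

```python
# Minimal sketch: mean-pool every k consecutive token embeddings into one,
# reducing sequence length in the latent space (hypothetical, simplified).
def k_token_merge(embeddings: list, k: int) -> list:
    """Merge each run of k consecutive embedding vectors by averaging."""
    merged = []
    for i in range(0, len(embeddings), k):
        group = embeddings[i:i + k]
        dim = len(group[0])
        merged.append([sum(vec[d] for vec in group) / len(group)
                       for d in range(dim)])
    return merged

# 6 tokens of dimension 2, merged with k=2 -> 3 latent tokens.
seq = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0], [5.0, 5.0], [7.0, 7.0]]
out = k_token_merge(seq, 2)
assert len(out) == 3
assert out[0] == [2.0, 0.0]
```

A learned merging scheme would replace the uniform average with data-dependent weights, but the length-reduction mechanics are the same.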
EvoAgent: An Evolvable Agent Framework with Skill Learning and Multi-Agent Delegation
arXiv:2604.20133v1 Announce Type: new Abstract: This paper proposes EvoAgent – an evolvable large language model (LLM) agent framework that integrates structured skill learning with a hierarchical sub-agent delegation mechanism. EvoAgent models skills as multi-file structured capability units equipped with triggering mechanisms and evolutionary metadata, and enables continuous skill generation and optimization through a user-feedback-driven closed-loop […]
AI models of unstable flow exhibit hallucination
arXiv:2604.20372v1 Announce Type: cross Abstract: We report the first systematic evidence of hallucination in AI models of fluid dynamics, demonstrated in the canonical problem of hydrodynamically unstable transport known as viscous fingering. AI-based modeling of flow with instabilities remains challenging because rapidly evolving, multiscale fingering patterns are difficult to resolve accurately. We identify solutions that […]
Memory-Augmented LLM-based Multi-Agent System for Automated Feature Generation on Tabular Data
arXiv:2604.20261v1 Announce Type: new Abstract: Automated feature generation extracts informative features from raw tabular data without manual intervention and is crucial for accurate, generalizable machine learning. Traditional methods rely on predefined operator libraries and cannot leverage task semantics, limiting their ability to produce diverse, high-value features for complex tasks. Recent Large Language Model (LLM)-based approaches […]
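The "predefined operator library" baseline the abstract contrasts with can be sketched as follows (a hypothetical toy, with made-up operator and column names): new features are produced by exhaustively applying fixed unary and binary operators to numeric columns, with no access to task semantics.

```python
# Sketch of operator-library feature generation on a tiny tabular dataset.
import math

UNARY_OPS = {"log1p": lambda x: math.log1p(abs(x)), "square": lambda x: x * x}
BINARY_OPS = {"ratio": lambda a, b: a / b if b != 0 else 0.0}

def generate_features(table: dict) -> dict:
    """Apply every predefined operator to every (pair of) numeric column(s)."""
    cols = list(table)
    feats = {}
    for c in cols:
        for name, op in UNARY_OPS.items():
            feats[f"{name}({c})"] = [op(v) for v in table[c]]
    for a in cols:
        for b in cols:
            if a != b:
                feats[f"ratio({a},{b})"] = [BINARY_OPS["ratio"](x, y)
                                            for x, y in zip(table[a], table[b])]
    return feats

data = {"age": [20.0, 40.0], "income": [10.0, 20.0]}
feats = generate_features(data)
assert feats["square(age)"] == [400.0, 1600.0]
assert feats["ratio(income,age)"] == [0.5, 0.5]
```

Because the operator set is fixed, such a baseline cannot exploit what a column means, which is the gap the LLM-based approaches in the abstract target.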