arXiv:2604.00022v1 Announce Type: cross Abstract: Multi-dimensional rubric-based dialogue evaluation is widely used to assess conversational AI, yet its criterion validity — whether quality scores are associated with the downstream outcomes they are meant to serve — remains largely untested. We address this gap through a two-phase study on a major Chinese matchmaking platform, testing a […]
Genetic algorithms for multi-omic feature selection: a comparative study in cancer survival analysis
arXiv:2604.00065v1 Announce Type: new Abstract: Multi-omic datasets offer opportunities for improved biomarker discovery in cancer research, but their high dimensionality and limited sample sizes make identifying compact and effective biomarker panels challenging. Feature selection in large-scale omics can be efficiently addressed by combining machine learning with genetic algorithms, which naturally support multi-objective optimization of predictive […]
GenoBERT: A Language Model for Accurate Genotype Imputation
arXiv:2604.00058v1 Announce Type: new Abstract: Genotype imputation enables dense variant coverage for genome-wide association and risk-prediction studies, yet conventional reference-panel methods remain limited by ancestry bias and reduced rare-variant accuracy. We present Genotype Bidirectional Encoder Representations from Transformers (GenoBERT), a transformer-based, reference-free framework that tokenizes phased genotypes and uses a self-attention mechanism to capture both […]
Brevity Constraints Reverse Performance Hierarchies in Language Models
arXiv:2604.00025v1 Announce Type: cross Abstract: Standard evaluation protocols reveal a counterintuitive phenomenon: on 7.7% of benchmark problems spanning five datasets, larger language models underperform smaller ones by 28.4 percentage points despite 10-100x more parameters. Through systematic evaluation of 31 models (0.5B-405B parameters) across 1,485 problems, we identify the mechanism as spontaneous scale-dependent verbosity that introduces […]
Seeing Beyond the Image: ECG and Anatomical Knowledge-Guided Myocardial Scar Segmentation from Late Gadolinium-Enhanced Images
arXiv:2511.14702v4 Announce Type: replace-cross Abstract: Accurate segmentation of myocardial scar from late gadolinium enhanced (LGE) cardiac MRI is essential for evaluating tissue viability, yet remains challenging due to variable contrast and imaging artifacts. Electrocardiogram (ECG) signals provide complementary physiological information, as conduction abnormalities can help localize or suggest scarred myocardial regions. In this work, we […]
Geometric-Photometric Event-based 3D Gaussian Ray Tracing
arXiv:2512.18640v2 Announce Type: replace-cross Abstract: Event cameras offer a high temporal resolution over traditional frame-based cameras, which makes them suitable for motion and structure estimation. However, it has been unclear how event-based 3D Gaussian Splatting (3DGS) approaches could leverage fine-grained temporal information of sparse events. This work proposes GPERT, a framework to address the trade-off […]
A Safety-Aware Role-Orchestrated Multi-Agent LLM Framework for Behavioral Health Communication Simulation
arXiv:2604.00249v1 Announce Type: new Abstract: Single-agent large language model (LLM) systems struggle to simultaneously support diverse conversational functions and maintain safety in behavioral health communication. We propose a safety-aware, role-orchestrated multi-agent LLM framework designed to simulate supportive behavioral health dialogue through coordinated, role-differentiated agents. Conversational responsibilities are decomposed across specialized agents, including empathy-focused, action-oriented, and […]
AI Agents Can Already Autonomously Perform Experimental High Energy Physics
arXiv:2603.20179v2 Announce Type: replace-cross Abstract: Large language model-based AI agents are now able to autonomously execute substantial portions of a high energy physics (HEP) analysis pipeline with minimal expert-curated input. Given access to a HEP dataset, an execution framework, and a corpus of prior experimental literature, we find that Claude Code succeeds in automating all […]
Human-in-the-Loop Control of Objective Drift in LLM-Assisted Computer Science Education
arXiv:2604.00281v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly embedded in computer science education through AI-assisted programming tools, yet such workflows often exhibit objective drift, in which locally plausible outputs diverge from stated task specifications. Existing instructional responses frequently emphasize tool-specific prompting practices, limiting durability as AI platforms evolve. This paper adopts a […]
PixelPrune: Pixel-Level Adaptive Visual Token Reduction via Predictive Coding
arXiv:2604.00886v1 Announce Type: cross Abstract: Document understanding and GUI interaction are among the highest-value applications of Vision-Language Models (VLMs), yet they impose exceptionally heavy computational burden: fine-grained text and small UI elements demand high-resolution inputs that produce tens of thousands of visual tokens. We observe that this cost is largely wasteful — across document and […]
DriftScript: A Domain-Specific Language for Programming Non-Axiomatic Reasoning Agents
arXiv:2604.00043v1 Announce Type: cross Abstract: Non-Axiomatic Reasoning Systems (NARS) provide a framework for building adaptive agents that operate under insufficient knowledge and resources. However, the standard input language, Narsese, poses a usability barrier: its dense symbolic notation, overloaded punctuation, and implicit conventions make programs difficult to read, write, and maintain. We present DriftScript, a Lisp-like […]
Improvisational Games as a Benchmark for Social Intelligence of AI Agents: The Case of Connections
arXiv:2604.00284v1 Announce Type: new Abstract: We formally introduce a improvisational wordplay game called Connections to explore reasoning capabilities of AI agents. Playing Connections combines skills in knowledge retrieval, summarization and awareness of cognitive states of other agents. We show how the game serves as a good benchmark for social intelligence abilities of language model based […]