arXiv:2507.03197v3 Announce Type: replace-cross Abstract: CD8+ “killer” T cells and CD4+ “helper” T cells play a central role in the adaptive immune system by recognizing antigens presented by Major Histocompatibility Complex (pMHC) molecules via T Cell Receptors (TCRs). Modeling binding between T cells and the pMHC complex is fundamental to understanding basic mechanisms of human […]
Generative Predictive Control: Flow Matching Policies for Dynamic and Difficult-to-Demonstrate Tasks
arXiv:2502.13406v3 Announce Type: replace-cross Abstract: Generative control policies have recently unlocked major progress in robotics. These methods produce action sequences via diffusion or flow matching, with training data provided by demonstrations. But existing methods come with two key limitations: they require expert demonstrations, which can be difficult to obtain, and they are limited to relatively […]
The Consensus Trap: Dissecting Subjectivity and the “Ground Truth” Illusion in Data Annotation
arXiv:2602.11318v2 Announce Type: replace Abstract: In machine learning, “ground truth” refers to the assumed correct labels used to train and evaluate models. However, the foundational “ground truth” paradigm rests on a positivistic fallacy that treats human disagreement as technical noise rather than a vital sociotechnical signal. This systematic literature review analyzes research published between 2020 […]
Position: Stop Anthropomorphizing Intermediate Tokens as Reasoning/Thinking Traces!
arXiv:2504.09762v3 Announce Type: replace Abstract: Intermediate token generation (ITG), where a model produces output before the solution, has become a standard method to improve the performance of language models on reasoning tasks. These intermediate tokens have been called “reasoning traces” or even “thoughts”, implicitly anthropomorphizing the traces and implying that these traces resemble steps […]
Conditionally Site-Independent Neural Evolution of Antibody Sequences
arXiv:2602.18982v2 Announce Type: replace-cross Abstract: Common deep learning approaches for antibody engineering focus on modeling the marginal distribution of sequences. By treating sequences as independent samples, however, these methods overlook affinity maturation as a rich and largely untapped source of information about the evolutionary process by which antibodies explore the underlying fitness landscape. In contrast, […]
Simulating Meaning, Nevermore! Introducing ICR: A Semiotic-Hermeneutic Metric for Evaluating Meaning in LLM Text Summaries
arXiv:2603.04413v2 Announce Type: replace-cross Abstract: Meaning in human language is relational, context dependent, and emergent, arising from dynamic systems of signs rather than fixed word-concept mappings. In computational settings, this semiotic and interpretive complexity complicates the generation and evaluation of meaning. This article proposes an interdisciplinary framework for studying meaning in large language model (LLM) […]
LA-MARRVEL: A Knowledge-Grounded, Language-Aware LLM Framework for Clinically Robust Rare Disease Gene Prioritization
arXiv:2511.02263v4 Announce Type: replace Abstract: Rare disease diagnosis requires matching variant-bearing genes to complex patient phenotypes across large and heterogeneous evidence sources. This process remains time-intensive in current clinical interpretation pipelines. To overcome these limitations, we present LA-MARRVEL, a knowledge-grounded, language-aware LLM framework designed for clinical robustness and practical deployment. LA-MARRVEL delivers a 12-15 […]
Estimation of Energy-dissipation Lower-bounds for Neuromorphic Learning-in-memory
arXiv:2402.14878v4 Announce Type: replace-cross Abstract: Neuromorphic or neurally-inspired optimizers rely on local but parallel parameter updates to solve problems that range from quadratic programming to Ising machines. An ideal realization of such an optimizer not only uses a compute-in-memory (CIM) paradigm to address the so-called memory-wall (i.e. energy dissipated due to repeated memory read access), […]
HCT-QA: A Benchmark for Question Answering on Human-Centric Tables
arXiv:2504.20047v3 Announce Type: replace-cross Abstract: Tabular data embedded in PDF files, web pages, and other types of documents is prevalent in various domains. These tables, which we call human-centric tables (HCTs for short), are dense in information but often exhibit complex structural and semantic layouts. To query these HCTs, some existing solutions focus on transforming […]
A Geometric Perspective on the Difficulties of Learning GNN-based SAT Solvers
arXiv:2508.21513v3 Announce Type: replace-cross Abstract: Graph Neural Networks (GNNs) have gathered increasing interest as learnable solvers of Boolean Satisfiability Problems (SATs), operating on graph representations of logical formulas. However, their performance degrades sharply on harder and more constrained instances, raising questions about architectural limitations. In this paper, we work towards a geometric explanation built upon […]
Exploiting Spatiotemporal Properties for Efficient Event-Driven Human Pose Estimation
arXiv:2512.06306v2 Announce Type: replace-cross Abstract: Human pose estimation focuses on predicting body keypoints to analyze human motion. Currently, most pose estimation tasks rely on conventional RGB cameras. In contrast, event cameras provide high temporal resolution and low latency, enabling robust estimation under challenging conditions and opening up new possibilities for pose estimation. However, most existing […]
DataChef: Cooking Up Optimal Data Recipes for LLM Adaptation via Reinforcement Learning
arXiv:2602.11089v2 Announce Type: replace-cross Abstract: In the current landscape of Large Language Models (LLMs), the curation of large-scale, high-quality training data is a primary driver of model performance. A key lever is the data recipe, which comprises a data processing pipeline to transform raw sources into training corpora. Despite the growing use of LLMs to […]