arXiv:2509.15651v2 Announce Type: replace-cross Abstract: Assessing the impact of training data on machine learning models is crucial for understanding model behavior, enhancing transparency, and selecting training data. The influence function provides a theoretical framework for quantifying the effect of training data points on a model’s performance given specific test data. However, the […]
Compositional Steering of Large Language Models with Steering Tokens
arXiv:2601.05062v2 Announce Type: replace-cross Abstract: Deploying LLMs in real-world applications requires controllable output that satisfies multiple desiderata at the same time. While existing work extensively addresses LLM steering for a single behavior, compositional steering (i.e., steering LLMs simultaneously towards multiple behaviors) remains an underexplored problem. In this work, we propose compositional steering tokens […]
Geometric coherence of single-cell CRISPR perturbations reveals regulatory architecture and predicts cellular stress
arXiv:2604.16642v1 Announce Type: new Abstract: Genome engineering has achieved remarkable sequence-level precision, yet predicting the transcriptomic state that a cell will occupy after perturbation remains an open problem. Single-cell CRISPR screens measure how far cells move from their unperturbed state, but this effect magnitude ignores a fundamental question: do the cells move together? Two perturbations […]
Capture Timing-Attention of Events in Clinical Time Series
arXiv:2602.10385v2 Announce Type: replace-cross Abstract: Automatically discovering personalized sequential events from large-scale time-series data is crucial for enabling precision medicine in clinical research, yet it remains a formidable challenge even for contemporary AI models. For example, while transformers capture rich associations, they are mostly agnostic to event timing and ordering, thereby bypassing potential causal reasoning. […]
ContractEval: A Benchmark for Evaluating Contract-Satisfying Assertions in Code Generation
arXiv:2510.12047v4 Announce Type: replace Abstract: Current code generation evaluation measures functional correctness on well-formed inputs that satisfy all input preconditions. This paradigm has a critical limitation: task descriptions often leave these preconditions implicit, while evaluation filters out inputs that violate them. As a result, generated code may achieve high pass@k scores while failing to enforce […]
Evaluating Privilege Usage of Agents with Real-World Tools
arXiv:2603.28166v2 Announce Type: replace-cross Abstract: Equipping LLM agents with real-world tools can substantially improve productivity. However, granting agents autonomy over tool use also transfers the associated privileges to both the agent and the underlying LLM. Improper privilege usage may lead to serious consequences, including information leakage and infrastructure damage. While several benchmarks have been built […]
MLE-Toolbox: An Open-Source Toolbox for Comprehensive EEG and MEG Data Analysis
arXiv:2604.16463v1 Announce Type: new Abstract: MLE-Toolbox is a comprehensive open-source MATLAB toolbox for end-to-end analysis of magnetoencephalography (MEG) and electroencephalography (EEG) data. Inspired by widely used neuroimaging platforms such as Brainstorm and FieldTrip, it integrates the full analysis pipeline within a unified and user-friendly graphical interface (GUI), covering raw data import, preprocessing, source localization, functional […]
Retrieval as Generation: A Unified Framework with Self-Triggered Information Planning
arXiv:2604.11407v2 Announce Type: replace-cross Abstract: We revisit retrieval-augmented generation (RAG) by embedding retrieval control directly into generation. Instead of treating retrieval as an external intervention, we express retrieval decisions within token-level decoding, enabling end-to-end coordination without additional controllers or classifiers. Under the paradigm of Retrieval as Generation, we propose GRIP (Generation-guided Retrieval with Information Planning), […]
Healthcare AI for Automation or Allocation? A Transaction Cost Economics Framework
arXiv:2604.16465v1 Announce Type: new Abstract: Healthcare productivity is shaped not only by clinical complexity but by the costs of coordinating work under uncertainty. Transaction-cost economics offers a theory of these coordination frictions, yet has rarely been operationalised at task level across health occupations. Using task statements and frequency weights from the O*NET occupational database, we […]
Towards Green Wearable Computing: A Physics-Aware Spiking Neural Network for Energy-Efficient IMU-based Human Activity Recognition
arXiv:2604.10458v2 Announce Type: replace-cross Abstract: Wearable IMU-based Human Activity Recognition (HAR) relies heavily on Deep Neural Networks (DNNs), which are burdened by immense computational and buffering demands. Their power-hungry floating-point operations and rigid requirement to process complete temporal windows severely cripple battery-constrained edge devices. While Spiking Neural Networks (SNNs) offer extreme event-driven energy efficiency, standard […]
Support Sufficiency as Consequence-Sensitive Compression in Belief Arbitration
arXiv:2604.16434v1 Announce Type: new Abstract: When a system commits to a hypothesis, much of the evidential structure behind that commitment is lost to compression. Standard accounts assume that selected content and scalar confidence suffice for downstream control. This paper argues that they do not, and that determining what must survive compression is itself a consequence-sensitive […]
Semantic Consensus: Process-Aware Conflict Detection and Resolution for Enterprise Multi-Agent LLM Systems
arXiv:2604.16339v1 Announce Type: new Abstract: Multi-agent large language model (LLM) systems are rapidly emerging as the dominant architecture for enterprise AI automation, yet production deployments exhibit failure rates between 41% and 86.7%, with nearly 79% of failures originating from specification and coordination issues rather than model capability limitations. This paper identifies Semantic Intent Divergence, the phenomenon […]