arXiv:2604.19245v2 Announce Type: replace-cross Abstract: Repair, an important resource for resolving trouble in human-human conversation, remains underexplored in human-LLM interaction. In this study, we investigate how LLMs engage in the interactive process of repair in multi-turn dialogues around solvable and unsolvable math questions. We examine whether models initiate repair themselves and how they respond to […]
Rates of forgetting for the sequentially Markov coalescent
arXiv:2604.20629v1 Announce Type: cross Abstract: The sequentially Markov coalescent (SMC) is a Markov jump process which models correlations in local genealogies across a chromosome. It has been used as a theoretical tool for studying linkage disequilibrium and identity-by-descent, and it also forms the basis of a class of statistical procedures for estimating population history and […]
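The Markov jump process the abstract describes can be illustrated with a toy simulation for a sample of size two. This is a hedged sketch under assumptions: the paper's exact construction and scaling conventions are not given in the excerpt, so the transition below is the textbook SMC move (a recombination strikes at a uniform height on the tree, and the pruned lineage re-coalesces at rate 1), with function and parameter names invented for illustration.

```python
import numpy as np

def simulate_smc(length, r, rng):
    """Toy SMC for n=2: piecewise-constant TMRCA along a chromosome of the
    given length, with per-unit-distance recombination rate r per lineage.
    Returns (breakpoints, tmrcas) in coalescent time units."""
    pos, t = 0.0, rng.exponential(1.0)   # stationary TMRCA ~ Exp(1)
    breakpoints, tmrcas = [0.0], [t]
    while True:
        # Total branch length is 2t, so the waiting distance to the next
        # recombination breakpoint is Exponential with rate 2*r*t.
        pos += rng.exponential(1.0 / (2.0 * r * t))
        if pos >= length:
            break
        u = rng.uniform(0.0, t)          # recombination at height u on one branch
        t = u + rng.exponential(1.0)     # floating lineage re-coalesces above u
        breakpoints.append(pos)
        tmrcas.append(t)
    return np.array(breakpoints), np.array(tmrcas)

rng = np.random.default_rng(0)
bp, tm = simulate_smc(length=100.0, r=1.0, rng=rng)
print(len(bp), float(tm.mean()))
```

Because the new tree height depends on the old one through `u`, adjacent genealogies are correlated, and how fast that dependence decays with distance is the "rate of forgetting" the title refers to.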
Can “AI” Be a Doctor? A Study of Empathy, Readability, and Alignment in Clinical LLMs
arXiv:2604.20791v1 Announce Type: cross Abstract: Large Language Models (LLMs) are increasingly deployed in healthcare, yet their communicative alignment with clinical standards remains insufficiently quantified. We conduct a multidimensional evaluation of general-purpose and domain-specialized LLMs across structured medical explanations and real-world physician-patient interactions, analyzing semantic fidelity, readability, and affective resonance. Baseline models amplify affective polarity relative […]
Statistics of correlations in nonlinear recurrent neural networks
arXiv:2510.21742v2 Announce Type: replace Abstract: The statistics of correlations are central quantities characterizing the collective dynamics of recurrent neural networks. We derive exact expressions for the statistics of correlations of nonlinear recurrent networks in the limit of a large number N of neurons, including systematic 1/N corrections, in the regime of Gaussian quenched disorder. Our […]
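The quantity the abstract studies, the statistics of pairwise correlations in a large random nonlinear network, can be estimated empirically with a minimal simulation. This sketch assumes a standard rate model, dx/dt = -x + J tanh(x) with i.i.d. Gaussian couplings of variance g^2/N; the paper's exact model and its systematic 1/N corrections are not reproduced here.

```python
import numpy as np

# Euler-integrate a random nonlinear rate network and collect the empirical
# distribution of off-diagonal correlation coefficients across unit pairs.
rng = np.random.default_rng(1)
N, g, dt, steps = 200, 2.0, 0.05, 4000
J = rng.normal(0.0, g / np.sqrt(N), size=(N, N))   # quenched Gaussian disorder
x = rng.normal(size=N)
for _ in range(2000):                               # burn-in transient
    x += dt * (-x + J @ np.tanh(x))
traj = np.empty((steps, N))
for t in range(steps):
    x += dt * (-x + J @ np.tanh(x))
    traj[t] = x
C = np.corrcoef(traj.T)                             # N x N correlation matrix
offdiag = C[~np.eye(N, dtype=bool)]
print(float(offdiag.mean()), float(offdiag.std()))
```

In the large-N limit the typical off-diagonal correlations are small; exact expressions for their statistics, including the 1/N corrections, are what the paper derives analytically.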
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration
arXiv:2604.14116v2 Announce Type: replace Abstract: While Large Language Models (LLMs) have empowered AI research agents to perform isolated scientific tasks, automating complex, real-world workflows, such as LLM training, remains a significant challenge. In this paper, we introduce TREX, a multi-agent system that automates the entire LLM training life-cycle. By orchestrating collaboration between two core modules: the […]
Not All Rollouts are Useful: Down-Sampling Rollouts in LLM Reinforcement Learning
arXiv:2504.13818v5 Announce Type: replace-cross Abstract: Reinforcement learning with verifiable rewards (RLVR) has emerged as the leading approach for enhancing reasoning capabilities in large language models. However, it faces a fundamental compute and memory asymmetry: rollout generation is embarrassingly parallel and memory-light, whereas policy updates are communication-heavy and memory-intensive. To address this, we introduce PODS (Policy […]
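The asymmetry the abstract identifies, cheap parallel rollouts versus expensive policy updates, suggests generating many rollouts and updating on only a few. The selection rule below is an assumption for illustration (the excerpt does not state PODS's criterion): keep the highest- and lowest-reward rollouts, preserving the reward contrast the policy gradient needs.

```python
import numpy as np

def downsample_rollouts(rewards, keep):
    """Return indices of `keep` rollouts: half with the lowest verifiable
    rewards, half with the highest, discarding the uninformative middle."""
    order = np.argsort(rewards)
    k_lo, k_hi = keep // 2, keep - keep // 2
    return np.concatenate([order[:k_lo], order[-k_hi:]])

rewards = np.array([0.1, 0.9, 0.2, 0.8, 0.5, 0.05, 0.95, 0.4])
idx = downsample_rollouts(rewards, keep=4)
print(sorted(idx.tolist()))   # → [0, 1, 5, 6]
```

Only the selected rollouts enter the communication-heavy update step, so the generation stage can scale out independently of update memory.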
Transformers Can Learn Connectivity in Some Graphs but Not Others
arXiv:2509.22343v2 Announce Type: replace-cross Abstract: Reasoning capability is essential to ensure the factual correctness of the responses of transformer-based Large Language Models (LLMs), and robust reasoning about transitive relations is instrumental in many settings, such as causal inference. Hence, it is essential to investigate the capability of transformers in the task of inferring transitive relations […]
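Probing whether a model infers transitive relations requires ground-truth connectivity labels, which come from the transitive closure of a graph. A minimal labeling sketch follows; the graph family and node encoding are illustrative assumptions, not the paper's benchmark construction.

```python
# Floyd-Warshall-style boolean transitive closure over an edge list, used to
# label (source, target, reachable?) pairs for a connectivity task.
def transitive_closure(n, edges):
    reach = [[i == j for j in range(n)] for i in range(n)]
    for u, v in edges:
        reach[u][v] = True
    for k in range(n):
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])
    return reach

# A directed path 0 -> 1 -> 2 -> 3: multi-hop pairs like (0, 3) test whether
# a model composes transitive relations rather than memorizing edges.
reach = transitive_closure(4, [(0, 1), (1, 2), (2, 3)])
pairs = [(i, j, reach[i][j]) for i in range(4) for j in range(4) if i != j]
print(pairs[:3])
```

Training on edges and evaluating on multi-hop pairs separates memorized adjacency from genuinely learned connectivity.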
LEAD: Breaking the No-Recovery Bottleneck in Long-Horizon Reasoning
arXiv:2603.06870v2 Announce Type: replace Abstract: Long-horizon execution in Large Language Models (LLMs) remains unstable even when high-level strategies are provided. Evaluating on controlled algorithmic puzzles, we demonstrate that while decomposition is essential for stability, extreme decomposition creates a “no-recovery bottleneck”. We show that this bottleneck becomes critical due to highly non-uniform error distribution, where consistent […]
Querying Inconsistent Prioritized Data with ORBITS: Algorithms, Implementation, and Experiments
arXiv:2202.07980v4 Announce Type: replace-cross Abstract: We investigate practical algorithms for inconsistency-tolerant query answering over prioritized knowledge bases, which consist of a logical theory, a set of facts, and a priority relation between conflicting facts. We consider three well-known semantics (AR, IAR and brave) based upon two notions of optimal repairs (Pareto and completion). Deciding whether […]
FLOSS: Federated Learning with Opt-Out and Straggler Support
arXiv:2507.23115v2 Announce Type: replace-cross Abstract: Previous work on data privacy in federated learning systems focuses on privacy-preserving operations for data from users who have agreed to share their data for training. However, modern data privacy agreements also empower users to use the system while opting out of sharing their data as desired. When combined with […]
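The two mechanisms in the title, opt-out and straggler support, both reduce to excluding certain clients from aggregation while letting them keep using the model. The round sketch below is an assumption for illustration (field names and the plain-averaging rule are invented, not FLOSS's actual design).

```python
import numpy as np

def aggregate(global_w, client_updates):
    """One federated-averaging round: average updates only from clients who
    share data and arrived before the deadline; opted-out users and
    stragglers contribute nothing but still receive the new model."""
    usable = [u["delta"] for u in client_updates
              if not u["opted_out"] and u["arrived_by_deadline"]]
    if not usable:                 # nobody participated: keep old weights
        return global_w
    return global_w + np.mean(usable, axis=0)

w = np.zeros(3)
updates = [
    {"delta": np.array([1.0, 0.0, 0.0]), "opted_out": False, "arrived_by_deadline": True},
    {"delta": np.array([0.0, 2.0, 0.0]), "opted_out": True,  "arrived_by_deadline": True},
    {"delta": np.array([0.0, 0.0, 3.0]), "opted_out": False, "arrived_by_deadline": False},
]
print(aggregate(w, updates))   # only the first client's update counts
```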
AutoGraphAD: Unsupervised network anomaly detection using Variational Graph Autoencoders
arXiv:2511.17113v2 Announce Type: replace-cross Abstract: Network Intrusion Detection Systems (NIDS) are essential tools for detecting network attacks and intrusions. While extensive research has explored the use of supervised Machine Learning for attack detection and characterisation, these methods require accurately labelled datasets, which are very costly to obtain. Moreover, existing public datasets have limited and/or outdated […]
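Unsupervised autoencoder-based detection typically ends with a reconstruction-error score: nodes (or flows) the model reconstructs poorly are flagged as anomalous. The sketch below stands in a truncated SVD for the learned VGAE decoder, since the excerpt does not detail AutoGraphAD's architecture; only the scoring-and-thresholding pattern is the point.

```python
import numpy as np

def anomaly_scores(A, rank=2):
    """Score each node by the error of a low-rank 'reconstruction' of its
    adjacency row -- a stand-in for a trained graph autoencoder's decoder."""
    U, s, Vt = np.linalg.svd(A)
    A_hat = U[:, :rank] * s[:rank] @ Vt[:rank]
    return np.linalg.norm(A - A_hat, axis=1)

rng = np.random.default_rng(0)
A = (rng.random((20, 20)) < 0.1).astype(float)   # sparse "normal" traffic graph
A[0, :] = 1.0                                    # node 0 behaves anomalously
A = np.maximum(A, A.T)                           # symmetrize
scores = anomaly_scores(A)
flags = scores > np.percentile(scores, 90)       # flag the worst-reconstructed nodes
print(int(flags.sum()))
```

No labels enter the pipeline at any point, which is the property motivating the unsupervised approach over costly labelled NIDS datasets.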
An Adaptive Horizon-Aware Model Selection Framework for Demand Forecasting under Horizon-Induced Degradation
arXiv:2602.13939v5 Announce Type: replace-cross Abstract: Business environments characterized by intermittent demand, high variability, and multi-step planning require model selection procedures aligned with future operational horizons rather than static test-horizon evaluation. Because no forecasting model is universally dominant, and rankings vary across metrics, demand structures, and forecast horizons, assigning an appropriate model to each series remains […]