arXiv:2604.10696v1 Announce Type: new Abstract: We present Camyla, a system for fully autonomous research within the scientific domain of medical image segmentation. Camyla transforms raw datasets into literature-grounded research proposals, executable experiments, and complete manuscripts without human intervention. Autonomous experimentation over long horizons poses three interrelated challenges: search effort drifts toward unpromising directions, knowledge from […]
A Benchmark for Gap and Overlap Analysis as a Test of KG Task Readiness
arXiv:2604.10853v1 Announce Type: new Abstract: Task-oriented evaluation of knowledge graph (KG) quality increasingly asks whether an ontology-based representation can answer the competency questions that users actually care about, in a manner that is reproducible, explainable, and traceable to evidence. This paper adopts that perspective and focuses on gap and overlap analysis for policy-like documents (e.g., […]
A molecular clock for writing systems reveals the quantitative impact of imperial power on cultural evolution
arXiv:2604.10957v1 Announce Type: new Abstract: Writing systems are cultural replicators whose evolution has never been studied quantitatively at global scale. We compile the Global Script Database (GSD): 300 writing and notation systems, 50 binary structural characters, and 259 phylogenetic edges spanning 5,400 years. Applying four methods — phenetics, cladistics, Bayesian inference, and neural network clustering […]
Seven simple steps for log analysis in AI systems
arXiv:2604.09563v1 Announce Type: new Abstract: AI systems produce large volumes of logs as they interact with tools and users. Analysing these logs can help understand model capabilities, propensities, and behaviours, or assess whether an evaluation worked as intended. Researchers have started developing methods for log analysis, but a standardised approach is still missing. Here we […]
How complex behavioural contagion can prevent infectious diseases from becoming endemic
arXiv:2604.10995v1 Announce Type: new Abstract: Infectious disease transmission in human populations has a complex two-way interaction with changes in host behaviour. It is increasingly recognised that incorporating adaptive behavioural change into epidemic models is important for improving understanding of infectious disease dynamics and developing policy-relevant modelling tools. An important aspect of behavioural dynamics is social […]
Solving Physics Olympiad via Reinforcement Learning on Physics Simulators
arXiv:2604.11805v1 Announce Type: cross Abstract: We have witnessed remarkable advances in LLM reasoning capabilities with the advent of DeepSeek-R1. However, much of this progress has been fueled by the abundance of internet question-answer (QA) pairs, a major bottleneck going forward, since such data is limited in scale and concentrated mainly in domains like mathematics. In […]
AI Integrity: A New Paradigm for Verifiable AI Governance
arXiv:2604.11065v1 Announce Type: new Abstract: AI systems increasingly shape high-stakes decisions in healthcare, law, defense, and education, yet existing governance paradigms — AI Ethics, AI Safety, and AI Alignment — share a common limitation: they evaluate outcomes rather than verifying the reasoning process itself. This paper introduces AI Integrity, a concept defined as a state […]
Factorizing formal contexts from closures of necessity operators
arXiv:2604.09582v1 Announce Type: new Abstract: Factorizing datasets is an interesting process in a multitude of approaches, but computing a factorization of a dataset is often impossible or inefficient. A method to obtain independent subcontexts of a formal context with Boolean data was proposed in \cite{dubois:2012}, based on the operators used in […]
MADQRL: Distributed Quantum Reinforcement Learning Framework for Multi-Agent Environments
arXiv:2604.11131v1 Announce Type: new Abstract: Reinforcement learning (RL) is one of the most practical ways to learn from real-life use-cases. Being motivated by the cognitive methods used by humans makes it a widely accepted strategy in the field of artificial intelligence. Most of the environments used for RL are high-dimensional, and traditional RL algorithms become […]
Diagnosing Retrieval vs. Utilization Bottlenecks in LLM Agent Memory
arXiv:2603.02473v2 Announce Type: replace Abstract: Memory-augmented LLM agents store and retrieve information from prior interactions, yet the relative importance of how memories are written versus how they are retrieved remains unclear. We introduce a diagnostic framework that analyzes how performance differences manifest across write strategies, retrieval methods, and memory utilization behavior, and apply it to […]
Consistency of AI-Generated Exercise Prescriptions: A Repeated Generation Study Using a Large Language Model
arXiv:2604.11287v1 Announce Type: new Abstract: Background: Large language models (LLMs) have been explored as tools for generating personalized exercise prescriptions, yet the consistency of outputs under identical conditions remains insufficiently examined. Objective: This study evaluated the intra-model consistency of LLM-generated exercise prescriptions using a repeated generation design. Methods: Six clinical scenarios were used to generate […]
Agentic Exploration of PDE Spaces using Latent Foundation Models for Parameterized Simulations
arXiv:2604.09584v1 Announce Type: new Abstract: Flow physics, and more broadly physical phenomena governed by partial differential equations (PDEs), are inherently continuous, high-dimensional, and often chaotic in nature. Traditionally, researchers have explored these rich spatiotemporal PDE solution spaces using laboratory experiments and/or computationally expensive numerical simulations. This severely limits automated and large-scale exploration, unlike domains such […]