arXiv:2507.20205v5 Announce Type: replace Abstract: Accurately characterizing higher-order interactions of brain regions and extracting interpretable organizational patterns from Functional Magnetic Resonance Imaging data is crucial for brain disease diagnosis. Current graph-based deep learning models primarily focus on pairwise or triadic patterns while neglecting signed higher-order interactions, limiting comprehensive understanding of brain-wide communication. We propose HOI-Brain, […]
Context Engineering: From Prompts to Corporate Multi-Agent Architecture
arXiv:2603.09619v2 Announce Type: replace Abstract: As artificial intelligence (AI) systems evolve from stateless chatbots to autonomous multi-step agents, prompt engineering (PE), the discipline of crafting individual queries, proves necessary but insufficient. This paper introduces context engineering (CE) as a standalone discipline concerned with designing, structuring, and managing the entire informational environment in which an AI […]
Guided Policy Optimization under Partial Observability
arXiv:2505.15418v2 Announce Type: replace-cross Abstract: Reinforcement Learning (RL) in partially observable environments poses significant challenges due to the complexity of learning under uncertainty. While additional information, such as that available in simulations, can enhance training, effectively leveraging it remains an open problem. To address this, we introduce Guided Policy Optimization (GPO), a framework that co-trains […]
On Deepfake Voice Detection — It’s All in the Presentation
arXiv:2509.26471v2 Announce Type: replace-cross Abstract: While the technologies empowering malicious audio deepfakes have dramatically evolved in recent years due to generative AI advances, the same cannot be said of global research into spoofing (deepfake) countermeasures. This paper highlights how current deepfake datasets and research methodologies led to systems that failed to generalize to real world […]
Do LLMs have a Gender (Entropy) Bias?
arXiv:2505.20343v2 Announce Type: replace-cross Abstract: We investigate the existence and persistence of a specific type of gender bias in some of the popular LLMs and contribute a new benchmark dataset, RealWorldQuestioning (released on HuggingFace), developed from real-world questions across four key domains in business and health contexts: education, jobs, personal financial management, and general […]
Epistemic diversity across language models mitigates knowledge collapse
arXiv:2512.15011v2 Announce Type: replace-cross Abstract: As artificial intelligence (AI) becomes more widely used, concerns are growing that model collapse could lead to knowledge collapse, i.e. a degradation to a narrow and inaccurate set of ideas. Prior work has demonstrated single-model collapse, defined as performance decay in an AI model trained on its own outputs. Inspired […]
RooflineBench: A Benchmarking Framework for On-Device LLMs via Roofline Analysis
arXiv:2602.11506v3 Announce Type: replace-cross Abstract: The transition toward localized intelligence through Small Language Models (SLMs) has intensified the need for rigorous performance characterization on resource-constrained edge hardware. However, objectively measuring the theoretical performance ceilings of diverse architectures across heterogeneous platforms remains a formidable challenge. In this work, we propose a systematic framework based on the […]
Biology and Physics
arXiv:2603.11234v2 Announce Type: replace-cross Abstract: This article frames the relation between biology and physics by characterizing the former as a subdiscipline rather than a special case of the latter. To do this, we posit biological physics as the science of living matter in contrast to classic biophysics, the study of organismal properties by physical techniques. […]
GeoChemAD: Benchmarking Unsupervised Geochemical Anomaly Detection for Mineral Exploration
arXiv:2603.13068v1 Announce Type: cross Abstract: Geochemical anomaly detection plays a critical role in mineral exploration as deviations from regional geochemical baselines may indicate mineralization. Existing studies suffer from two key limitations: (1) single-region scenarios, which limit model generalizability; (2) proprietary datasets, which make result reproduction unattainable. In this work, we introduce GeoChemAD, an open-source […]
Learnability and Privacy Vulnerability are Entangled in a Few Critical Weights
arXiv:2603.13186v1 Announce Type: cross Abstract: Prior approaches for membership privacy preservation usually update or retrain all weights in neural networks, which is costly and can lead to unnecessary utility loss or even more serious misalignment in predictions between training data and non-training data. In this work, we observed three insights: i) privacy vulnerability exists in […]
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents
arXiv:2505.22954v3 Announce Type: replace Abstract: Today’s AI systems have human-designed, fixed architectures and cannot autonomously and continuously improve themselves. The advance of AI could itself be automated. If done safely, that would accelerate AI development and allow us to reap its benefits much sooner. Meta-learning can automate the discovery of novel algorithms, but is limited […]
AutoClimDS: Climate Data Science Agentic AI — A Knowledge Graph is All You Need
arXiv:2509.21553v2 Announce Type: replace Abstract: Climate data science remains constrained by fragmented data sources, heterogeneous formats, and steep technical expertise requirements. These barriers slow discovery, limit participation, and undermine reproducibility. We present AutoClimDS, a Minimum Viable Product (MVP) Agentic AI system that addresses these challenges by integrating a curated climate knowledge graph (KG) with a […]