arXiv:2510.27063v1 Announce Type: cross Abstract: Given two algorithms for the same problem, can we determine whether they are meaningfully different? In full generality, the question is uncomputable, and empirically it is muddied by competing notions of similarity. Yet, in many applications (such as clone detection or program synthesis) a pragmatic and consistent similarity metric is […]
How Do Proteins Fold?
arXiv:2510.27074v1 Announce Type: new Abstract: How proteins fold remains a central unsolved problem in biology. Although the idea of a folding code embedded in the amino acid sequence was introduced more than six decades ago, this code remains undefined. While we now have powerful tools to predict the final native structure of proteins, we […]
Expressive Range Characterization of Open Text-to-Audio Models
arXiv:2510.27102v1 Announce Type: cross Abstract: Text-to-audio models are a type of generative model that produces audio output in response to a given textual prompt. Although level generators and the properties of the functional content that they create (e.g., playability) dominate most discourse in procedurally generated content (PCG), games that emotionally resonate with players tend to […]
More of the Same: Persistent Representational Harms Under Increased Representation
arXiv:2503.00333v3 Announce Type: replace-cross Abstract: To recognize and mitigate the harms of generative AI systems, it is crucial to consider whether and how different societal groups are represented by these systems. A critical gap emerges when naively measuring or improving who is represented, as this does not consider how people are represented. In this work, […]
MARIA: A Framework for Marginal Risk Assessment without Ground Truth in AI Systems
arXiv:2510.27163v1 Announce Type: cross Abstract: Before deploying an AI system to replace an existing process, it must be compared with the incumbent to ensure improvement without added risk. Traditional evaluation relies on ground truth for both systems, but this is often unavailable due to delayed or unknowable outcomes, high costs, or incomplete data, especially for […]
CombiGraph-Vis: A Curated Multimodal Olympiad Benchmark for Discrete Mathematical Reasoning
arXiv:2510.27094v1 Announce Type: new Abstract: State-of-the-art (SOTA) LLMs have progressed from struggling on proof-based Olympiad problems to solving most of the IMO 2025 problems, with leading systems reportedly handling 5 of 6 problems. Given this progress, we assess how well these models can grade proofs: detecting errors, judging their severity, and assigning fair scores beyond […]
FMint-SDE: A Multimodal Foundation Model for Accelerating Numerical Simulation of SDEs via Error Correction
arXiv:2510.27173v1 Announce Type: cross Abstract: Fast and accurate simulation of dynamical systems is a fundamental challenge across scientific and engineering domains. Traditional numerical integrators often face a trade-off between accuracy and computational efficiency, while existing neural network-based approaches typically require training a separate model for each case. To overcome these limitations, we introduce a novel […]
GAIA: A Foundation Model for Operational Atmospheric Dynamics
arXiv:2505.18179v2 Announce Type: replace-cross Abstract: We introduce GAIA (Geospatial Artificial Intelligence for Atmospheres), a hybrid self-supervised geospatial foundation model that fuses Masked Autoencoders (MAE) with self-distillation with no labels (DINO) to generate semantically rich representations from global geostationary satellite imagery. Pre-trained on 15 years of globally merged infrared observations (2001–2015), GAIA learns disentangled representations that capture […]
Vectorized Online POMDP Planning
arXiv:2510.27191v1 Announce Type: cross Abstract: Planning under partial observability is an essential capability of autonomous robots. The Partially Observable Markov Decision Process (POMDP) provides a powerful framework for such planning problems, capturing the stochastic effects of actions and the limited information available through noisy observations. POMDP solving could benefit tremendously from massive parallelization […]
Glia: A Human-Inspired AI for Automated Systems Design and Optimization
arXiv:2510.27176v1 Announce Type: new Abstract: Can an AI autonomously design mechanisms for computer systems on par with the creativity and reasoning of human experts? We present Glia, an AI architecture for networked systems design that uses large language models (LLMs) in a human-inspired, multi-agent workflow. Each agent specializes in reasoning, experimentation, and analysis, collaborating through […]