arXiv:2511.02094v1 Announce Type: new Abstract: When designing reinforcement learning (RL) agents, a designer communicates the desired agent behavior through the definition of reward functions – numerical feedback given to the agent as reward or punishment for its actions. However, mapping desired behaviors to reward functions can be a difficult process, especially in complex environments such […]
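The abstract describes reward functions as numerical feedback encoding desired behavior. A minimal illustrative sketch (not from the paper; the grid-world, `GOAL`, and `PIT` are assumptions for illustration) shows how a designer might hand-map behaviors to rewards, and why this gets difficult as environments grow complex:

```python
# Hypothetical hand-written reward function for a toy grid-world.
# GOAL, PIT, and the specific magnitudes are illustrative choices only.
GOAL = (4, 4)   # cell the agent should reach
PIT = (2, 2)    # cell the agent should avoid

def reward(state, action, next_state):
    """Map a transition to a scalar reward or punishment."""
    if next_state == GOAL:
        return 10.0    # reward for reaching the goal
    if next_state == PIT:
        return -10.0   # punishment for entering the pit
    return -0.1        # small step cost encouraging short paths
```

Even in this tiny example, the relative magnitudes (10.0 vs. -0.1) implicitly trade off safety against speed; in complex environments such tuning quickly becomes unwieldy, which motivates the paper's line of work.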
TRACE: Textual Reasoning for Affordance Coordinate Extraction
arXiv:2511.01999v1 Announce Type: cross Abstract: Vision-Language Models (VLMs) struggle to translate high-level instructions into the precise spatial affordances required for robotic manipulation. While visual Chain-of-Thought (CoT) methods exist, they are often computationally intensive. In this work, we introduce TRACE (Textual Reasoning for Affordance Coordinate Extraction), a novel methodology that integrates a textual Chain of Reasoning […]
How well do LLMs reason over tabular data, really?
arXiv:2505.07453v3 Announce Type: replace Abstract: Large Language Models (LLMs) excel in natural language tasks, but less is known about their reasoning capabilities over tabular data. Prior analyses devise evaluation strategies that poorly reflect an LLM’s realistic performance on tabular queries. Moreover, we have a limited understanding of the robustness of LLMs towards realistic variations in […]
RobustFSM: Submodular Maximization in Federated Setting with Malicious Clients
arXiv:2511.02029v1 Announce Type: cross Abstract: Submodular maximization is an optimization problem benefiting many machine learning applications, where we seek a small subset best representing an extremely large dataset. We focus on the federated setting where the data are locally owned by decentralized clients who have their own definitions for the quality of representability. This setting […]
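For readers unfamiliar with the problem class, a short sketch of the classic greedy algorithm for monotone submodular maximization (shown here on a set-cover utility) may help; this is standard background, not the paper's RobustFSM method. In a federated variant, each client would evaluate marginal gains on its own local data:

```python
def greedy_max_cover(sets, k):
    """Greedily pick k sets maximizing the number of covered elements.

    Exploits submodularity: the marginal gain of a set only shrinks
    as the covered universe grows, so greedy gives a (1 - 1/e) bound.
    """
    covered = set()
    chosen = []
    for _ in range(k):
        # pick the set with the largest marginal coverage gain
        best = max(range(len(sets)), key=lambda i: len(sets[i] - covered))
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered
```

A malicious client, in the paper's setting, could misreport these marginal gains, which is the failure mode RobustFSM is designed to tolerate.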
Deep Value Benchmark: Measuring Whether Models Generalize Deep Values or Shallow Preferences
arXiv:2511.02109v1 Announce Type: new Abstract: We introduce the Deep Value Benchmark (DVB), an evaluation framework that directly tests whether large language models (LLMs) learn fundamental human values or merely surface-level preferences. This distinction is critical for AI alignment: Systems that capture deeper values are likely to generalize human intentions robustly, while those that capture only […]
Text-VQA Aug: Pipelined Harnessing of Large Multimodal Models for Automated Synthesis
arXiv:2511.02046v1 Announce Type: cross Abstract: Creation of large-scale databases for Visual Question Answering tasks pertaining to the text data in a scene (text-VQA) involves skilful human annotation, which is tedious and challenging. With the advent of foundation models that handle vision and language modalities, and with the maturity of OCR systems, it is the need […]
AGNES: Adaptive Graph Neural Network and Dynamic Programming Hybrid Framework for Real-Time Nanopore Seed Chaining
arXiv:2510.16013v3 Announce Type: replace Abstract: Nanopore sequencing enables real-time long-read DNA sequencing with reads exceeding 10 kilobases, but inherent error rates of 12-15 percent present significant computational challenges for read alignment. The critical seed chaining step must connect exact k-mer matches between reads and reference genomes while filtering spurious matches, yet state-of-the-art methods rely on […]
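The seed chaining step mentioned above has a well-known dynamic-programming formulation, which the paper hybridizes with a graph neural network. A minimal sketch of the classic recurrence (not AGNES itself; the anchor representation is a simplifying assumption) makes the computational bottleneck concrete:

```python
def chain_seeds(anchors):
    """Best co-linear chain score over (read_pos, ref_pos, length) anchors.

    dp[i] = length[i] + max over compatible predecessors j of dp[j],
    where compatibility means anchor i starts after anchor j ends on
    both the read and the reference (co-linearity).
    """
    anchors = sorted(anchors)             # order by read position
    dp = [a[2] for a in anchors]          # each anchor can start a chain
    for i in range(len(anchors)):
        for j in range(i):
            rj, fj, lj = anchors[j]
            ri, fi, _ = anchors[i]
            if rj + lj <= ri and fj + lj <= fi:   # co-linear extension
                dp[i] = max(dp[i], dp[j] + anchors[i][2])
    return max(dp, default=0)
```

The quadratic pairwise loop is what makes exact chaining costly at nanopore error rates and read lengths, motivating learned heuristics for real-time use.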
Natural Building Blocks for Structured World Models: Theory, Evidence, and Scaling
arXiv:2511.02091v1 Announce Type: cross Abstract: The field of world modeling is fragmented, with researchers developing bespoke architectures that rarely build upon each other. We propose a framework that specifies the natural building blocks for structured world models based on the fundamental stochastic processes that any world model must capture: discrete processes (logic, symbols) and continuous […]
InsurAgent: A Large Language Model-Empowered Agent for Simulating Individual Behavior in Purchasing Flood Insurance
arXiv:2511.02119v1 Announce Type: new Abstract: Flood insurance is an effective strategy for individuals to mitigate disaster-related losses. However, participation rates among at-risk populations in the United States remain strikingly low. This gap underscores the need to understand and model the behavioral mechanisms underlying insurance decisions. Large language models (LLMs) have recently exhibited human-like intelligence across […]
Matrix Sensing with Kernel Optimal Loss: Robustness and Optimization Landscape
arXiv:2511.02122v1 Announce Type: cross Abstract: In this paper we study how the choice of loss functions of non-convex optimization problems affects their robustness and optimization landscape, through the study of noisy matrix sensing. In traditional regression tasks, mean squared error (MSE) loss is a common choice, but it can be unreliable for non-Gaussian or heavy-tailed […]
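To make the contrast with MSE concrete, here is an illustrative comparison between the squared loss and a bounded Gaussian-kernel (correntropy/Welsch-style) loss on residuals. This is a generic robust-loss sketch, not necessarily the paper's exact kernel optimal loss, and `sigma` is an assumed bandwidth parameter:

```python
import math

def mse_loss(residual):
    """Standard squared loss: grows without bound in the residual."""
    return residual ** 2

def kernel_loss(residual, sigma=1.0):
    """Welsch-type kernel loss: bounded above by 1.

    Large (heavy-tailed) residuals saturate instead of dominating
    the objective, which is the source of its robustness.
    """
    return 1.0 - math.exp(-(residual ** 2) / (2 * sigma ** 2))
```

Under MSE a single outlier with residual 100 contributes 10,000 to the objective, while the bounded kernel loss caps its contribution near 1, so noisy matrix-sensing measurements cannot dominate the fit.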