arXiv:2510.19429v1 Announce Type: new Abstract: We address the challenge of adopting language models (LMs) for embodied tasks in dynamic environments, where online access to large-scale inference engines or symbolic planners is constrained due to latency, connectivity, and resource limitations. To this end, we present NeSyPr, a novel embodied reasoning framework that compiles knowledge via neurosymbolic […]
AgenticMath: Enhancing LLM Reasoning via Agentic-based Math Data Generation
arXiv:2510.19361v1 Announce Type: cross Abstract: The creation of high-quality datasets to improve Large Language Model (LLM) reasoning remains a significant challenge, as current methods often suffer from generating low-quality/incorrect answers and limited information richness from available data sources. To address this, we propose AgenticMath, a novel agentic pipeline for generating high-quality mathematical question-answer pairs to […]
NAACL2025 Tutorial: Adaptation of Large Language Models
arXiv:2504.03931v3 Announce Type: replace-cross Abstract: This tutorial on adaptation of LLMs is designed to address the growing demand for models that go beyond the static capabilities of generic LLMs by providing an overview of dynamic, domain-specific, and task-adaptive LLM adaptation techniques. While general LLMs have demonstrated strong generalization across a variety of tasks, they often […]
EchoFake: A Replay-Aware Dataset for Practical Speech Deepfake Detection
arXiv:2510.19414v1 Announce Type: cross Abstract: The growing prevalence of speech deepfakes has raised serious concerns, particularly in real-world scenarios such as telephone fraud and identity theft. While many anti-spoofing systems have demonstrated promising performance on lab-generated synthetic speech, they often fail when confronted with physical replay attacks, a common and low-cost form of attack used in […]
KnowMol: Advancing Molecular Large Language Models with Multi-Level Chemical Knowledge
arXiv:2510.19484v1 Announce Type: new Abstract: The molecular large language models have garnered widespread attention due to their promising potential on molecular applications. However, current molecular large language models face significant limitations in understanding molecules due to inadequate textual descriptions and suboptimal molecular representation strategies during pretraining. To address these challenges, we introduce KnowMol-100K, a large-scale […]
Universal Quantitative Abstraction: Categorical Duality and Logical Completeness for Probabilistic Systems
arXiv:2510.19444v1 Announce Type: cross Abstract: A unified theory of quantitative abstraction is presented for probabilistic systems that links category theory, optimal transport, and quantitative modal logic. At its core is a canonical $\varepsilon$-quotient endowed with a universal property: among all $\varepsilon$-abstractions, it is the most informative one that respects a prescribed […]
Horizon Reduction Makes RL Scalable
arXiv:2506.04168v3 Announce Type: replace-cross Abstract: In this work, we study the scalability of offline reinforcement learning (RL) algorithms. In principle, a truly scalable offline RL algorithm should be able to solve any given problem, regardless of its complexity, given sufficient data, compute, and model capacity. We investigate if and how current offline RL algorithms match […]
Graph Unlearning Meets Influence-aware Negative Preference Optimization
arXiv:2510.19479v1 Announce Type: cross Abstract: Recent advancements in graph unlearning models have enhanced model utility by preserving the node representation essentially invariant, while using gradient ascent on the forget set to achieve unlearning. However, this approach causes a drastic degradation in model utility during the unlearning process due to the rapid divergence speed of gradient […]
Interactive visualization of kidney micro-compartmental segmentations and associated pathomics on whole slide images
arXiv:2510.19499v1 Announce Type: new Abstract: Application of machine learning techniques enables segmentation of functional tissue units in histology whole-slide images (WSIs). We built a pipeline to apply previously validated segmentation models of kidney structures and extract quantitative features from these structures. Such quantitative analysis also requires qualitative inspection of results for quality control, exploration, and […]
Modeling realistic human behavior using generative agents in a multimodal transport system: Software architecture and Application to Toulouse
arXiv:2510.19497v1 Announce Type: cross Abstract: Modeling realistic human behaviour to understand people’s mode choices in order to propose personalised mobility solutions remains challenging. This paper presents an architecture for modeling realistic human mobility behavior in complex multimodal transport systems, demonstrated through a case study in Toulouse, France. We apply Large Language Models (LLMs) within an […]