arXiv:2512.20204v1 Announce Type: cross Abstract: Speech processing and translation technology have the potential to facilitate meetings of individuals who do not share any common language. To evaluate automatic systems for such a task, a versatile and realistic evaluation corpus is needed. Therefore, we create and present a corpus of cross-lingual dialogues between individuals without a […]
SlideTailor: Personalized Presentation Slide Generation for Scientific Papers
arXiv:2512.20292v1 Announce Type: cross Abstract: Automatic presentation slide generation can greatly streamline content creation. However, since preferences of each user may vary, existing under-specified formulations often lead to suboptimal results that fail to align with individual user needs. We introduce a novel task that conditions paper-to-slides generation on user-specified preferences. We propose a human behavior-inspired […]
TableGPT-R1: Advancing Tabular Reasoning Through Reinforcement Learning
arXiv:2512.20312v1 Announce Type: cross Abstract: Tabular data serves as the backbone of modern data analysis and scientific research. While Large Language Models (LLMs) fine-tuned via Supervised Fine-Tuning (SFT) have significantly improved natural language interaction with such structured data, they often fall short in handling the complex, multi-step reasoning and robust code execution required for real-world […]
Similarity Field Theory: A Mathematical Framework for Intelligence
arXiv:2509.18218v4 Announce Type: replace Abstract: We posit that persisting and transforming similarity relations form the structural basis of any comprehensible dynamic system. This paper introduces Similarity Field Theory, a mathematical framework that formalizes the principles governing similarity values among entities and their evolution. We define: (1) a similarity field $S: U \times U \to [0,1]$ […]
Enhancing Zero-Shot Time Series Forecasting in Off-the-Shelf LLMs via Noise Injection
arXiv:2512.20140v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated effectiveness as zero-shot time series (TS) forecasters. The key challenge lies in tokenizing TS data into textual representations that align with LLMs’ pre-trained knowledge. While existing work often relies on fine-tuning specialized modules to bridge this gap, a distinct, yet challenging, paradigm aims to […]
Modeling Non-Ergodic Path Effects Using Conditional Generative Model for Fourier Amplitude Spectra
arXiv:2512.19909v1 Announce Type: cross Abstract: Recent developments in non-ergodic ground-motion models (GMMs) explicitly model systematic spatial variations in source, site, and path effects, reducing standard deviation to 30-40% of ergodic models and enabling more accurate site-specific seismic hazard analysis. Current non-ergodic GMMs rely on Gaussian Process (GP) methods with prescribed correlation functions and thus have […]
A Bidirectional Gated Recurrent Unit Model for PUE Prediction in Data Centers
arXiv:2512.20161v1 Announce Type: new Abstract: Data centers account for a significant share of global energy consumption and carbon footprint. Rising demand for edge computing and advances in AI are driving the growth of data center storage capacity. Energy efficiency is a cost-effective way to combat climate change, cut energy costs, improve business competitiveness, and promote IT and […]
Generative Retrieval with Few-shot Indexing
arXiv:2408.02152v3 Announce Type: replace-cross Abstract: Existing generative retrieval (GR) methods rely on training-based indexing, which fine-tunes a model to memorise associations between queries and the document identifiers (docids) of relevant documents. Training-based indexing suffers from high training costs, under-utilisation of pre-trained knowledge in large language models (LLMs), and limited adaptability to dynamic document corpora. To […]
Concept Generalization in Humans and Large Language Models: Insights from the Number Game
arXiv:2512.20162v1 Announce Type: new Abstract: We compare human and large language model (LLM) generalization in the number game, a concept inference task. Using a Bayesian model as an analytical framework, we examined the inductive biases and inference strategies of humans and LLMs. The Bayesian model captured human behavior better than LLMs in that humans flexibly […]
Interaction Dataset of Autonomous Vehicles with Traffic Lights and Signs
arXiv:2501.12536v2 Announce Type: replace-cross Abstract: This paper presents the development of a comprehensive dataset capturing interactions between Autonomous Vehicles (AVs) and traffic control devices, specifically traffic lights and stop signs. Derived from the Waymo Motion dataset, our work addresses a critical gap in the existing literature by providing real-world trajectory data on how AVs navigate […]