arXiv:2512.11582v1 Announce Type: cross Abstract: The development of foundation models for functional magnetic resonance imaging (fMRI) time series holds significant promise for predicting phenotypes related to disease and cognition. Current models, however, are often trained using a mask-and-reconstruct objective on small brain regions. This focus on low-level information leads to representations that are sensitive to […]
TRACE: A Framework for Analyzing and Enhancing Stepwise Reasoning in Vision-Language Models
arXiv:2512.05943v3 Announce Type: replace Abstract: Reliable mathematical and scientific reasoning remains an open challenge for large vision-language models. Standard final-answer evaluation often masks reasoning errors, allowing silent failures to persist. To address this gap, we introduce TRACE, a framework for Transparent Reasoning And Consistency Evaluation that diagnoses reasoning trajectories rather than only end results. At […]
KVSwap: Disk-aware KV Cache Offloading for Long-Context On-device Inference
arXiv:2511.11907v2 Announce Type: replace-cross Abstract: Language models (LMs) underpin emerging mobile and embedded AI applications like meeting and video summarization and document analysis, which often require processing multiple long-context inputs. Running an LM locally on-device improves privacy, enables offline use, and reduces cost, but long-context inference quickly hits a memory capacity wall as the key-value […]
Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning
arXiv:2512.10534v2 Announce Type: replace Abstract: Large language model (LLM) agents exhibit strong mathematical problem-solving abilities and can even solve International Mathematical Olympiad (IMO) level problems with the assistance of formal proof systems. However, due to weak heuristics for auxiliary constructions, AI for geometry problem solving remains dominated by expert models such as AlphaGeometry 2, which […]
Multi-temporal Calving Front Segmentation
arXiv:2512.11560v1 Announce Type: cross Abstract: The calving fronts of marine-terminating glaciers undergo constant changes. These changes significantly affect the glacier’s mass and dynamics, demanding continuous monitoring. To address this need, deep learning models were developed that can automatically delineate the calving front in Synthetic Aperture Radar imagery. However, these models often struggle to correctly classify […]
Assumption-Lean Post-Integrated Inference with Surrogate Control Outcomes
arXiv:2410.04996v5 Announce Type: replace-cross Abstract: Data integration methods aim to extract low-dimensional embeddings from high-dimensional outcomes to remove unwanted variations, such as batch effects and unmeasured covariates, across heterogeneous datasets. However, multiple hypothesis testing after integration can be biased due to data-dependent processes. We introduce a robust post-integrated inference method that accounts for latent heterogeneity […]
Textual Self-attention Network: Test-Time Preference Optimization through Textual Gradient-based Attention
arXiv:2511.06682v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have demonstrated remarkable generalization capabilities, but aligning their outputs with human preferences typically requires expensive supervised fine-tuning. Recent test-time methods leverage textual feedback to overcome this, but they often critique and revise a single candidate response, lacking a principled mechanism to systematically analyze, weigh, and synthesize […]
FuncGenFoil: Airfoil Generation and Editing Model in Function Space
arXiv:2502.10712v4 Announce Type: replace-cross Abstract: Aircraft manufacturing is the jewel in the crown of industry, in which generating high-fidelity airfoil geometries with controllable and editable representations remains a fundamental challenge. Existing deep learning methods, which typically rely on predefined parametric representations (e.g., Bézier) or discrete point sets, face an inherent trade-off between expressive power and […]
DentalGPT: Incentivizing Multimodal Complex Reasoning in Dentistry
arXiv:2512.11558v1 Announce Type: cross Abstract: Reliable interpretation of multimodal data in dentistry is essential for automated oral healthcare, yet current multimodal large language models (MLLMs) struggle to capture fine-grained dental visual details and lack sufficient reasoning ability for precise diagnosis. To address these limitations, we present DentalGPT, a specialized dental MLLM developed through high-quality domain […]
MOAT: Evaluating LMMs for Capability Integration and Instruction Grounding
arXiv:2503.09348v2 Announce Type: replace-cross Abstract: Large multimodal models (LMMs) have demonstrated significant potential as generalists in vision-language (VL) tasks. However, adoption of LMMs in real-world tasks is hindered by their poor performance in tasks that require a combination of VL capabilities, as well as in tasks that involve the grounding of complex text or visual […]