arXiv:2512.22364v2 Announce Type: replace-cross Abstract: While Text-to-SQL systems achieve high accuracy, existing efficiency metrics like the Valid Efficiency Score prioritize execution time, a metric we show is fundamentally decoupled from consumption-based cloud billing. This paper evaluates cloud query execution cost trade-offs between reasoning and non-reasoning Large Language Models by performing 180 Text-to-SQL query executions across […]
ButterflyViT: 354$\times$ Expert Compression for Edge Vision Transformers
arXiv:2603.06746v1 Announce Type: cross Abstract: Deploying sparse Mixture of Experts (MoE) Vision Transformers remains a challenge due to linear expert memory scaling. Linear memory scaling stores $N$ independent expert weight matrices requiring $\mathcal{O}(N_E \cdot d^2)$ memory, which exceeds edge devices' memory budget. Current compression methods like quantization, pruning and low-rank factorization reduce constant factors but leave […]
$OneMillion-Bench: How Far are Language Agents from Human Experts?
arXiv:2603.07980v1 Announce Type: cross Abstract: As language models (LMs) evolve from chat assistants to long-horizon agents capable of multi-step reasoning and tool use, existing benchmarks remain largely confined to structured or exam-style tasks that fall short of real-world professional demands. To this end, we introduce $OneMillion-Bench, a benchmark of 400 expert-curated tasks spanning Law, […]
Label-free pathological subtyping of non-small cell lung cancer using deep classification and virtual immunohistochemical staining
arXiv:2503.20817v2 Announce Type: replace Abstract: The differentiation between pathological subtypes of non-small cell lung cancer (NSCLC) is an essential step in guiding treatment options and prognosis. However, current clinical practice relies on multi-step staining and labelling processes that are time-intensive and costly, requiring highly specialised expertise. In this study, we propose a label-free methodology that […]
Toward a Physical Theory of Intelligence
arXiv:2601.00021v2 Announce Type: replace Abstract: While often treated as abstract algorithmic properties, intelligence and computation are ultimately physical processes constrained by conservation laws. We introduce the Conservation-Congruent Encoding (CCE) framework as a unified, substrate-neutral physical framework for studying intelligence. We propose that information processing emerges when open systems undergo irreversible transitions, carving out macroscopic states […]
Alignment-Aware and Reliability-Gated Multimodal Fusion for Unmanned Aerial Vehicle Detection Across Heterogeneous Thermal-Visual Sensors
arXiv:2603.08208v1 Announce Type: cross Abstract: Reliable unmanned aerial vehicle (UAV) detection is critical for autonomous airspace monitoring but remains challenging when integrating sensor streams that differ substantially in resolution, perspective, and field of view. Conventional fusion methods, such as wavelet-, Laplacian-, and decision-level approaches, often fail to preserve spatial correspondence across modalities and suffer from annotation of […]
SYNAPSE: Framework for Neuron Analysis and Perturbation in Sequence Encoding
arXiv:2603.08424v1 Announce Type: cross Abstract: In recent years, Artificial Intelligence has become a powerful partner for complex tasks such as data analysis, prediction, and problem-solving, yet its lack of transparency raises concerns about its reliability. In sensitive domains such as healthcare or cybersecurity, ensuring transparency, trustworthiness, and robustness is essential, since the consequences of wrong […]
Pawsterior: Variational Flow Matching for Structured Simulation-Based Inference
arXiv:2602.13813v3 Announce Type: replace-cross Abstract: We introduce Pawsterior, a variational flow-matching framework for improved and extended simulation-based inference (SBI). Many SBI problems involve posteriors constrained by structured domains, such as bounded physical parameters or hybrid discrete-continuous variables, yet standard flow-matching methods typically operate in unconstrained spaces. This mismatch leads to inefficient learning and difficulty respecting […]
Do Schwartz Higher-Order Values Help Sentence-Level Human Value Detection? A Study of Hierarchical Gating and Calibration
arXiv:2602.00913v2 Announce Type: replace-cross Abstract: Human value detection from single sentences is a sparse, imbalanced multi-label task. We study whether Schwartz higher-order (HO) categories help this setting on ValueEval’24 / ValuesML (74K English sentences) under a compute-frugal budget. Rather than proposing a new architecture, we compare direct supervised transformers, hard HO$\rightarrow$values pipelines, Presence$\rightarrow$HO$\rightarrow$values cascades, compact […]
Adaptive Batch-Wise Sample Scheduling for Direct Preference Optimization
arXiv:2506.17252v4 Announce Type: replace-cross Abstract: Direct Preference Optimization (DPO) has emerged as an effective approach for aligning large language models (LLMs) with human preferences. However, its performance is highly dependent on the quality of the underlying human preference data. To address this bottleneck, prior work has explored various data selection strategies, but these methods often […]
LLM-assisted Semantic Option Discovery for Facilitating Adaptive Deep Reinforcement Learning
arXiv:2603.01488v2 Announce Type: replace Abstract: Despite achieving remarkable success in complex tasks, Deep Reinforcement Learning (DRL) still suffers from critical issues in practical applications, such as low data efficiency, lack of interpretability, and limited cross-environment transferability. However, the learned policy, which generates actions based on states, is sensitive to environmental changes, struggling to guarantee […]
From Pixels to Predicates: Learning Symbolic World Models via Pretrained Vision-Language Models
arXiv:2501.00296v4 Announce Type: replace-cross Abstract: Our aim is to learn to solve long-horizon decision-making problems in complex robotics domains given low-level skills and a handful of short-horizon demonstrations containing sequences of images. To this end, we focus on learning abstract symbolic world models that facilitate zero-shot generalization to novel goals via planning. A critical component […]