arXiv:2512.01289v2 Announce Type: replace Abstract: Environmental, Social, and Governance (ESG) metric knowledge is inherently structured, connecting industries, reporting frameworks, metric categories, metrics, and calculation models through compositional dependencies, yet in practice this structure remains embedded implicitly in regulatory documents such as SASB, TCFD, and IFRS S2 and rarely exists as an explicit, governed, or machine-actionable […]
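To make "explicit, machine-actionable" concrete, here is a minimal sketch of encoding such compositional ESG metric knowledge as a typed graph; the node kinds, relation names, and the example entries are assumptions chosen for illustration, not the paper's schema.

```python
# Minimal sketch (illustrative only): ESG metric knowledge as an explicit,
# machine-actionable graph. Node kinds and relation names are assumptions,
# not the paper's schema.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    kind: str   # e.g. "industry", "framework", "metric_category", "metric", "calc_model"
    name: str

@dataclass
class MetricGraph:
    edges: set = field(default_factory=set)  # (source, relation, target) triples

    def add(self, src: Node, relation: str, dst: Node) -> None:
        self.edges.add((src, relation, dst))

    def neighbors(self, node: Node, relation: str):
        return [t for (s, r, t) in self.edges if s == node and r == relation]

# Example: an IFRS S2 climate metric linked to an industry and a calculation model.
g = MetricGraph()
metric = Node("metric", "Scope 1 GHG emissions")
g.add(Node("framework", "IFRS S2"), "defines", metric)
g.add(metric, "applies_to", Node("industry", "Oil & Gas"))
g.add(metric, "computed_by", Node("calc_model", "GHG Protocol Scope 1"))
print(g.neighbors(metric, "applies_to"))
```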
Bias-Aware Conformal Prediction for Metric-Based Imaging Pipelines
arXiv:2410.05263v2 Announce Type: replace-cross Abstract: Reliable confidence measures of metrics derived from medical imaging reconstruction pipelines would improve the standard of decision-making in many clinical workflows. Conformal Prediction (CP) provides a robust framework for producing calibrated prediction intervals, but standard CP formulations face a critical challenge in the imaging pipeline: common mismatches between image reconstruction […]
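For readers unfamiliar with CP, a minimal split-conformal sketch for a scalar pipeline metric is shown below; it implements standard CP only, not the bias-aware adjustment the paper proposes, and the calibration data here are synthetic.

```python
# Minimal split conformal prediction sketch for a scalar metric (standard CP,
# not the paper's bias-aware variant). Assumes a calibration set of
# (predicted_metric, true_metric) pairs from the imaging pipeline.
import numpy as np

def conformal_interval(pred_cal, true_cal, pred_test, alpha=0.1):
    """Return (lower, upper) intervals with ~(1 - alpha) marginal coverage."""
    scores = np.abs(true_cal - pred_cal)                # nonconformity scores
    n = len(scores)
    # finite-sample-corrected quantile of the calibration scores
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, q_level, method="higher")   # NumPy >= 1.22
    return pred_test - q, pred_test + q

# Toy usage with synthetic calibration data.
rng = np.random.default_rng(0)
true_cal = rng.normal(size=500)
pred_cal = true_cal + rng.normal(scale=0.2, size=500)
lo, hi = conformal_interval(pred_cal, true_cal, pred_test=np.array([0.3, -1.1]))
print(lo, hi)
```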
Prefill-Guided Thinking for zero-shot detection of AI-generated images
arXiv:2506.11031v4 Announce Type: replace-cross Abstract: Traditional supervised methods for detecting AI-generated images depend on large, curated datasets for training and fail to generalize to novel, out-of-domain image generators. As an alternative, we explore pre-trained Vision-Language Models (VLMs) for zero-shot detection of AI-generated images. We evaluate VLM performance on three diverse benchmarks encompassing synthetic images of […]
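The general idea of prefilling the assistant turn to guide reasoning can be sketched as follows; `query_vlm` is a hypothetical helper standing in for any chat-style VLM API that accepts an image, a user prompt, and a prefilled start of the assistant reply, and the prompts are assumptions rather than the paper's.

```python
# Sketch of "prefill-guided thinking" for zero-shot detection (illustrative).
# `query_vlm` is a hypothetical helper, not a real library call.
def detect_ai_generated(image_bytes: bytes, query_vlm) -> str:
    user_prompt = (
        "Is this image real or AI-generated? Answer 'real' or 'fake' "
        "after examining it carefully."
    )
    # The assistant turn is prefilled so the model continues a reasoning
    # trace (lighting, textures, anatomy, text artifacts) before answering.
    assistant_prefill = "Let me examine the image step by step. First,"
    reply = query_vlm(
        image=image_bytes,
        user=user_prompt,
        assistant_prefill=assistant_prefill,
    )
    return "fake" if "fake" in reply.lower() else "real"
```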
Beyond Data Privacy: New Privacy Risks for Large Language Models
arXiv:2509.14278v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) have achieved remarkable progress in natural language understanding, reasoning, and autonomous decision-making. However, these advancements have also come with significant privacy concerns. While substantial research has focused on mitigating the data privacy risks of LLMs during various stages of model training, less attention has been paid […]
Key and Value Weights Are Probably All You Need: On the Necessity of the Query, Key, Value weight Triplet in Decoder-Only Transformers
arXiv:2510.23912v3 Announce Type: replace-cross Abstract: The Query, Key, Value weight triplet is a building block of current attention mechanisms in state-of-the-art LLMs. We theoretically investigate whether this triplet can be reduced, proving under simplifying assumptions that the Query weights are redundant, thereby reducing the number of non-embedding/lm-head parameters by over 8%. We validate the theory […]
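One simple reading of "the Query weights are redundant" is to drop W_Q and use the hidden states directly as queries; the sketch below contrasts the two variants under that assumption, and is not the paper's exact construction.

```python
# Illustrative single-head attention with and without a Query projection.
# Dropping W_Q and using the hidden states directly as queries is one simple
# reading of "the Query weights are redundant"; this is a sketch only.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_qkv(X, W_Q, W_K, W_V):
    Q, K, V = X @ W_Q, X @ W_K, X @ W_V
    return softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V

def attention_kv_only(X, W_K, W_V):
    # Query projection removed: queries are the raw hidden states.
    K, V = X @ W_K, X @ W_V
    return softmax(X @ K.T / np.sqrt(K.shape[-1])) @ V

rng = np.random.default_rng(0)
d = 16
X = rng.normal(size=(8, d))          # 8 tokens, model dim d
W_Q, W_K, W_V = (rng.normal(size=(d, d)) * d**-0.5 for _ in range(3))
print(attention_qkv(X, W_Q, W_K, W_V).shape, attention_kv_only(X, W_K, W_V).shape)
```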
Exploration vs Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward
arXiv:2512.16912v3 Announce Type: replace-cross Abstract: This paper examines the exploration-exploitation trade-off in reinforcement learning with verifiable rewards (RLVR), a framework for improving the reasoning of Large Language Models (LLMs). Recent studies suggest that RLVR can elicit strong mathematical reasoning in LLMs through two seemingly paradoxical mechanisms: spurious rewards, which suppress exploitation by rewarding outcomes unrelated […]
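The "clipping" and "entropy" knobs referred to here are the standard PPO-style ingredients used in RLVR; a minimal sketch of the clipped surrogate with an entropy bonus is included for reference (a generic recipe, not the paper's exact setup).

```python
# Minimal sketch of the PPO-style clipped objective with an entropy bonus,
# i.e. the generic "clipping" and "entropy" ingredients of RLVR.
import numpy as np

def clipped_objective(logp_new, logp_old, advantages, probs_new,
                      clip_eps=0.2, entropy_coef=0.01):
    ratio = np.exp(logp_new - logp_old)                # importance ratio
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    surrogate = np.minimum(unclipped, clipped).mean()  # pessimistic clipped term
    entropy = -(probs_new * np.log(probs_new + 1e-12)).sum(axis=-1).mean()
    return surrogate + entropy_coef * entropy          # objective to maximize
```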
Empowering LLMs for Structure-Based Drug Design via Exploration-Augmented Latent Inference
arXiv:2601.15333v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) possess strong representation and reasoning capabilities, but their application to structure-based drug design (SBDD) is limited by insufficient understanding of protein structures and unpredictable molecular generation. To address these challenges, we propose Exploration-Augmented Latent Inference for LLMs (ELILLM), a framework that reinterprets the LLM generation process […]
Funny or Persuasive, but Not Both: Evaluating Fine-Grained Multi-Concept Control in LLMs
arXiv:2601.18483v1 Announce Type: cross Abstract: Large Language Models (LLMs) offer strong generative capabilities, but many applications require explicit and fine-grained control over specific textual concepts, such as humor, persuasiveness, or formality. Prior approaches in prompting and representation engineering can provide coarse or single-attribute control, but systematic evaluation of multi-attribute settings remains limited. We introduce an […]
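One common representation-engineering recipe for multi-attribute control is to add weighted concept directions to a hidden state; the sketch below illustrates that generic idea, with the directions, layer choice, and coefficients assumed for illustration rather than taken from the paper.

```python
# Sketch of multi-concept activation steering (a common representation-
# engineering recipe; directions and coefficients are illustrative assumptions).
import numpy as np

def steer(hidden, concept_dirs, weights):
    """Add weighted, unit-normalized concept directions to a hidden state.

    hidden:       (d,) activation at some chosen layer
    concept_dirs: dict name -> (d,) direction (e.g. "humor", "persuasiveness")
    weights:      dict name -> float steering strength (can be negative)
    """
    steered = hidden.copy()
    for name, direction in concept_dirs.items():
        unit = direction / (np.linalg.norm(direction) + 1e-8)
        steered += weights.get(name, 0.0) * unit
    return steered

d = 8
rng = np.random.default_rng(0)
dirs = {"humor": rng.normal(size=d), "persuasiveness": rng.normal(size=d)}
h = rng.normal(size=d)
# E.g. push humor up while holding persuasiveness near neutral.
print(steer(h, dirs, {"humor": 2.0, "persuasiveness": 0.0}))
```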
SMART: Scalable Mesh-free Aerodynamic Simulations from Raw Geometries using a Transformer-based Surrogate Model
arXiv:2601.18707v1 Announce Type: cross Abstract: Machine learning-based surrogate models have emerged as more efficient alternatives to numerical solvers for physical simulations over complex geometries, such as car bodies. Many existing models incorporate the simulation mesh as an additional input, thereby reducing prediction errors. However, generating a simulation mesh for new geometries is computationally costly. In […]
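A mesh-free surrogate of this kind can be sketched as a transformer encoder over raw surface points predicting a per-point quantity; the layer sizes and single-output head below are illustrative assumptions, not SMART's actual architecture.

```python
# Tiny sketch of a mesh-free surrogate: a transformer encoder over raw surface
# points predicting a per-point quantity (e.g. pressure coefficient).
import torch
import torch.nn as nn

class PointSurrogate(nn.Module):
    def __init__(self, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Linear(3, d_model)             # (x, y, z) -> token
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 1)              # per-point prediction

    def forward(self, points):                         # points: (B, N, 3)
        tokens = self.embed(points)
        return self.head(self.encoder(tokens)).squeeze(-1)  # (B, N)

model = PointSurrogate()
pts = torch.rand(2, 1024, 3)                           # two raw geometries
print(model(pts).shape)                                # torch.Size([2, 1024])
```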
Subword-Based Comparative Linguistics across 242 Languages Using Wikipedia Glottosets
arXiv:2601.18791v1 Announce Type: cross Abstract: We present a large-scale comparative study of 242 Latin- and Cyrillic-script languages using subword-based methodologies. By constructing ‘glottosets’ from Wikipedia lexicons, we introduce a framework for simultaneous cross-linguistic comparison via Byte-Pair Encoding (BPE). Our approach utilizes rank-based subword vectors to analyze vocabulary overlap, lexical divergence, and language similarity at scale. […]
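A simplified sketch of rank-based subword comparison between two lexicons follows; character n-grams stand in for BPE merges and Spearman correlation over shared subwords stands in for the paper's similarity measure.

```python
# Sketch of a rank-based subword comparison between two lexicons (simplified:
# character n-grams instead of BPE, Spearman's rho over shared subwords).
from collections import Counter

def subword_ranks(words, n=3):
    counts = Counter(g for w in words for g in
                     (w[i:i + n] for i in range(len(w) - n + 1)))
    return {g: rank for rank, (g, _) in
            enumerate(counts.most_common(), start=1)}

def rank_similarity(ranks_a, ranks_b):
    shared = set(ranks_a) & set(ranks_b)
    n = len(shared)
    if n < 2:
        return 0.0
    # Re-rank within the shared subwords, then apply Spearman's rho.
    order_a = {g: r for r, g in enumerate(sorted(shared, key=ranks_a.get))}
    order_b = {g: r for r, g in enumerate(sorted(shared, key=ranks_b.get))}
    d2 = sum((order_a[g] - order_b[g]) ** 2 for g in shared)
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Toy usage with two tiny word lists.
en = ["nation", "national", "information", "education"]
de = ["nation", "national", "information", "bildung"]
print(round(rank_similarity(subword_ranks(en), subword_ranks(de)), 3))
```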
NeuroKoop: Neural Koopman Fusion of Structural-Functional Connectomes for Identifying Prenatal Drug Exposure in Adolescents
arXiv:2508.16414v2 Announce Type: replace Abstract: Understanding how prenatal exposure to psychoactive substances such as cannabis shapes adolescent brain organization remains a critical challenge, complicated by the complexity of multimodal neuroimaging data and the limitations of conventional analytic methods. Existing approaches often fail to fully capture the complementary features embedded within structural and functional connectomes, constraining […]
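The generic neural-Koopman idea (encode each connectome view, advance latents with a learned linear operator, fuse for classification) can be sketched as below; all dimensions and the concatenation-based fusion are assumptions, not NeuroKoop's actual design.

```python
# Sketch of a neural Koopman fusion block (generic idea only: per-view encoder,
# learned linear Koopman operator in latent space, concatenation for fusion).
import torch
import torch.nn as nn

class KoopmanBranch(nn.Module):
    def __init__(self, in_dim, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        self.koopman = nn.Linear(latent_dim, latent_dim, bias=False)  # linear operator

    def forward(self, x):
        z = self.encoder(x)
        return self.koopman(z)            # one linear step in latent space

class NeuroKoopFusion(nn.Module):
    def __init__(self, struct_dim, func_dim, latent_dim=32, n_classes=2):
        super().__init__()
        self.struct = KoopmanBranch(struct_dim, latent_dim)
        self.func = KoopmanBranch(func_dim, latent_dim)
        self.classifier = nn.Linear(2 * latent_dim, n_classes)

    def forward(self, x_struct, x_func):
        fused = torch.cat([self.struct(x_struct), self.func(x_func)], dim=-1)
        return self.classifier(fused)

model = NeuroKoopFusion(struct_dim=400, func_dim=400)
logits = model(torch.rand(8, 400), torch.rand(8, 400))   # batch of 8 subjects
print(logits.shape)                                       # torch.Size([8, 2])
```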
Tandem Training for Language Models
arXiv:2510.13551v2 Announce Type: replace Abstract: As language models continue to rapidly improve, we can expect their actions and reasoning to become difficult or impossible for weaker agents and humans to follow, undermining interpretability and oversight. With an eye on long-term futures, we pursue methods that encourage models to produce solutions that remain intelligible to weaker […]