Let’s Roll a BiFTA: Bi-refinement for Fine-grained Text-visual Alignment in Vision-Language Models

arXiv:2601.20419v1 Announce Type: cross Abstract: Recent research has shown that aligning fine-grained text descriptions with localized image patches can significantly improve the zero-shot performance of pre-trained vision-language models (e.g., CLIP). However, we find that both fine-grained text descriptions and localized image patches often contain redundant information, making text-visual alignment less effective. In this paper, we […]
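
The abstract is truncated, so the snippet below is only a generic illustration of fine-grained text-visual alignment scoring, not the paper's BiFTA method: each fine-grained description is scored against localized patch embeddings and the best-matching patch is kept per description. The tensor shapes and the max/mean aggregation are assumptions, and random tensors stand in for CLIP features.

```python
# Generic fine-grained alignment scoring (illustrative only, not BiFTA):
# random tensors stand in for CLIP-style patch and description embeddings.
import torch
import torch.nn.functional as F

num_patches, num_descriptions, dim = 49, 8, 512
patch_feats = F.normalize(torch.randn(num_patches, dim), dim=-1)       # localized image patches
text_feats = F.normalize(torch.randn(num_descriptions, dim), dim=-1)   # fine-grained descriptions

sim = text_feats @ patch_feats.T        # (descriptions, patches) cosine similarity
per_desc = sim.max(dim=1).values        # best-matching patch for each description
image_score = per_desc.mean()           # aggregate into an image-level alignment score
print(f"alignment score: {image_score.item():.3f}")
```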

Automated Benchmark Generation from Domain Guidelines Informed by Bloom’s Taxonomy

arXiv:2601.20253v1 Announce Type: cross Abstract: Open-ended question answering (QA) evaluates a model’s ability to perform contextualized reasoning beyond factual recall. This challenge is especially acute in practice-based domains, where knowledge is procedural and grounded in professional judgment, while most existing LLM benchmarks depend on pre-existing human exam datasets that are often unavailable in such settings. […]
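
As a rough illustration of how guideline text might be turned into questions at different cognitive levels, the snippet below builds prompts keyed to Bloom's Taxonomy. The prompt wording, the level list, and the example guideline are hypothetical and are not taken from the paper's pipeline.

```python
# Hypothetical sketch: turn a domain-guideline excerpt into question-generation
# prompts keyed to Bloom's Taxonomy levels. Wording and usage are placeholders.
BLOOM_LEVELS = ["remember", "understand", "apply", "analyze", "evaluate", "create"]

def build_prompts(guideline_excerpt: str) -> list[str]:
    return [
        f"Write one open-ended {level}-level question that a practitioner "
        f"should be able to answer using this guideline:\n{guideline_excerpt}"
        for level in BLOOM_LEVELS
    ]

for prompt in build_prompts("Administer treatment X only after confirming Y."):
    print(prompt[:60], "...")
```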

Physically Guided Visual Mass Estimation from a Single RGB Image

arXiv:2601.20303v1 Announce Type: cross Abstract: Estimating object mass from visual input is challenging because mass depends jointly on geometric volume and material-dependent density, neither of which is directly observable from RGB appearance. Consequently, mass prediction from pixels is ill-posed and therefore benefits from physically meaningful representations to constrain the space of plausible solutions. We propose […]
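
The abstract's premise, that mass is determined jointly by volume and density, suggests the simple decomposition below. The numbers are invented placeholders, and the log-space combination is just one common way to keep the product numerically stable, not necessarily the paper's parameterization.

```python
# Illustrative decomposition only: mass = volume * density, combined in log
# space. The predicted values are invented placeholders.
import math

volume_m3 = 2.4e-3       # hypothetical predicted geometric volume (m^3)
density_kg_m3 = 750.0    # hypothetical predicted material density (kg/m^3)

log_mass = math.log(volume_m3) + math.log(density_kg_m3)
print(f"estimated mass: {math.exp(log_mass):.2f} kg")   # 1.80 kg
```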

VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning

arXiv:2601.20055v1 Announce Type: cross Abstract: Despite the syntactic fluency of Large Language Models (LLMs), ensuring their logical correctness in high-stakes domains remains a fundamental challenge. We present a neurosymbolic framework that combines LLMs with SMT solvers to produce verification-guided answers through iterative refinement. Our approach decomposes LLM outputs into atomic claims, autoformalizes them into first-order […]
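
The abstract describes decomposing LLM outputs into atomic claims and checking them with an SMT solver. The sketch below shows only that verification step, with the LLM autoformalization mocked by hand-written Z3 formulas: a claim counts as verified when the background knowledge entails it, i.e. background AND NOT(claim) is unsatisfiable.

```python
# Verification step only (autoformalization mocked by hand-written formulas):
# a claim is entailed by the background iff background AND NOT(claim) is unsat.
from z3 import Ints, And, Not, Solver, unsat

x, y = Ints("x y")
background = And(x > 2, y == x + 1)   # stand-in for formalized context
claim = y > 3                         # stand-in for one atomic claim

s = Solver()
s.add(background, Not(claim))
if s.check() == unsat:
    print("claim verified: entailed by the background")
else:
    print("claim not entailed; counterexample for refinement:", s.model())
```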

LinguaMap: Which Layers of LLMs Speak Your Language and How to Tune Them?

arXiv:2601.20009v1 Announce Type: cross Abstract: Despite multilingual pretraining, large language models often struggle with non-English tasks, particularly in language control, the ability to respond in the intended language. We identify and characterize two key failure modes: the multilingual transfer bottleneck (correct language, incorrect task response) and the language consistency bottleneck (correct task response, wrong language). […]
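
If certain layers turn out to govern language control, a natural follow-up is to fine-tune only those layers. The helper below assumes Hugging Face-style parameter names containing "layers.<i>."; which layers to unfreeze would come from a LinguaMap-style analysis, and the indices in the usage comment are placeholders, not findings from the paper.

```python
# Selective layer tuning, assuming parameter names contain "layers.<i>."
# (Hugging Face-style decoders). Layer indices below are placeholders.
import torch.nn as nn

def freeze_except_layers(model: nn.Module, layers_to_tune: set[int]) -> None:
    for name, param in model.named_parameters():
        # keep gradients only for parameters inside the selected layers
        param.requires_grad = any(f"layers.{i}." in name for i in layers_to_tune)

# usage (hypothetical): freeze_except_layers(llm, {20, 21, 22, 23})
```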

Taming Toxic Talk: Using chatbots to intervene with users posting toxic comments

arXiv:2601.20100v1 Announce Type: cross Abstract: Generative AI chatbots have proven surprisingly effective at persuading people to change their beliefs and attitudes in lab settings. However, the practical implications of these findings are not yet clear. In this work, we explore the impact of rehabilitative conversations with generative AI chatbots on users who share toxic content […]

Control systems for synthetic biology and a case-study in cell fate reprogramming

arXiv:2601.20135v1 Announce Type: cross Abstract: This paper gives an overview of the use of control systems engineering in synthetic biology, motivated by applications such as cell therapy and cell fate reprogramming for regenerative medicine. A ubiquitous requirement in these and other applications is the ability to control the concentration of specific regulatory factors in the […]
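
As a toy illustration of the kind of feedback regulation the abstract refers to (not the paper's case study), the loop below drives a factor's concentration to a setpoint with a discrete-time PI controller acting on first-order production/dilution dynamics; all rates and gains are arbitrary.

```python
# Toy PI regulation of a regulatory factor's concentration (illustrative
# only): dynamics x' = u - d*x, controller adjusts the production input u.
d, dt = 0.1, 0.1             # dilution rate, integration step (arbitrary units)
kp, ki = 0.8, 0.2            # PI gains (arbitrary)
setpoint, x, integral = 5.0, 0.0, 0.0

for _ in range(300):
    error = setpoint - x
    integral += error * dt
    u = max(0.0, kp * error + ki * integral)   # production cannot go negative
    x += (u - d * x) * dt                      # forward-Euler step

print(f"final concentration: {x:.2f} (target {setpoint})")
```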

MALLOC: Benchmarking the Memory-aware Long Sequence Compression for Large Sequential Recommendation

arXiv:2601.20234v1 Announce Type: cross Abstract: The scaling law, which indicates that model performance improves with increasing dataset size and model capacity, has fueled a growing trend in expanding recommendation models in both industry and academia. However, the advent of large-scale recommenders also brings significantly higher computational costs, particularly under the long-sequence dependencies inherent in the user […]

The Forecast After the Forecast: A Post-Processing Shift in Time Series

arXiv:2601.20280v1 Announce Type: cross Abstract: Time series forecasting has long been dominated by advances in model architecture, with recent progress driven by deep learning and hybrid statistical techniques. However, as forecasting models approach diminishing returns in accuracy, a critical yet underexplored opportunity emerges: the strategic use of post-processing. In this paper, we address the last-mile […]
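
The abstract argues for post-processing forecasts rather than only improving models. A very simple instance of such a step (not the paper's method) is an affine bias correction fitted on a held-out window of raw forecasts and actuals, sketched below with synthetic data.

```python
# Simple post-processing example (not the paper's method): fit an affine
# bias correction y ≈ a*f + b on held-out forecasts f and actuals y.
import numpy as np

rng = np.random.default_rng(0)
actuals = rng.normal(100.0, 10.0, size=200)
raw_forecasts = 0.9 * actuals + 5.0 + rng.normal(0.0, 2.0, size=200)  # biased model output

a, b = np.polyfit(raw_forecasts, actuals, deg=1)   # fit the correction
corrected = a * raw_forecasts + b

print("MAE before:", round(float(np.abs(raw_forecasts - actuals).mean()), 2))
print("MAE after: ", round(float(np.abs(corrected - actuals).mean()), 2))
```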

Beyond Speedup — Utilizing KV Cache for Sampling and Reasoning

arXiv:2601.20326v1 Announce Type: cross Abstract: KV caches, typically used only to speed up autoregressive decoding, encode contextual information that can be reused for downstream tasks at no extra cost. We propose treating the KV cache as a lightweight representation, eliminating the need to recompute or store full hidden states. Despite being weaker than dedicated embeddings, […]
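
To make the idea concrete, the sketch below pools a decoder's cached keys and values into a fixed-size vector. It assumes the familiar layout where past_key_values is a per-layer tuple of (key, value) tensors shaped (batch, heads, seq_len, head_dim), and uses random tensors as a stand-in for a real cache; how the paper actually builds its representation may differ.

```python
# Pool cached keys/values into a sequence representation. Assumes the
# per-layer (key, value) layout with shape (batch, heads, seq_len, head_dim);
# random tensors stand in for a real model's cache.
import torch

num_layers, batch, heads, seq_len, head_dim = 12, 1, 12, 32, 64
past_key_values = [
    (torch.randn(batch, heads, seq_len, head_dim),
     torch.randn(batch, heads, seq_len, head_dim))
    for _ in range(num_layers)
]

last_k, last_v = past_key_values[-1]                  # take the final layer's cache
representation = torch.cat(
    [last_k.mean(dim=(1, 2)), last_v.mean(dim=(1, 2))], dim=-1
)                                                     # (batch, 2 * head_dim)
print(representation.shape)                           # torch.Size([1, 128])
```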

FedRD: Reducing Divergences for Generalized Federated Learning via Heterogeneity-aware Parameter Guidance

arXiv:2601.20397v1 Announce Type: cross Abstract: Heterogeneous federated learning (HFL) aims to ensure effective and privacy-preserving collaboration among different entities. As newly joined clients require significant adjustments and additional training to align with the existing system, the problem of generalizing federated learning models to unseen clients under heterogeneous data has become increasingly important. Consequently, we highlight […]

An explainable framework for the relationship between dementia and glucose metabolism patterns

arXiv:2601.20480v1 Announce Type: cross Abstract: High-dimensional neuroimaging data presents challenges for assessing neurodegenerative diseases due to complex non-linear relationships. Variational Autoencoders (VAEs) can encode scans into lower-dimensional latent spaces capturing disease-relevant features. We propose a semi-supervised VAE framework with a flexible similarity regularization term that aligns selected latent variables with clinical or biomarker measures of […]
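
The abstract mentions a similarity regularization term that aligns selected latent variables with clinical or biomarker measures. The loss sketch below is just one way such a term could be composed with a standard VAE objective; the MSE alignment on the first latent dimension and the weight lam are assumptions, not the paper's formulation.

```python
# Illustrative VAE objective with an extra alignment term on one latent
# dimension; the specific term and weight are assumptions, not the paper's.
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mu, logvar, z, clinical_score, lam=1.0):
    recon = F.mse_loss(x_recon, x, reduction="mean")
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    align = F.mse_loss(z[:, 0], clinical_score)   # tie latent dim 0 to the measure
    return recon + kl + lam * align

# toy call with random tensors
B, D, latent_dim = 4, 128, 16
x, x_recon = torch.randn(B, D), torch.randn(B, D)
mu, logvar = torch.randn(B, latent_dim), torch.randn(B, latent_dim)
z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
print(vae_loss(x, x_recon, mu, logvar, z, torch.randn(B)).item())
```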

