MetaHQ: Harmonized, high-quality metadata annotations of public omics samples and studies

arXiv:2602.07805v2 Announce Type: replace Abstract: Public omics databases like the Gene Expression Omnibus and the Sequence Read Archive offer substantial opportunities for data reuse to address novel biomedical questions. However, it is still difficult to find samples and studies of interest since they are described by free-text metadata and lack standardized annotations. To address this […]

The Economics of AI Supply Chain Regulation

arXiv:2603.12630v1 Announce Type: cross Abstract: The rise of foundation models has driven the emergence of AI supply chains, where upstream foundation model providers offer fine-tuning and inference services to downstream firms developing domain-specific applications. Downstream firms pay providers to use their computing infrastructure to fine-tune models with proprietary data, creating a co-creation dynamic that enhances […]

LightMoE: Reducing Mixture-of-Experts Redundancy through Expert Replacing

arXiv:2603.12645v1 Announce Type: cross Abstract: Mixture-of-Experts (MoE) based Large Language Models (LLMs) have demonstrated impressive performance and computational efficiency. However, their deployment is often constrained by substantial memory demands, primarily due to the need to load numerous expert modules. While existing expert compression techniques like pruning or merging attempt to mitigate this, they often suffer […]
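The truncated abstract does not spell out LightMoE's replacement rule, but the general idea of expert replacing can be sketched: drop redundant experts and redirect their router slots to the most similar retained expert, so only the kept experts' weights need to stay in memory. A minimal, hypothetical PyTorch sketch (names such as `build_replacement_map` are illustrative, not from the paper):

```python
import torch
import torch.nn.functional as F

def build_replacement_map(expert_weights: torch.Tensor, keep: list[int]) -> torch.Tensor:
    """expert_weights: (E, D) flattened per-expert parameters.
    Returns an (E,) index map sending every expert id to a kept expert id,
    so router outputs can be remapped and pruned experts never loaded."""
    kept = expert_weights[keep]                                   # (K, D)
    sim = F.cosine_similarity(expert_weights.unsqueeze(1),        # (E, K)
                              kept.unsqueeze(0), dim=-1)
    remap = torch.tensor(keep)[sim.argmax(dim=1)]                 # (E,)
    remap[keep] = torch.tensor(keep)   # kept experts map to themselves
    return remap

# Usage after normal top-k routing: expert_ids = remap[router_topk_indices]
```

This is a generic illustration of expert replacing, not LightMoE's actual algorithm; the paper may replace experts using routing statistics or activation similarity rather than weight-space similarity.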

Continual Learning in Large Language Models: Methods, Challenges, and Opportunities

arXiv:2603.12658v1 Announce Type: cross Abstract: Continual learning (CL) has emerged as a pivotal paradigm to enable large language models (LLMs) to dynamically adapt to evolving knowledge and sequential tasks while mitigating catastrophic forgetting, a critical limitation of the static pre-training paradigm inherent to modern LLMs. This survey presents a comprehensive overview of CL methodologies tailored for […]
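As a concrete instance of the methods such a survey taxonomizes, the simplest family is rehearsal: keep a small buffer of past-task examples and mix them into each new-task batch so old-task gradients keep shaping the model. A minimal sketch of reservoir-sampled experience replay (illustrative of the technique in general, not a specific method from the survey):

```python
import random

class ReplayBuffer:
    """Reservoir-sampled buffer: keeps a uniform sample over all examples
    seen so far, across every task, in fixed memory."""
    def __init__(self, capacity: int = 10_000):
        self.buf, self.capacity, self.seen = [], capacity, 0

    def add(self, example) -> None:
        self.seen += 1
        if len(self.buf) < self.capacity:
            self.buf.append(example)
        else:
            j = random.randrange(self.seen)   # replace with prob capacity/seen
            if j < self.capacity:
                self.buf[j] = example

    def sample(self, k: int) -> list:
        return random.sample(self.buf, min(k, len(self.buf)))

# Training step: concatenate a replayed minibatch with the current-task
# batch before each gradient update, so old-task gradients persist.
```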

Developing the PsyCogMetrics AI Lab to Evaluate Large Language Models and Advance Cognitive Science — A Three-Cycle Action Design Science Study

arXiv:2603.13126v1 Announce Type: new Abstract: This study presents the development of the PsyCogMetrics AI Lab (psycogmetrics.ai), an integrated, cloud-based platform that operationalizes psychometric and cognitive-science methodologies for Large Language Model (LLM) evaluation. The study is framed as three Action Design Science cycles: the Relevance Cycle identifies key limitations in current evaluation methods and unmet stakeholder needs. The […]

RetroReasoner: A Reasoning LLM for Strategic Retrosynthesis Prediction

arXiv:2603.12666v1 Announce Type: cross Abstract: Retrosynthesis prediction is a core task in organic synthesis that aims to predict the reactants for a given product molecule. Traditionally, chemists select a plausible bond disconnection and derive the corresponding reactants, a process that is time-consuming and requires substantial expertise. While recent molecular large language models (LLMs) have made progress, many […]

Superficial Safety Alignment Hypothesis

arXiv:2410.10862v3 Announce Type: replace-cross Abstract: As large language models (LLMs) are increasingly integrated into various applications, ensuring that they generate safe responses is a pressing need. Previous studies on alignment have largely focused on general instruction-following but have often overlooked the distinct properties of safety alignment, such as the brittleness of safety mechanisms. […]

Steve-Evolving: Open-World Embodied Self-Evolution via Fine-Grained Diagnosis and Dual-Track Knowledge Distillation

arXiv:2603.13131v1 Announce Type: new Abstract: Open-world embodied agents must solve long-horizon tasks where the main bottleneck is not single-step planning quality but how interaction experience is organized and evolved. To this end, we present Steve-Evolving, a non-parametric self-evolving framework that tightly couples fine-grained execution diagnosis with dual-track knowledge distillation in a closed loop. The method […]

Federated Hierarchical Clustering with Automatic Selection of Optimal Cluster Numbers

arXiv:2603.12684v1 Announce Type: cross Abstract: Federated Clustering (FC) is an emerging and promising approach for exploring data distribution patterns in distributed, privacy-protected data in an unsupervised manner. Existing FC methods implicitly assume that each client holds a known number of uniformly sized clusters. However, the true number of clusters is typically […]
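The paper's own mechanism is behind the truncation, but the problem setup can be illustrated: each client clusters its data locally, selects its own cluster count automatically (here via silhouette score, an assumed criterion), and shares only centroids, which the server can cluster again. A minimal sketch:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.metrics import silhouette_score

def client_update(X: np.ndarray, k_max: int = 10) -> np.ndarray:
    """Local Ward hierarchical clustering with automatic selection of k;
    only per-cluster centroids leave the client, never raw rows."""
    Z = linkage(X, method="ward")
    best_k, best_score = 2, -1.0
    for k in range(2, min(k_max, len(X) - 1) + 1):
        labels = fcluster(Z, t=k, criterion="maxclust")
        if len(np.unique(labels)) < 2:
            continue
        score = silhouette_score(X, labels)
        if score > best_score:
            best_k, best_score = k, score
    labels = fcluster(Z, t=best_k, criterion="maxclust")
    return np.stack([X[labels == c].mean(axis=0) for c in np.unique(labels)])

# Server side: pool all clients' centroids and cluster them the same way
# to recover a global hierarchy without touching any client's raw data.
```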

RXNRECer Enables Fine-grained Enzymatic Function Annotation through Active Learning and Protein Language Models

arXiv:2603.12694v1 Announce Type: cross Abstract: A key challenge in enzyme annotation is identifying the biochemical reactions catalyzed by proteins. Most existing methods rely on Enzyme Commission (EC) numbers as intermediaries: they first predict an EC number and then retrieve the associated reactions. This indirect strategy introduces ambiguity due to the complex many-to-many mappings among proteins, […]
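The abstract names active learning over protein language models without detailing the loop; a generic pool-based sketch is below. `embed` stands in for any protein LM encoder and `model` for any scikit-learn-style probabilistic reaction classifier; both are assumptions, not RXNRECer's actual components.

```python
import numpy as np

def entropy(p: np.ndarray) -> np.ndarray:
    """Predictive entropy per sample; higher means less certain."""
    return -(p * np.log(p + 1e-12)).sum(axis=1)

def select_queries(model, embed, pool, budget: int = 32) -> np.ndarray:
    """Pick the `budget` pool proteins whose predicted reaction
    distribution is most uncertain; these get annotated next."""
    X = embed(pool)                      # (N, d) protein LM embeddings
    probs = model.predict_proba(X)       # (N, n_reactions)
    return np.argsort(-entropy(probs))[:budget]

# Loop: label the selected proteins (e.g. against curated reaction
# databases), move them into the training set, refit, and repeat.
```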

When Right Meets Wrong: Bilateral Context Conditioning with Reward-Confidence Correction for GRPO

arXiv:2603.13134v1 Announce Type: new Abstract: Group Relative Policy Optimization (GRPO) has emerged as an effective method for training reasoning models. While it computes advantages relative to the group mean, GRPO treats each output as an independent sample during optimization and overlooks a vital structural signal: the natural contrast between correct and incorrect solutions within the […]
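To see why each output is an "independent sample", recall the standard GRPO advantage: each sampled solution's reward is standardized against its own group, after which the per-output losses never contrast correct and incorrect solutions directly. A minimal sketch of the standard computation (not the paper's proposed correction):

```python
import numpy as np

def grpo_advantages(rewards: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Group-relative advantages for one prompt's G sampled outputs:
    A_i = (r_i - mean(r)) / (std(r) + eps). After this step each output
    enters the policy-gradient loss on its own; correct/incorrect pairs
    within the group are never compared directly."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: G = 4 sampled solutions with binary correctness rewards.
print(grpo_advantages(np.array([1.0, 0.0, 0.0, 1.0])))  # [ 1. -1. -1.  1.]
```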

Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity

arXiv:2603.12707v1 Announce Type: cross Abstract: Multimodal large language model (MLLM) inference splits into two phases with opposing hardware demands: vision encoding is compute-bound, while language generation is memory-bandwidth-bound. We show that under standard transformer KV caching, the modality boundary (between vision encoder and language model) minimizes cross-device transfer among all partition points that preserve standard […]
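The intuition behind the modality-boundary claim can be checked with back-of-envelope arithmetic: splitting at the vision/language boundary moves the projected vision embeddings across devices once, while splitting inside the language model moves the cut-layer activation for every prefill token and then once per decoded token. A sketch with illustrative, assumed parameter values:

```python
BYTES = 2            # fp16 activations (assumed)
d_model = 4096       # LM hidden size (assumed)
n_vision = 576       # vision tokens per image (assumed, e.g. 24x24 patches)
n_prompt = 700       # vision + text tokens entering the LM (assumed)
n_generated = 256    # decoded tokens (assumed)

# Split at the modality boundary: vision embeddings cross devices once.
boundary = n_vision * d_model * BYTES

# Split between two LM layers: the cut-layer hidden state crosses once per
# prefill token and then once per decode step (each side keeps its own KV
# cache, but the running activation must still cross every step).
intra_lm = (n_prompt + n_generated) * d_model * BYTES

print(f"modality boundary: {boundary / 1e6:.1f} MB")   # ~4.7 MB
print(f"intra-LM split:    {intra_lm / 1e6:.1f} MB")   # ~7.8 MB
```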
