Accelerating Large-Scale Cheminformatics Using a Byte-Offset Indexing Architecture for Terabyte-Scale Data Integration

arXiv:2601.18921v2 Announce Type: replace-cross Abstract: The integration of large-scale chemical databases represents a critical bottleneck in modern cheminformatics research, particularly for machine learning applications requiring high-quality, multi-source validated datasets. This paper presents a case study of integrating three major public chemical repositories: PubChem (176 million compounds), ChEMBL, and eMolecules, to construct a curated dataset for […]

Enhancing Alignment for Unified Multimodal Models via Semantically-Grounded Supervision

arXiv:2603.19807v1 Announce Type: cross Abstract: Unified Multimodal Models (UMMs) have emerged as a promising paradigm that integrates multimodal understanding and generation within a unified modeling framework. However, current generative training paradigms suffer from inherent limitations. We present Semantically-Grounded Supervision (SeGroS), a fine-tuning framework designed to resolve the granularity mismatch and supervisory redundancy in UMMs. At […]

Dual Prompt-Driven Feature Encoding for Nighttime UAV Tracking

arXiv:2603.19628v1 Announce Type: cross Abstract: Robust feature encoding constitutes the foundation of UAV tracking by enabling the nuanced perception of target appearance and motion, thereby playing a pivotal role in ensuring reliable tracking. However, existing feature encoding methods often overlook critical illumination and viewpoint cues, which are essential for robust perception under challenging nighttime conditions, […]

Revealing Domain-Spatiality Patterns for Configuration Tuning: Domain Knowledge Meets Fitness Landscapes

arXiv:2603.19897v1 Announce Type: cross Abstract: Configuration tuning for better performance is crucial in quality assurance. Yet, there has long been a mystery on tuners’ effectiveness, due to the black-box nature of configurable systems. Prior efforts predominantly adopt static domain analysis (e.g., static taint analysis), which often lacks generalizability, or dynamic data analysis (e.g., benchmarking performance […]

A Multi-Perspective Benchmark and Moderation Model for Evaluating Safety and Adversarial Robustness

arXiv:2601.03273v2 Announce Type: replace-cross Abstract: As large language models (LLMs) become deeply embedded in daily life, the urgent need for safer moderation systems that distinguish between naive and harmful requests while upholding appropriate censorship boundaries has never been greater. While existing LLMs can detect dangerous or unsafe content, they often struggle with nuanced cases such […]

X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving

arXiv:2603.19979v1 Announce Type: cross Abstract: Scalable and reliable evaluation is increasingly critical in the end-to-end era of autonomous driving, where vision–language–action (VLA) policies directly map raw sensor streams to driving actions. Yet, current evaluation pipelines still rely heavily on real-world road testing, which is costly, biased toward limited scenario coverage, and difficult to reproduce. These […]

RobotArena $infty$: Scalable Robot Benchmarking via Real-to-Sim Translation

arXiv:2510.23571v3 Announce Type: replace-cross Abstract: The pursuit of robot generalists, agents capable of performing diverse tasks across diverse environments, demands rigorous and scalable evaluation. Yet real-world testing of robot policies remains fundamentally constrained: it is labor-intensive, slow, unsafe at scale, and difficult to reproduce. As policies expand in scope and complexity, these barriers only intensify, […]

Fine-tuning Timeseries Predictors Using Reinforcement Learning

arXiv:2603.20063v1 Announce Type: cross Abstract: This chapter presents three major reinforcement learning algorithms used for fine-tuning financial forecasters. We propose a clear implementation plan for backpropagating the loss of a reinforcement learning task to a model trained using supervised learning, and compare the performance before and after the fine-tuning. We find an increase in performance […]

Spectral Convolution on Orbifolds for Geometric Deep Learning

arXiv:2602.14997v2 Announce Type: replace-cross Abstract: Geometric deep learning (GDL) deals with supervised learning on data domains that go beyond Euclidean structure, such as data with graph or manifold structure. Due to the demand that arises from application-related data, there is a need to identify further topological and geometric structures with which these use cases can […]

Conditioning Protein Generation via Hopfield Pattern Multiplicity

arXiv:2603.20115v1 Announce Type: cross Abstract: Protein sequence generation via stochastic attention produces plausible family members from small alignments without training, but treats all stored sequences equally and cannot direct generation toward a functional subset of interest. We show that a single scalar parameter, added as a bias to the sampler’s attention logits, continuously shifts generation […]

PDDL Axioms Are Equivalent to Least Fixed Point Logic (Extended Version)

arXiv:2510.14412v2 Announce Type: replace Abstract: Axioms are a feature of the Planning Domain Definition Language PDDL that can be considered as a generalization of database query languages such as Datalog. The PDDL standard restricts negative occurrences of predicates in axiom bodies to predicates that are directly set by actions and not derived by axioms. In […]

Measuring Faithfulness Depends on How You Measure: Classifier Sensitivity in LLM Chain-of-Thought Evaluation

arXiv:2603.20172v1 Announce Type: cross Abstract: Recent work on chain-of-thought (CoT) faithfulness reports single aggregate numbers (e.g., DeepSeek-R1 acknowledges hints 39% of the time), implying that faithfulness is an objective, measurable property of a model. This paper demonstrates that it is not. Three classifiers (a regex-only detector, a two-stage regex-plus-LLM pipeline, and an independent Claude Sonnet […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844