Investigating Gender Stereotypes in Large Language Models via Social Determinants of Health

arXiv:2603.09416v1 Announce Type: cross Abstract: Large Language Models (LLMs) excel in Natural Language Processing (NLP) tasks, but they often propagate biases embedded in their training data, a risk that is especially consequential in sensitive domains like healthcare. While existing benchmarks evaluate biases related to individual social determinants of health (SDoH) such as gender or ethnicity, they often […]

TaoSR1: The Thinking Model for E-commerce Relevance Search

arXiv:2508.12365v4 Announce Type: replace-cross Abstract: Query-product relevance prediction is a core task in e-commerce search. BERT-based models excel at semantic matching but lack complex reasoning capabilities. While Large Language Models (LLMs) have been explored for this task, most approaches still use discriminative fine-tuning or distill them into smaller models for deployment. We propose a framework to directly deploy LLMs for this […]

Reasoning Efficiently Through Adaptive Chain-of-Thought Compression: A Self-Optimizing Framework

arXiv:2509.14093v2 Announce Type: replace-cross Abstract: Chain-of-Thought (CoT) reasoning enhances Large Language Models (LLMs) by prompting intermediate steps, improving accuracy and robustness in arithmetic, logic, and commonsense tasks. However, this benefit comes with high computational costs: longer outputs increase latency, memory usage, and KV-cache demands. These issues are especially critical in software engineering tasks where concise […]
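The trade-off the abstract describes (longer reasoning traces improve accuracy but inflate latency and KV-cache usage) can be illustrated with a toy compression pass that keeps only a trace's first and final steps. This is a purely hypothetical sketch for intuition, not the paper's self-optimizing framework; the function name and trace format are assumptions.

```python
def compress_cot(trace: str, keep_head: int = 1, keep_tail: int = 1) -> str:
    """Naively shorten a chain-of-thought trace by dropping middle steps.

    Assumes one reasoning step per line, with the final line carrying
    the answer. Illustrative heuristic only.
    """
    steps = [s for s in trace.strip().splitlines() if s.strip()]
    if len(steps) <= keep_head + keep_tail:
        return "\n".join(steps)
    # Keep the opening step(s) and the answer, elide everything between.
    kept = steps[:keep_head] + ["..."] + steps[-keep_tail:]
    return "\n".join(kept)

trace = (
    "Step 1: 17 * 24 = 17 * 20 + 17 * 4\n"
    "Step 2: 17 * 20 = 340\n"
    "Step 3: 17 * 4 = 68\n"
    "Answer: 408"
)
short = compress_cot(trace)
```

A real adaptive scheme would decide per-problem how much of the trace to keep; this fixed head/tail rule only makes the cost side of the trade-off concrete.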

Reviving ConvNeXt for Efficient Convolutional Diffusion Models

arXiv:2603.09408v1 Announce Type: cross Abstract: Recent diffusion models increasingly favor Transformer backbones, motivated by the remarkable scalability of fully attentional architectures. Yet the locality bias, parameter efficiency, and hardware friendliness, the attributes that established ConvNets as the efficient vision backbone, have seen limited exploration in modern generative modeling. Here we introduce the fully convolutional diffusion model (FCDM), […]

Latent Generative Models with Tunable Complexity for Compressed Sensing and other Inverse Problems

arXiv:2603.07357v2 Announce Type: replace-cross Abstract: Generative models have emerged as powerful priors for solving inverse problems. These models typically represent a class of natural signals using a single fixed complexity or dimensionality. This can be limiting: depending on the problem, a fixed complexity may result in high representation error if too small, or overfitting to […]
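For readers unfamiliar with the compressed-sensing setting this abstract builds on, here is a minimal generic recovery loop, iterative soft-thresholding (ISTA), for y = Ax under a sparsity prior. It is a standard baseline sketch, not the latent-generative method the paper proposes, and the problem sizes and parameters are illustrative.

```python
import numpy as np

def ista(A, y, lam=0.01, iters=500):
    """ISTA for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = x - (A.T @ (A @ x - y)) / L    # gradient step on the data term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(0)
m, n, k = 50, 100, 5                       # 50 measurements of a 100-dim signal
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true                             # underdetermined measurements
x_hat = ista(A, y)                         # sparse reconstruction
```

Here the "prior" is a fixed l1 sparsity penalty; the abstract's point is that replacing it with a generative model of fixed complexity can under- or over-fit depending on the problem.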

PromptDLA: A Domain-aware Prompt Document Layout Analysis Framework with Descriptive Knowledge as a Cue

arXiv:2603.09414v1 Announce Type: cross Abstract: Document Layout Analysis (DLA) is crucial for document artificial intelligence and has recently received increasing attention, resulting in an influx of large-scale public DLA datasets. Existing work often combines data from various domains in recent public DLA datasets to improve the generalization of DLA. However, directly merging these datasets for […]

IMSE: Intrinsic Mixture of Spectral Experts Fine-tuning for Test-Time Adaptation

arXiv:2603.07926v2 Announce Type: replace-cross Abstract: Test-time adaptation (TTA) has been widely explored to prevent performance degradation when test data differ from the training distribution. However, fully leveraging the rich representations of large pretrained models with minimal parameter updates remains underexplored. In this paper, we propose Intrinsic Mixture of Spectral Experts (IMSE) that leverages the spectral […]

Hydrodynamic origins of symmetric swimming strategies

arXiv:2603.08444v2 Announce Type: replace-cross Abstract: Efficient locomotion is important for the evolution of complex life, yet the physical principles selecting specific swimming strokes often remain entangled with biological constraints. In viscous fluids, the scallop theorem constrains the temporal organization of strokes, but no analogous principle is known for their spatial structure, leaving the prevalence of […]

From Flow to One Step: Real-Time Multi-Modal Trajectory Policies via Implicit Maximum Likelihood Estimation-based Distribution Distillation

arXiv:2603.09415v1 Announce Type: cross Abstract: Generative policies based on diffusion and flow matching achieve strong performance in robotic manipulation by modeling multi-modal human demonstrations. However, their reliance on iterative Ordinary Differential Equation (ODE) integration introduces substantial latency, limiting high-frequency closed-loop control. Recent single-step acceleration methods alleviate this overhead but often exhibit distributional collapse, producing averaged […]

PostTrainBench: Can LLM Agents Automate LLM Post-Training?

arXiv:2603.08640v2 Announce Type: replace-cross Abstract: AI agents have become surprisingly proficient at software engineering over the past year, largely due to improvements in reasoning capabilities. This raises a deeper question: can these systems extend their capabilities to automate AI research itself? In this paper, we explore post-training, the critical phase that turns base LLMs into […]

UAT-LITE: Inference-Time Uncertainty-Aware Attention for Pretrained Transformers

arXiv:2602.02952v2 Announce Type: replace Abstract: Neural NLP models are often miscalibrated and overconfident, assigning high confidence to incorrect predictions and failing to express uncertainty during internal evidence aggregation. This undermines selective prediction and high-stakes deployment. Post-hoc calibration methods adjust output probabilities but leave internal computation unchanged, while ensemble and Bayesian approaches improve uncertainty at substantial […]
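The post-hoc calibration that the abstract contrasts against typically means rescaling a trained model's output probabilities while leaving its internal computation untouched. A minimal sketch of one common such method, temperature scaling, is given below as an assumed illustration of the baseline family, not of UAT-LITE itself.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def temperature_scale(logits, T):
    """Post-hoc calibration: divide logits by a temperature T > 1 to soften
    overconfident predictions. Internal evidence aggregation is unchanged."""
    return softmax(np.asarray(logits, dtype=float) / T)

logits = [4.0, 1.0, 0.5]
p_raw = temperature_scale(logits, T=1.0)   # original, overconfident probabilities
p_cal = temperature_scale(logits, T=2.0)   # softened probabilities
```

Because dividing logits by T is monotone, the predicted class never changes; only confidence does, which is exactly the limitation the abstract points to when it says such methods leave internal computation unchanged.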

ICDAR 2025 Competition on End-to-End Document Image Machine Translation Towards Complex Layouts

arXiv:2603.09392v1 Announce Type: cross Abstract: Document Image Machine Translation (DIMT) seeks to translate text embedded in document images from one language to another by jointly modeling both textual content and page layout, bridging optical character recognition (OCR) and natural language processing (NLP). The DIMT 2025 Challenge advances research on end-to-end document image translation, a rapidly […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.