Stronger Normalization-Free Transformers

arXiv:2512.10938v2 Announce Type: replace-cross Abstract: Although normalization layers have long been viewed as indispensable components of deep learning architectures, the recent introduction of Dynamic Tanh (DyT) has demonstrated that alternatives are possible. The point-wise function DyT constrains extreme values for stable convergence and reaches normalization-level performance; this work searches further for function designs that can […]
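The DyT baseline this paper builds on replaces LayerNorm with an element-wise tanh squashing. A minimal sketch, assuming the common formulation DyT(x) = γ·tanh(αx) + β with learnable α, γ, β (this paper's further designs are not shown here):

```python
import math

def dyt(x, alpha=1.0, gamma=1.0, beta=0.0):
    """Dynamic Tanh (DyT): element-wise tanh(alpha * x) with a learnable
    scale (gamma) and shift (beta); a drop-in replacement for LayerNorm
    that needs no batch or layer statistics."""
    return [gamma * math.tanh(alpha * v) + beta for v in x]

# Extreme activations are squashed into (-1, 1), which is what
# "constrains extreme values for stable convergence" refers to.
out = dyt([-100.0, 0.0, 100.0])
```

Unlike normalization, DyT costs only one point-wise function per activation, with no reductions across the feature dimension.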

Mean Masked Autoencoder with Flow-Mixing for Encrypted Traffic Classification

arXiv:2603.29537v1 Announce Type: cross Abstract: Network traffic classification using self-supervised pre-training models based on Masked Autoencoders (MAE) has demonstrated great potential. However, existing methods are confined to isolated byte-level reconstruction of individual flows, lacking adequate perception of the multi-granularity contextual relationship in traffic. To address this limitation, we propose Mean MAE (MMAE), a teacher-student […]

DGPO: RL-Steered Graph Diffusion for Neural Architecture Generation

arXiv:2602.19261v2 Announce Type: replace-cross Abstract: Reinforcement learning fine-tuning has proven effective for steering generative diffusion models toward desired properties in image and molecular domains. Graph diffusion models have similarly been applied to combinatorial structure generation, including neural architecture search (NAS). However, neural architectures are directed acyclic graphs (DAGs) where edge direction encodes functional semantics such […]

X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving

arXiv:2603.19979v2 Announce Type: replace-cross Abstract: Scalable and reliable evaluation is increasingly critical in the end-to-end era of autonomous driving, where vision–language–action (VLA) policies directly map raw sensor streams to driving actions. Yet, current evaluation pipelines still rely heavily on real-world road testing, which is costly, biased toward limited scenario coverage, and difficult to reproduce. These […]

FedRG: Unleashing the Representation Geometry for Federated Learning with Noisy Clients

arXiv:2603.19722v2 Announce Type: replace-cross Abstract: Federated learning (FL) suffers from performance degradation due to the inevitable presence of noisy annotations in distributed scenarios. Existing approaches have advanced in distinguishing noisy samples from the dataset for label correction by leveraging loss values. However, noisy-sample recognition that relies on scalar loss values lacks reliability for FL under heterogeneous […]

Quantization with Unified Adaptive Distillation to enable multi-LoRA based one-for-all Generative Vision Models on edge

arXiv:2603.29535v1 Announce Type: cross Abstract: Generative Artificial Intelligence (GenAI) features such as image editing, object removal, and prompt-guided image transformation are increasingly integrated into mobile applications. However, deploying Large Vision Models (LVMs) for such tasks on resource-constrained devices remains challenging due to their high memory and compute requirements. While Low-Rank Adapters (LoRAs) enable parameter-efficient task […]
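The Low-Rank Adapters (LoRAs) this abstract refers to add a trainable low-rank update on top of a frozen weight: y = Wx + (α/r)·B·A·x. A minimal sketch with tiny hypothetical shapes (plain Python; this illustrates standard LoRA, not the paper's quantization or distillation scheme):

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=1.0, r=1):
    """Frozen weight W plus low-rank adapter: y = W x + (alpha/r) * B (A x).
    A is (r x d_in), B is (d_out x r); only A and B are trained, so each
    task adapter stores far fewer parameters than W."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + (alpha / r) * d for b, d in zip(base, delta)]

# d_in = d_out = 2 and rank r = 1, purely for illustration.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base weight (identity here)
A = [[1.0, 1.0]]               # down-projection, shape (1 x 2)
B = [[0.5], [0.5]]             # up-projection, shape (2 x 1)
y = lora_forward(W, A, B, [2.0, 3.0])
```

Because W stays frozen, a single base model can serve many tasks by swapping small (A, B) pairs, which is the "multi-LoRA, one-for-all" setting the paper targets on edge devices.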

Inducing Sustained Creativity and Diversity in Large Language Models

arXiv:2603.19519v2 Announce Type: replace-cross Abstract: We address a not-widely-recognized subset of exploratory search, where a user sets out on a typically long “search quest” for the perfect wedding dress, overlooked research topic, killer company idea, etc. The first few outputs of current large language models (LLMs) may be helpful but only as a start, since […]

Mind the Gap: A Framework for Assessing Pitfalls in Multimodal Active Learning

arXiv:2603.29677v1 Announce Type: cross Abstract: Multimodal learning enables neural networks to integrate information from heterogeneous sources, but active learning in this setting faces distinct challenges. These include missing modalities, differences in modality difficulty, and varying interaction structures. These are issues absent in the unimodal case. While the behavior of active learning strategies in unimodal settings […]

Sampling at intermediate temperatures is optimal for training large language models in protein structure prediction

arXiv:2603.29529v1 Announce Type: cross Abstract: We investigate the parameter space of transformer models trained on protein sequence data using a statistical mechanics framework, sampling the loss landscape at varying temperatures by Langevin dynamics to characterize the low-loss manifold and understand the mechanisms underlying the superior performance of transformers in protein structure prediction. We find that, […]
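The sampling procedure named in the abstract, Langevin dynamics at a temperature T, follows gradient descent plus Gaussian noise scaled by √(2ηT), whose stationary distribution is proportional to exp(−L(θ)/T). A minimal sketch on a toy quadratic loss (the transformer loss landscape in the paper is of course far higher-dimensional):

```python
import random

def langevin_sample(grad, theta0, eta=0.01, temperature=1.0, steps=5000, seed=0):
    """Overdamped Langevin dynamics on a scalar parameter:
    theta <- theta - eta * grad(theta) + sqrt(2 * eta * T) * noise.
    At stationarity this samples from exp(-L(theta) / T)."""
    rng = random.Random(seed)
    theta = theta0
    samples = []
    for _ in range(steps):
        noise = rng.gauss(0.0, 1.0)
        theta = theta - eta * grad(theta) + (2.0 * eta * temperature) ** 0.5 * noise
        samples.append(theta)
    return samples

# Toy loss L(theta) = theta^2 / 2, so grad(theta) = theta; the stationary
# distribution is then Gaussian with variance ~ temperature.
samples = langevin_sample(grad=lambda t: t, theta0=0.0, temperature=0.5)
```

Lowering the temperature concentrates samples near loss minima; the paper's claim is that an intermediate T explores the low-loss manifold most usefully.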

DIAL: Decoupling Intent and Action via Latent World Modeling for End-to-End VLA

arXiv:2603.29844v1 Announce Type: cross Abstract: The development of Vision-Language-Action (VLA) models has been significantly accelerated by pre-trained Vision-Language Models (VLMs). However, most existing end-to-end VLAs treat the VLM primarily as a multimodal encoder, directly mapping vision-language features to low-level actions. This paradigm underutilizes the VLM’s potential in high-level decision making and introduces training instability, frequently […]

How do LLMs Compute Verbal Confidence?

arXiv:2603.17839v2 Announce Type: replace-cross Abstract: Verbal confidence — prompting LLMs to state their confidence as a number or category — is widely used to extract uncertainty estimates from black-box models. However, how LLMs internally generate such scores remains unknown. We address two questions: first, when confidence is computed – just-in-time when requested, or automatically during […]

Interview-Informed Generative Agents for Product Discovery: A Validation Study

arXiv:2603.29890v1 Announce Type: cross Abstract: Large language models (LLMs) have shown strong performance on standardized social science instruments, but their value for product discovery remains unclear. We investigate whether interview-informed generative agents can simulate user responses in concept testing scenarios. Using in-depth workflow interviews with knowledge workers, we created personalized agents and compared their evaluations […]


Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK; registration number 16808844.