arXiv:2510.08005v3 Announce Type: replace-cross Abstract: Traditional bug-tracking systems rely heavily on manual reporting, reproduction, classification, and resolution, involving multiple stakeholders such as end users, customer support, developers, and testers. This division of responsibilities requires substantial coordination and human effort, widens the communication gap between non-technical users and developers, and significantly slows the process from bug […]
LLMON: An LLM-native Markup Language to Leverage Structure and Semantics at the LLM Interface
arXiv:2603.22519v2 Announce Type: replace-cross Abstract: Textual Large Language Models (LLMs) provide a simple and familiar interface: a string of text is used for both input and output. However, the information conveyed to an LLM often has a richer structure and semantics, which is not conveyed in a string. For example, most prompts contain both instructions […]
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
arXiv:2512.08829v2 Announce Type: replace-cross Abstract: Vision-Language Models (VLMs) are increasingly tasked with ultra-long multimodal understanding. While linear architectures offer constant computation and memory footprints, they often struggle with high-frequency visual perception compared to standard Transformers. To bridge this gap, we introduce textbfInfiniteVL. We first develop a hybrid base model called textbfInfiniteVL-Base that interleaves a small […]
Reducing Complexity for Quantum Approaches in Train Load Optimization
arXiv:2603.29543v1 Announce Type: cross Abstract: Efficiently planning container loads onto trains is a computationally challenging combinatorial optimization problem, central to logistics and supply chain management. A primary source of this complexity arises from the need to model and reduce rehandle operations-unproductive crane moves required to access blocked containers. Conventional mathematical formulations address this by introducing […]
Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models
arXiv:2602.15772v2 Announce Type: replace-cross Abstract: Current research in multimodal models faces a key challenge where enhancing generative capabilities often comes at the expense of understanding, and vice versa. We analyzed this trade-off and identify the primary cause might be the potential conflict between generation and understanding, which creates a competitive dynamic within the model. To […]
LPNSR: Prior-Enhanced Diffusion Image Super-Resolution via LR-Guided Noise Prediction
arXiv:2603.21045v3 Announce Type: replace-cross Abstract: Diffusion-based image super-resolution (SR), which aims to reconstruct high-resolution (HR) images from corresponding low-resolution (LR) observations, faces a fundamental trade-off between inference efficiency and reconstruction quality. The state-of-the-art residual-shifting diffusion framework achieves efficient 4-step inference, yet suffers from severe performance degradation in compact sampling trajectories. This is mainly attributed to […]
Stronger Normalization-Free Transformers
arXiv:2512.10938v2 Announce Type: replace-cross Abstract: Although normalization layers have long been viewed as indispensable components of deep learning architectures, the recent introduction of Dynamic Tanh (DyT) has demonstrated that alternatives are possible. The point-wise function DyT constrains extreme values for stable convergence and reaches normalization-level performance; this work seeks further for function designs that can […]
Mean Masked Autoencoder with Flow-Mixing for Encrypted Traffic Classification
arXiv:2603.29537v1 Announce Type: cross Abstract: Network traffic classification using self-supervised pre-training models based on Masked Autoencoders (MAE) has demonstrated a huge potential. However, existing methods are confined to isolated byte-level reconstruction of individual flows, lacking adequate perception of the multi-granularity contextual relationship in traffic. To address this limitation, we propose Mean MAE (MMAE), a teacher-student […]
DGPO: RL-Steered Graph Diffusion for Neural Architecture Generation
arXiv:2602.19261v2 Announce Type: replace-cross Abstract: Reinforcement learning fine-tuning has proven effective for steering generative diffusion models toward desired properties in image and molecular domains. Graph diffusion models have similarly been applied to combinatorial structure generation, including neural architecture search (NAS). However, neural architectures are directed acyclic graphs (DAGs) where edge direction encodes functional semantics such […]
X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving
arXiv:2603.19979v2 Announce Type: replace-cross Abstract: Scalable and reliable evaluation is increasingly critical in the end-to-end era of autonomous driving, where vision–language–action (VLA) policies directly map raw sensor streams to driving actions. Yet, current evaluation pipelines still rely heavily on real-world road testing, which is costly, biased toward limited scenario coverage, and difficult to reproduce. These […]
FedRG: Unleashing the Representation Geometry for Federated Learning with Noisy Clients
arXiv:2603.19722v2 Announce Type: replace-cross Abstract: Federated learning (FL) suffers from performance degradation due to the inevitable presence of noisy annotations in distributed scenarios. Existing approaches have advanced in distinguishing noisy samples from the dataset for label correction by leveraging loss values. However, noisy samples recognition relying on scalar loss lacks reliability for FL under heterogeneous […]
Quantization with Unified Adaptive Distillation to enable multi-LoRA based one-for-all Generative Vision Models on edge
arXiv:2603.29535v1 Announce Type: cross Abstract: Generative Artificial Intelligence (GenAI) features such as image editing, object removal, and prompt-guided image transformation are increasingly integrated into mobile applications. However, deploying Large Vision Models (LVMs) for such tasks on resource-constrained devices remains challenging due to their high memory and compute requirements. While Low-Rank Adapters (LoRAs) enable parameter-efficient task […]