arXiv:2603.06863v1 Announce Type: cross Abstract: Trajectory prediction for flying objects is critical in domains ranging from sports analytics to aerospace. However, traditional methods struggle with complex physical modeling, computational inefficiencies, and high hardware demands, often neglecting critical trajectory events like landing points. This paper introduces a novel, hardware-efficient trajectory prediction framework that integrates environmental priors […]
Looking Back and Forth: Cross-Image Attention Calibration and Attentive Preference Learning for Multi-Image Hallucination Mitigation
arXiv:2603.07048v1 Announce Type: cross Abstract: Although large vision-language models (LVLMs) have demonstrated remarkable capabilities, they are prone to hallucinations in multi-image tasks. We attribute this issue to limitations in existing attention mechanisms and insufficient cross-image modeling. Inspired by this, we propose a structured hallucination mitigation framework involving Cross-Image Attention calibration and Preference Learning (CAPL). CAPL […]
Safe Transformer: An Explicit Safety Bit For Interpretable And Controllable Alignment
arXiv:2603.06727v1 Announce Type: cross Abstract: Current safety alignment methods encode safe behavior implicitly within model parameters, creating a fundamental opacity: we cannot easily inspect why a model refuses a request, nor intervene when its safety judgments fail. We propose Safe Transformer, a modular approach that augments pre-trained language models by inserting a discrete information bottleneck […]
Gradient-based Nested Co-Design of Aerodynamic Shape and Control for Winged Robots
arXiv:2603.06760v1 Announce Type: cross Abstract: Designing aerial robots for specialized tasks, from perching to payload delivery, requires tailoring their aerodynamic shape to specific mission requirements. For tasks involving wide flight envelopes, the usual sequential process of first determining the shape and then the motion planner is likely to be suboptimal due to the inherent nonlinear […]
Integration of deep generative Anomaly Detection algorithm in high-speed industrial line
arXiv:2603.07577v1 Announce Type: cross Abstract: Industrial visual inspection in pharmaceutical production requires high accuracy under strict constraints on cycle time, hardware footprint, and operational cost. Manual inline inspection is still common, but it is affected by operator variability and limited throughput. Classical rule-based computer vision pipelines are often rigid and difficult to scale to highly […]
ELLMob: Event-Driven Human Mobility Generation with Self-Aligned LLM Framework
arXiv:2603.07946v1 Announce Type: cross Abstract: Human mobility generation aims to synthesize plausible trajectory data, which is widely used in urban system research. While Large Language Model-based methods excel at generating routine trajectories, they struggle to capture deviated mobility during large-scale societal events. This limitation stems from two critical gaps: (1) the absence of event-annotated mobility […]
Sparsity and Out-of-Distribution Generalization
arXiv:2603.07388v1 Announce Type: cross Abstract: Explaining out-of-distribution generalization has been a central problem in epistemology since Goodman’s “grue” puzzle in 1946. Today it’s a central problem in machine learning, including AI alignment. Here we propose a principled account of OOD generalization with three main ingredients. First, the world is always presented to experience not as […]
Cross-Modal Taxonomic Generalization in (Vision-) Language Models
arXiv:2603.07474v1 Announce Type: cross Abstract: What is the interplay between semantic representations learned by language models (LMs) from surface form alone and those learned from more grounded evidence? We study this question for a scenario where part of the input comes from a different modality, in our case in a vision-language model (VLM), where […]
Learning embeddings of non-linear PDEs: the Burgers’ equation
arXiv:2603.07812v1 Announce Type: cross Abstract: Embeddings provide low-dimensional geometric representations that organize complex function spaces and support efficient retrieval, comparison, and generalization. In this work we generalize the concept to Physics-Informed Neural Networks. We present a method to construct solution embedding spaces of nonlinear partial differential equations […]
Thinking with Gaze: Sequential Eye-Tracking as Visual Reasoning Supervision for Medical VLMs
arXiv:2603.06697v1 Announce Type: cross Abstract: Vision–language models (VLMs) process images as visual tokens, yet their intermediate reasoning is often carried out in text, which can be suboptimal for visually grounded radiology tasks. Radiologists instead diagnose via sequential visual search; eye-tracking captures this process as time-ordered gaze trajectories that reveal how evidence is acquired over time. […]
Stabilizing Reinforcement Learning for Diffusion Language Models
arXiv:2603.06743v1 Announce Type: cross Abstract: Group Relative Policy Optimization (GRPO) is highly effective for post-training autoregressive (AR) language models, yet its direct application to diffusion large language models (dLLMs) often triggers reward collapse. We identify two sources of incompatibility. First, GRPO relies on importance ratios defined by sequence probabilities, which are intractable in dLLMs and […]
A Hybrid Machine Learning Model for Cerebral Palsy Detection
arXiv:2603.06803v1 Announce Type: cross Abstract: The development of effective treatments for Cerebral Palsy (CP) can begin with the early identification of affected children while they are still in the early stages of the disorder. Pathological issues in the brain can be better diagnosed with the use of one of many medical imaging techniques. Magnetic Resonance […]