Behavior change beyond intervention: an activity-theoretical perspective on human-centered design of personal health technology

IntroductionModern personal technologies, such as smartphone apps with artificial intelligence (AI) capabilities, have a significant potential for helping people make necessary changes in their behavior

A data-centric perspective on designing AI foundation models for healthcare

Post Content

Disclosure in the era of generative artificial intelligence

Generative artificial intelligence (AI) has rapidly become embedded in academic writing, assisting with tasks ranging from language editing to drafting text and producing evidence. Despite

When Policies Cannot Be Retrained: A Unified Closed-Form View of Post-Training Steering in Offline Reinforcement Learning

arXiv:2604.22873v1 Announce Type: cross Abstract: Offline reinforcement learning (RL) can learn effective policies from fixed datasets, but deployment objectives may change after training, and in

Utility-Aware Data Pricing: Token-Level Quality and Empirical Training Gain for LLMs

arXiv:2604.22893v1 Announce Type: cross Abstract: Traditional data valuation methods based on “row-count $times$ quality coefficient” paradigms fail to capture the nuanced, nonlinear contributions that data

HiRO-Nav: Hybrid ReasOning Enables Efficient Embodied Navigation

April 10, 2026

arXiv:2604.08232v1 Announce Type: new
Abstract: Embodied navigation agents built upon large reasoning models (LRMs) can handle complex, multimodal environmental input and perform grounded reasoning per step to improve sequential decision-making for long-horizon tasks. However, a critical question remains: textithow can the reasoning capabilities of LRMs be harnessed intelligently and efficiently for long-horizon navigation tasks? In simple scenes, agents are expected to act reflexively, while in complex ones they should engage in deliberate reasoning before acting.To achieve this, we introduce textbfHybrtextbfid textbfReastextbfOning textbfNavigation (textbfHiRO-Nav) agent, the first kind of agent capable of adaptively determining whether to perform thinking at every step based on its own action entropy. Specifically, by examining how the agent’s action entropy evolves over the navigation trajectories, we observed that only a small fraction of actions exhibit high entropy, and these actions often steer the agent toward novel scenes or critical objects. Furthermore, studying the relationship between action entropy and task completion (i.e., Q-value) reveals that improving high-entropy actions contributes more positively to task success.Hence, we propose a tailored training pipeline comprising hybrid supervised fine-tuning as a cold start, followed by online reinforcement learning with the proposed hybrid reasoning strategy to explicitly activate reasoning only for high-entropy actions, significantly reducing computational overhead while improving decision quality. Extensive experiments on the textscCHORES-$mathbbS$ ObjectNav benchmark showcases that HiRO-Nav achieves a better trade-off between success rates and token efficiency than both dense-thinking and no-thinking baselines.

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844