Training Language Agents to Learn from Experience

arXiv:2605.20477v1 Announce Type: cross Abstract: Language agents can adapt from experience in interactive environments, but current reflection-based methods can only self-correct within a single task

ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning

arXiv:2605.20385v1 Announce Type: cross Abstract: Recent progress in promptable segmentation has shifted visual perception from object-level localization toward concept-level understanding. However, the notion of a

Modeling Emotional Dynamics in Agent-to-Agent Interactions on Moltbook

arXiv:2605.20442v1 Announce Type: cross Abstract: Generative AI systems are increasingly deployed as interactive agents in online environments, such as a social network called Moltbook. In

VISTA: Technical Report for the Ego4D Short-Term Object Interaction Anticipation at EgoVis 2026

arXiv:2605.20901v1 Announce Type: cross Abstract: We propose VISTA, a V-JEPA Integrated StillFast Temporal Anticipator for the Ego4D Short-Term Object Interaction Anticipation (STA) Challenge at EgoVis

Beyond Text-to-SQL: An Agentic LLM System for Governed Enterprise Analytics APIs

arXiv:2605.21027v1 Announce Type: cross Abstract: Enterprise analytics aims to make organizational data accessible for decision-making, yet non-technical users still face barriers when using traditional business

AMAR: Lightweight Attention-Based Multi-User Activity Recognition from Wi-Fi CSI

May 21, 2026

arXiv:2605.20649v1 Announce Type: cross
Abstract: Wi-Fi-based human activity recognition (HAR) has emerged as a promising approach to contactless sensing that leverages channel state information (CSI) collected from wireless transceivers. Existing studies have concentrated primarily on single-user scenarios, but real-world deployments often involve multiple users whose concurrent movements induce overlapping CSI patterns that challenge conventional classification methods. To address this limitation, this paper introduces an attention-based multi-user activity recognition (AMAR) framework that formulates HAR as a set prediction problem. The transformer-based architecture in AMAR uses learnable query embeddings that act as specialized activity detectors, enabling the simultaneous identification of multiple activities from composite CSI representations. To address deployment constraints, AMAR adopts an edge-cloud split architecture: lightweight convolutional networks on edge devices perform initial feature extraction, followed by residual vector quantization that substantially reduces bandwidth while preserving activity-discriminative information. The cloud component performs the final activity prediction through attention-based set matching, enabling the system to handle varying occupancy levels. Averaged across classroom, meeting-room, and empty-room environments, AMAR nearly doubles the rate of perfectly predicting all concurrent activities compared to the best baseline. It also achieves an $F_1$-score of 53.4% compared to 45.6% for the best baseline, and reduces occupancy estimation error by 74%, while substantially reducing bandwidth.
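The residual vector quantization step named in the abstract can be illustrated with a minimal sketch. This is not the AMAR implementation: the function names (`make_codebooks`, `rvq_encode`, `rvq_decode`), the random untrained codebooks, the zero codeword, and all sizes are assumptions chosen for illustration, whereas a real system would presumably learn its codebooks from edge features. Each stage quantizes whatever error the previous stage left behind, so only a short list of integer indices needs to cross the edge-cloud link.

```python
import random


def make_codebooks(num_stages, size, dim, seed=0):
    """Build random codebooks (hypothetical; a real system would learn them)."""
    rng = random.Random(seed)
    books = []
    for stage in range(num_stages):
        scale = 0.5 ** stage  # later stages model smaller residuals
        book = [[rng.gauss(0.0, scale) for _ in range(dim)]
                for _ in range(size - 1)]
        # A zero codeword guarantees a stage never increases the error.
        book.append([0.0] * dim)
        books.append(book)
    return books


def nearest(codebook, vec):
    """Index of the codeword closest to vec in squared Euclidean distance."""
    def dist(code):
        return sum((a - b) ** 2 for a, b in zip(code, vec))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(vec, codebooks):
    """Quantize vec stage by stage; return the indices and the final residual."""
    residual = list(vec)
    indices = []
    for book in codebooks:
        i = nearest(book, residual)
        indices.append(i)
        # Pass what this stage failed to capture on to the next stage.
        residual = [r - c for r, c in zip(residual, book[i])]
    return indices, residual


def rvq_decode(indices, codebooks):
    """Reconstruct the vector by summing the selected codeword of each stage."""
    out = [0.0] * len(codebooks[0][0])
    for i, book in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, book[i])]
    return out


if __name__ == "__main__":
    books = make_codebooks(num_stages=4, size=16, dim=8)
    feature = [0.3, -1.2, 0.7, 0.0, 2.1, -0.4, 0.9, -0.8]
    idx, res = rvq_encode(feature, books)
    print("indices sent to the cloud:", idx)
    print("squared reconstruction error:", sum(e * e for e in res))
```

In this toy configuration, four stages with 16 codewords each cost 4 × 4 = 16 bits per feature vector, versus 8 × 32 = 256 bits for eight 32-bit floats. These figures describe only the sketch and are not results from the paper.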

