Online Risk-Averse Planning in POMDPs Using Iterated CVaR Value Function

arXiv:2601.20554v1 Announce Type: new Abstract: We study risk-sensitive planning under partial observability using the dynamic risk measure Iterated Conditional Value-at-Risk (ICVaR). A policy evaluation algorithm for ICVaR is developed with finite-time performance guarantees that do not depend on the cardinality of the action space. Building on this foundation, three widely used online planning algorithms–Sparse Sampling, […]

Conditional PED-ANOVA: Hyperparameter Importance in Hierarchical & Dynamic Search Spaces

arXiv:2601.20800v1 Announce Type: cross Abstract: We propose conditional PED-ANOVA (condPED-ANOVA), a principled framework for estimating hyperparameter importance (HPI) in conditional search spaces, where the presence or domain of a hyperparameter can depend on other hyperparameters. Although the original PED-ANOVA provides a fast and efficient way to estimate HPI within the top-performing regions of the search […]

Dialogical Reasoning Across AI Architectures: A Multi-Model Framework for Testing AI Alignment Strategies

arXiv:2601.20604v1 Announce Type: new Abstract: This paper introduces a methodological framework for empirically testing AI alignment strategies through structured multi-model dialogue. Drawing on Peace Studies traditions – particularly interest-based negotiation, conflict transformation, and commons governance – we operationalize Viral Collaborative Wisdom (VCW), an approach that reframes alignment from a control problem to a relationship problem […]

Post-Training Fairness Control: A Single-Train Framework for Dynamic Fairness in Recommendation

arXiv:2601.20848v1 Announce Type: cross Abstract: Despite growing efforts to mitigate unfairness in recommender systems, existing fairness-aware methods typically fix the fairness requirement at training time and provide limited post-training flexibility. However, in real-world scenarios, diverse stakeholders may demand differing fairness requirements over time, so retraining for different fairness requirements becomes prohibitive. To address this limitation, […]

Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation

arXiv:2601.20614v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) offers a robust mechanism for enhancing mathematical reasoning in large models. However, we identify a systematic lack of emphasis on more challenging questions in existing methods from both algorithmic and data perspectives, despite their importance for refining underdeveloped capabilities. Algorithmically, widely used Group Relative […]

Beyond Syntax: Action Semantics Learning for App Agents

arXiv:2506.17697v2 Announce Type: replace Abstract: The recent development of Large Language Models (LLMs) enables the rise of App agents that interpret user intent and operate smartphone Apps through actions such as clicking and scrolling. While prompt-based solutions with proprietary LLM APIs show promising ability, they incur heavy compute costs and external API dependency. Fine-tuning smaller […]

Investigating the Development of Task-Oriented Communication in Vision-Language Models

arXiv:2601.20641v1 Announce Type: new Abstract: We investigate whether LLM-based agents can develop task-oriented communication protocols that differ from standard natural language in collaborative reasoning tasks. Our focus is on two core properties that such task-oriented protocols may exhibit: Efficiency, i.e., conveying task-relevant information more concisely than natural language, and Covertness, i.e., becoming difficult for external observers […]

MetaVLA: Unified Meta Co-training For Efficient Embodied Adaption

arXiv:2510.05580v3 Announce Type: replace Abstract: Vision-Language-Action (VLA) models show promise in embodied reasoning, yet remain far from true generalists: they often require task-specific fine-tuning, incur high compute costs, and generalize poorly to unseen tasks. We propose MetaVLA, a unified, backbone-agnostic post-training framework for efficient and scalable alignment. MetaVLA introduces Context-Aware Meta Co-Training, which consolidates diverse target […]

Noise-induced excitability: bloom, bust and extirpation in autotoxic population dynamics

arXiv:2601.20670v1 Announce Type: new Abstract: Species populations often modify their environment as they grow. When environmental feedback operates more slowly than population growth, the system can undergo boom-bust dynamics, where the population overshoots its carrying capacity and subsequently collapses. In extreme cases, this collapse leads to total extinction. While deterministic models typically fail to capture […]

SimpleMem: Efficient Lifelong Memory for LLM Agents

arXiv:2601.02553v2 Announce Type: replace Abstract: To support long-term interaction in complex environments, LLM agents require memory systems that manage historical experiences. Existing approaches either retain full interaction histories via passive context extension, leading to substantial redundancy, or rely on iterative reasoning to filter noise, incurring high token costs. To address this challenge, we introduce SimpleMem, […]

Enterprise Resource Planning Using Multi-type Transformers in Ferro-Titanium Industry

arXiv:2601.20696v1 Announce Type: new Abstract: Combinatorial optimization problems such as the Job-Shop Scheduling Problem (JSP) and Knapsack Problem (KP) are fundamental challenges in operations research, logistics, and enterprise resource planning (ERP). These problems often require sophisticated algorithms to achieve near-optimal solutions within practical time constraints. Recent advances in deep learning have introduced transformer-based architectures as […]

UDEEP: Edge-based Computer Vision for In-Situ Underwater Crayfish and Plastic Detection

arXiv:2401.06157v2 Announce Type: replace-cross Abstract: Invasive signal crayfish have a detrimental impact on ecosystems. They spread the fungal-type crayfish plague disease (Aphanomyces astaci) that is lethal to the native white clawed crayfish, the only native crayfish species in Britain. Invasive signal crayfish extensively burrow, causing habitat destruction, erosion of river banks and adverse changes in […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.