Less is More for RAG: Information Gain Pruning for Generator-Aligned Reranking and Evidence Selection

arXiv:2601.17532v1 Announce Type: cross Abstract: Retrieval-augmented generation (RAG) grounds large language models with external evidence, but under a limited context budget, the key challenge is deciding which retrieved passages should be injected. We show that retrieval relevance metrics (e.g., NDCG) correlate weakly with end-to-end QA quality and can even become negatively correlated under multi-passage injection, […]

A Model-Driven Lossless Compression Algorithm Resistant to Mismatch

arXiv:2601.17684v1 Announce Type: cross Abstract: Due to the fundamental connection between next-symbol prediction and compression, modern predictive models, such as large language models (LLMs), can be combined with entropy coding to achieve compression rates that surpass those of standard compression algorithms. However, this approach relies on the assumption that the predictive model produces identical output […]

Dynamic Meta-Ensemble Framework for Efficient and Accurate Deep Learning in Plant Leaf Disease Detection on Resource-Constrained Edge Devices

arXiv:2601.17290v1 Announce Type: cross Abstract: Deploying deep learning models for plant disease detection on edge devices such as IoT sensors, smartphones, and embedded systems is severely constrained by limited computational resources and energy budgets. To address this challenge, we introduce a novel Dynamic Meta-Ensemble Framework (DMEF) for high-accuracy plant disease diagnosis under resource constraints. DMEF […]

ONRW: Optimizing inversion noise for high-quality and robust watermark

arXiv:2601.17388v1 Announce Type: cross Abstract: Watermarking methods have always been effective means of protecting intellectual property, yet they face significant challenges. Although existing deep learning-based watermarking systems can hide watermarks in images with minimal impact on image quality, they often lack robustness when encountering image corruptions during transmission, which undermines their practical application value. To […]

The Limits of AI Data Transparency Policy: Three Disclosure Fallacies

arXiv:2601.18127v1 Announce Type: cross Abstract: Data transparency has emerged as a rallying cry for addressing concerns about AI: data quality, privacy, and copyright chief among them. Yet while these calls are crucial for accountability, current transparency policies often fall short of their intended aims. Similar to nutrition facts for food, policies aimed at nutrition facts […]

Beyond Instrumental and Substitutive Paradigms: Introducing Machine Culture as an Emergent Phenomenon in Large Language Models

arXiv:2601.17096v1 Announce Type: cross Abstract: Recent scholarship typically characterizes Large Language Models (LLMs) through either an Instrumental Paradigm (viewing models as reflections of their developers’ culture) or a Substitutive Paradigm (viewing models as bilingual proxies that switch cultural frames based on language). This study challenges these anthropomorphic frameworks by proposing Machine Culture as an emergent, […]

VIBEVOICE-ASR Technical Report

arXiv:2601.18184v1 Announce Type: cross Abstract: This report presents VibeVoice-ASR, a general-purpose speech understanding framework built upon VibeVoice, designed to address the persistent challenges of context fragmentation and multi-speaker complexity in long-form audio (e.g., meetings, podcasts) that remain despite recent advancements in short-form speech recognition. Unlike traditional pipelined approaches that rely on audio chunking, VibeVoice-ASR supports single-pass […]

Robust Privacy: Inference-Time Privacy through Certified Robustness

arXiv:2601.17360v1 Announce Type: cross Abstract: Machine learning systems can produce personalized outputs that allow an adversary to infer sensitive input attributes at inference time. We introduce Robust Privacy (RP), an inference-time privacy notion inspired by certified robustness: if a model’s prediction is provably invariant within a radius-$R$ neighborhood around an input $x$ (e.g., under the […]

Embodiment-Induced Coordination Regimes in Tabular Multi-Agent Q-Learning

arXiv:2601.17454v1 Announce Type: cross Abstract: Centralized value learning is often assumed to improve coordination and stability in multi-agent reinforcement learning, yet this assumption is rarely tested under controlled conditions. We directly evaluate it in a fully tabular predator-prey gridworld by comparing independent and centralized Q-learning under explicit embodiment constraints on agent speed and stamina. Across […]

Prompt Driven Development with Claude Code: Building a Complete TUI Framework for the Ring Programming Language

arXiv:2601.17584v1 Announce Type: cross Abstract: Large language models are increasingly used in software development, yet their ability to generate and maintain large, multi module systems through natural language interaction remains insufficiently characterized. This study presents an empirical analysis of developing a 7420 line Terminal User Interface framework for the Ring programming language, completed in roughly […]

AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation

arXiv:2601.17761v1 Announce Type: cross Abstract: Real-world perception and interaction are inherently multimodal, encompassing not only language but also vision and speech, which motivates the development of “Omni” MLLMs that support both multimodal inputs and multimodal outputs. While a sequence of omni MLLMs has emerged, most existing systems still rely on additional expert components to achieve […]

Feature-Space Generative Models for One-Shot Class-Incremental Learning

arXiv:2601.17905v1 Announce Type: cross Abstract: Few-shot class-incremental learning (FSCIL) is a paradigm where a model, initially trained on a dataset of base classes, must adapt to an expanding problem space by recognizing novel classes with limited data. We focus on the challenging FSCIL setup where a model receives only a single sample (1-shot) for each […]

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK, registration number 16808844.