arXiv:2506.12072v4 Announce Type: replace-cross Abstract: In an era of AI-generated misinformation flooding the web, existing tools struggle to empower users with nuanced, transparent assessments of content credibility. They often default to binary (true/false) classifications without contextual justifications, leaving users vulnerable to disinformation. We address this gap by introducing TRACE: Transparent Reliability Assessment with Contextual Explanations, […]
SentinelNet: Safeguarding Multi-Agent Collaboration Through Credit-Based Dynamic Threat Detection
arXiv:2510.16219v3 Announce Type: replace-cross Abstract: Malicious agents pose significant threats to the reliability and decision-making capabilities of Multi-Agent Systems (MAS) powered by Large Language Models (LLMs). Existing defenses often fall short due to reactive designs or centralized architectures which may introduce single points of failure. To address these challenges, we propose SentinelNet, the first decentralized […]
Towards Faithful Reasoning in Comics for Small MLLMs
arXiv:2601.02991v2 Announce Type: replace-cross Abstract: Comic understanding presents a significant challenge for Multimodal Large Language Models (MLLMs), as the intended meaning of a comic often emerges from the joint interpretation of visual, textual, and social cues. This naturally motivates Chain-of-Thought (CoT) prompting, since explicit intermediate reasoning appears promising for integrating such heterogeneous signals. However, existing […]
Machine Learning for Network Attacks Classification and Statistical Evaluation of Adversarial Learning Methodologies for Synthetic Data Generation
arXiv:2603.17717v2 Announce Type: replace-cross Abstract: Supervised detection of network attacks has always been a critical part of network intrusion detection systems (NIDS). Nowadays, in a pivotal time for artificial intelligence (AI), with even more sophisticated attacks that utilize advanced techniques, such as generative artificial intelligence (GenAI) and reinforcement learning, it has become a vital component […]
Learning to Play Blackjack: A Curriculum Learning Perspective
arXiv:2604.00076v2 Announce Type: replace-cross Abstract: Reinforcement Learning (RL) agents often struggle with efficiency and performance in complex environments. We propose a novel framework that uses a Large Language Model (LLM) to dynamically generate a curriculum over available actions, enabling the agent to incorporate each action individually. We apply this framework to the game of Blackjack, […]
Captioning Daily Activity Images in Early Childhood Education: Benchmark and Algorithm
arXiv:2604.01941v1 Announce Type: cross Abstract: Image captioning for Early Childhood Education (ECE) is essential for automated activity understanding and educational assessment. However, existing methods face two key challenges. First, the lack of large-scale, domain-specific datasets limits the model’s ability to capture fine-grained semantic concepts unique to ECE scenarios, resulting in generic and imprecise descriptions. Second, […]
SAFE: Stepwise Atomic Feedback for Error correction in Multi-hop Reasoning
arXiv:2604.01993v1 Announce Type: cross Abstract: Multi-hop QA benchmarks frequently reward Large Language Models (LLMs) for spurious correctness, masking ungrounded or flawed reasoning steps. To shift toward rigorous reasoning, we propose SAFE, a dynamic benchmarking framework that replaces the ungrounded Chain-of-Thought (CoT) with a strictly verifiable sequence of grounded entities. Our framework operates across two phases: […]
Optimizing RAG Rerankers with LLM Feedback via Reinforcement Learning
arXiv:2604.02091v1 Announce Type: cross Abstract: Rerankers play a pivotal role in refining retrieval results for Retrieval-Augmented Generation. However, current reranking models are typically optimized on static human-annotated relevance labels in isolation, decoupled from the downstream generation process. This isolation leads to a fundamental misalignment: documents identified as topically relevant by information retrieval metrics often […]
Universal Hypernetworks for Arbitrary Models
arXiv:2604.02215v1 Announce Type: cross Abstract: Conventional hypernetworks are typically engineered around a specific base-model parameterization, so changing the target architecture often entails redesigning the hypernetwork and retraining it from scratch. We introduce the Universal Hypernetwork (UHN), a fixed-architecture generator that predicts weights from deterministic parameter, architecture, and task descriptors. This descriptor-based formulation decouples the generator […]
Batched Contextual Reinforcement: A Task-Scaling Law for Efficient Reasoning
arXiv:2604.02322v1 Announce Type: cross Abstract: Large Language Models employing Chain-of-Thought reasoning achieve strong performance but suffer from excessive token consumption that inflates inference costs. Existing efficiency methods such as explicit length penalties, difficulty estimators, or multi-stage curricula either degrade reasoning quality or require complex training pipelines. We introduce Batched Contextual Reinforcement, a minimalist, single-stage training […]
Cardiac-Phase-Dependent Spin Coherence as a Probe of Boundary Covariance Geometry in Neural Tissue
arXiv:2505.22680v2 Announce Type: replace Abstract: A recently proposed geometric framework predicts that the transition from distributed belief to committed action involves a metric regime change, culminating in a boundary regime where cross-mode structure becomes algebraically necessary for continued state-space compression. This paper examines whether reported magnetic resonance measurements of proton spins in neural tissue provide […]
Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding
arXiv:2604.00528v2 Announce Type: replace-cross Abstract: 3D Visual Grounding (3D-VG) aims to localize objects in 3D scenes via natural language descriptions. While recent advancements leveraging Vision-Language Models (VLMs) have explored zero-shot possibilities, they typically suffer from a static workflow relying on preprocessed 3D point clouds, essentially degrading grounding into proposal matching. To bypass this reliance, our […]