arXiv:2604.06256v1 Announce Type: cross Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions — the spectral edge — which reliably distinguishes grokking from non-grokking regimes. We show that standard mechanistic interpretability tools (head attribution, activation probing, sparse autoencoders) fail to capture these directions: their structure is not localized in parameter […]
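The abstract is truncated before the method details, but the notion of training updates concentrating along a few dominant directions can be illustrated with a toy sketch. Everything below — the synthetic "runs", the dimensions, and the `spectral_concentration` score — is an illustrative assumption, not the paper's actual measure:

```python
import numpy as np

def spectral_concentration(updates: np.ndarray, k: int = 3) -> float:
    """Fraction of total update energy captured by the top-k singular
    directions of the stacked update matrix (rows = training steps)."""
    s = np.linalg.svd(updates, compute_uv=False)
    return float((s[:k] ** 2).sum() / (s ** 2).sum())

rng = np.random.default_rng(0)
d, steps = 64, 200

# "Grokking-like" run: updates dominated by 2 fixed directions plus small noise.
basis = rng.standard_normal((2, d))
coeffs = rng.standard_normal((steps, 2))
concentrated = coeffs @ basis + 0.05 * rng.standard_normal((steps, d))

# Baseline run: isotropic updates with no dominant directions.
isotropic = rng.standard_normal((steps, d))

print(spectral_concentration(concentrated))  # close to 1.0
print(spectral_concentration(isotropic))     # much lower
```

A score near 1 for the first run and near k/d for the second is the kind of separation the abstract's "spectral edge" claim suggests, under these toy assumptions.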
ClawLess: A Security Model of AI Agents
arXiv:2604.06284v1 Announce Type: cross Abstract: Autonomous AI agents powered by Large Language Models can reason, plan, and execute complex tasks, but their ability to autonomously retrieve information and run code introduces significant security risks. Existing approaches attempt to regulate agent behavior through training or prompting, which does not offer fundamental security guarantees. We present ClawLess, […]
Quantitative Estimation of Target Task Performance from Unsupervised Pretext Task in Semi/Self-Supervised Learning
arXiv:2508.07299v2 Announce Type: replace-cross Abstract: The effectiveness of unlabeled data in Semi/Self-Supervised Learning (SSL) depends on scenario-appropriate assumptions, which in turn guide the selection of beneficial unsupervised pretext tasks. However, existing research has paid limited attention to assumptions in SSL, resulting in practical situations where the compatibility between the unsupervised pretext tasks and the […]
LAsset: An LLM-assisted Security Asset Identification Framework for System-on-Chip (SoC) Verification
arXiv:2601.02624v2 Announce Type: replace-cross Abstract: The growing complexity of modern system-on-chip (SoC) and IP designs is making security assurance increasingly difficult. One of the fundamental steps in the pre-silicon security verification of a hardware design is the identification of security assets, as it substantially influences downstream security verification tasks, such as threat modeling, […]
Countering the Over-Reliance Trap: Mitigating Object Hallucination for LVLMs via a Self-Validation Framework
arXiv:2601.22451v2 Announce Type: replace-cross Abstract: Despite progress in Large Vision Language Models (LVLMs), object hallucination remains a critical issue in image captioning tasks, where models generate descriptions of non-existent objects, compromising their reliability. Previous work attributes this to LVLMs’ over-reliance on language priors and attempts to mitigate it through logits calibration. However, they still lack […]
MedDialBench: Benchmarking LLM Diagnostic Robustness under Parametric Adversarial Patient Behaviors
arXiv:2604.06846v1 Announce Type: cross Abstract: Interactive medical dialogue benchmarks have shown that LLM diagnostic accuracy degrades significantly when interacting with non-cooperative patients. Yet existing approaches either apply adversarial behaviors without graded severity or case-specific grounding, or reduce patient non-cooperation to a single ungraded axis; none analyze cross-dimension interactions. We introduce MedDialBench, a benchmark enabling […]
ATBench: A Diverse and Realistic Agent Trajectory Benchmark for Safety Evaluation and Diagnosis
arXiv:2604.02022v2 Announce Type: replace Abstract: Evaluating the safety of LLM-based agents is increasingly important because risks in realistic deployments often emerge over multi-step interactions rather than isolated prompts or final responses. Existing trajectory-level benchmarks remain limited by insufficient interaction diversity, coarse observability of safety failures, and weak long-horizon realism. We introduce ATBench, a trajectory-level benchmark […]
Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing
arXiv:2509.01986v4 Announce Type: replace-cross Abstract: In recent years, integrating multimodal understanding and generation into a single unified model has emerged as a promising paradigm. While this approach achieves strong results in text-to-image (T2I) generation, it still struggles with precise image editing. We attribute this limitation to an imbalanced division of responsibilities. The understanding module primarily […]
Working Paper: Towards a Category-theoretic Comparative Framework for Artificial General Intelligence
arXiv:2603.28906v2 Announce Type: replace Abstract: AGI has become the Holy Grail of AI with the promise of human-level intelligence, and major tech companies around the world are investing unprecedented amounts of resources in its pursuit. Yet there is no single formal definition of AGI, and only a few empirical AGI benchmarking frameworks currently exist. The […]
Chatbot-Based Assessment of Code Understanding in Automated Programming Assessment Systems
arXiv:2604.07304v1 Announce Type: cross Abstract: Large Language Models (LLMs) challenge conventional automated programming assessment because students can now produce functionally correct code without demonstrating corresponding understanding. This paper makes two contributions. First, it reports a saturation-based scoping review of conversational assessment approaches in programming education. The review identifies three dominant architectural families: rule-based or template-driven […]
Region-R1: Reinforcing Query-Side Region Cropping for Multi-Modal Re-Ranking
arXiv:2604.05268v2 Announce Type: replace-cross Abstract: Multi-modal retrieval-augmented generation (MM-RAG) relies heavily on re-rankers to surface the most relevant evidence for image-question queries. However, standard re-rankers typically process the full query image as a global embedding, making them susceptible to visual distractors (e.g., background clutter) that skew similarity scores. We propose Region-R1, a query-side region cropping […]
CADENCE: Context-Adaptive Depth Estimation for Navigation and Computational Efficiency
arXiv:2604.07286v1 Announce Type: cross Abstract: Autonomous vehicles deployed in remote environments typically rely on embedded processors, compact batteries, and lightweight sensors. These hardware limitations conflict with the need to derive robust representations of the environment, which often requires executing computationally intensive deep neural networks for perception. To address this challenge, we present CADENCE, an adaptive […]
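CADENCE's actual routing policy is not given in the truncated abstract; a minimal sketch of the general idea — dispatching easy scenes to a cheap depth estimator and complex scenes to an expensive one — might look like the following. The gradient-energy complexity proxy, the threshold, and the two stand-in "models" are assumptions for illustration only:

```python
import numpy as np

def complexity(image: np.ndarray) -> float:
    """Cheap proxy for scene complexity: mean image-gradient magnitude."""
    gy, gx = np.gradient(image.astype(float))
    return float(np.mean(np.hypot(gx, gy)))

def cheap_depth(image):
    # Stand-in for a lightweight network: one coarse, low-cost estimate.
    return np.full_like(image, image.mean(), dtype=float)

def accurate_depth(image):
    # Stand-in for a heavy network: a detailed per-pixel estimate.
    return image.astype(float)

def adaptive_depth(image, threshold=0.5):
    """Route low-complexity scenes to the cheap model, others to the heavy one."""
    model = cheap_depth if complexity(image) < threshold else accurate_depth
    return model.__name__, model(image)

rng = np.random.default_rng(0)
flat_scene = np.ones((8, 8))            # e.g. open road, little texture
busy_scene = rng.random((8, 8)) * 10.0  # high-frequency clutter

print(adaptive_depth(flat_scene)[0])  # cheap_depth
print(adaptive_depth(busy_scene)[0])  # accurate_depth
```

The point of such a dispatcher on embedded hardware is that the expensive network runs only when the scene warrants it, trading a tiny always-on complexity check for large average savings.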