arXiv:2506.12104v2 Announce Type: replace-cross Abstract: Large Language Models (LLMs) are increasingly central to agentic systems due to their strong reasoning and planning capabilities. By interacting with external environments through predefined tools, these agents can carry out complex user tasks. Nonetheless, this interaction also introduces the risk of prompt injection attacks, where malicious inputs from external […]
GranViT: A Fine-Grained Vision Model With Autoregressive Perception For MLLMs
arXiv:2510.21501v1 Announce Type: cross Abstract: Vision encoders are indispensable for allowing impressive performance of Multi-modal Large Language Models (MLLMs) in vision language tasks such as visual question answering and reasoning. However, existing vision encoders focus on global image representations but overlook fine-grained regional analysis. They are limited in fine grained perception due to the scarcity […]
BrainCognizer: Brain Decoding with Human Visual Cognition Simulation for fMRI-to-Image Reconstruction
arXiv:2510.20855v1 Announce Type: new Abstract: Brain decoding is a key neuroscience field that reconstructs the visual stimuli from brain activity with fMRI, which helps illuminate how the brain represents the world. fMRI-to-image reconstruction has achieved impressive progress by leveraging diffusion models. However, brain signals infused with prior knowledge and associations exhibit a significant information asymmetry […]
Few-Shot Knowledge Distillation of LLMs With Counterfactual Explanations
arXiv:2510.21631v1 Announce Type: cross Abstract: Knowledge distillation is a promising approach to transfer capabilities from complex teacher models to smaller, resource-efficient student models that can be deployed easily, particularly in task-aware scenarios. However, existing methods of task-aware distillation typically require substantial quantities of data which may be unavailable or expensive to obtain in many practical […]
CLASP: Adaptive Spectral Clustering for Unsupervised Per-Image Segmentation
arXiv:2509.25016v2 Announce Type: replace-cross Abstract: We introduce CLASP (Clustering via Adaptive Spectral Processing), a lightweight framework for unsupervised image segmentation that operates without any labeled data or finetuning. CLASP first extracts per patch features using a self supervised ViT encoder (DINO); then, it builds an affinity matrix and applies spectral clustering. To avoid manual tuning, […]
Enhancing Interpretability in Deep Reinforcement Learning through Semantic Clustering
arXiv:2409.17411v5 Announce Type: replace Abstract: In this paper, we explore semantic clustering properties of deep reinforcement learning (DRL) to improve its interpretability and deepen our understanding of its internal semantic organization. In this context, semantic clustering refers to the ability of neural networks to cluster inputs based on their semantic similarity in the feature space. […]
Vision-language models learn the geometry of human perceptual space
arXiv:2510.20859v1 Announce Type: new Abstract: In cognitive science and AI, a longstanding question is whether machines learn representations that align with those of the human mind. While current models show promise, it remains an open question whether this alignment is superficial or reflects a deeper correspondence in the underlying dimensions of representation. Here we introduce […]
MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?
arXiv:2504.09702v3 Announce Type: replace Abstract: We introduce MLRC-Bench, a benchmark designed to quantify how effectively language agents can tackle challenging Machine Learning (ML) Research Competitions, with a focus on open research problems that demand novel methodologies. Unlike prior work, e.g., AI Scientist, which evaluates the end-to-end agentic pipeline by using LLM-as-a-judge, MLRC-Bench measures the key […]
Mitigating Manipulation and Enhancing Persuasion: A Reflective Multi-Agent Approach for Legal Argument Generation
arXiv:2506.02992v2 Announce Type: replace Abstract: Large Language Models (LLMs) are increasingly explored for legal argument generation, yet they pose significant risks of manipulation through hallucination and ungrounded persuasion, and often fail to utilize provided factual bases effectively or abstain when arguments are untenable. This paper introduces a novel reflective multi-agent method designed to address these […]
AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents
arXiv:2510.21031v1 Announce Type: cross Abstract: The emergence of foundation models (FMs) has enabled the development of highly capable and autonomous agents, unlocking new application opportunities across a wide range of domains. Evaluating the architecture of agents is particularly important as the architectural decisions significantly impact the quality attributes of agents given their unique characteristics, including […]