But what is your honest answer? Aiding LLM-judges with honest alternatives using steering vectors

arXiv:2505.17760v2 Announce Type: replace-cross Abstract: Detecting subtle forms of dishonesty like sycophancy and manipulation in Large Language Models (LLMs) remains challenging for both humans and automated evaluators, as these behaviors often appear through small biases rather than clear false statements. We introduce Judge Using Safety-Steered Alternatives (JUSSA), a novel framework that employs steering vectors not […]

An LLM-based Framework for Human-Swarm Teaming Cognition in Disaster Search and Rescue

arXiv:2511.04042v1 Announce Type: cross Abstract: Large-scale disaster Search And Rescue (SAR) operations are persistently challenged by complex terrain and disrupted communications. While Unmanned Aerial Vehicle (UAV) swarms offer a promising solution for tasks like wide-area search and supply delivery, yet their effective coordination places a significant cognitive burden on human operators. The core human-machine collaboration […]

Opus: A Quantitative Framework for Workflow Evaluation

arXiv:2511.04220v1 Announce Type: new Abstract: This paper introduces the Opus Workflow Evaluation Framework, a probabilistic-normative formulation for quantifying Workflow quality and efficiency. It integrates notions of correctness, reliability, and cost into a coherent mathematical model that enables direct comparison, scoring, and optimization of Workflows. The framework combines the Opus Workflow Reward, a probabilistic function estimating […]

Left Atrial Segmentation with nnU-Net Using MRI

arXiv:2511.04071v1 Announce Type: cross Abstract: Accurate segmentation of the left atrium (LA) from cardiac MRI is critical for guiding atrial fibrillation (AF) ablation and constructing biophysical cardiac models. Manual delineation is time-consuming, observer-dependent, and impractical for large-scale or time-sensitive clinical workflows. Deep learning methods, particularly convolutional architectures, have recently demonstrated superior performance in medical image […]

Explicit Density Approximation for Neural Implicit Samplers Using a Bernstein-Based Convex Divergence

arXiv:2506.04700v2 Announce Type: replace-cross Abstract: Rank-based statistical metrics, such as the invariant statistical loss (ISL), have recently emerged as robust and practically effective tools for training implicit generative models. In this work, we introduce dual-ISL, a novel likelihood-free objective for training implicit generative models that interchanges the roles of the target and model distributions in […]

Advancing Equitable AI: Evaluating Cultural Expressiveness in LLMs for Latin American Contexts

arXiv:2511.04090v1 Announce Type: cross Abstract: Artificial intelligence (AI) systems often reflect biases from economically advanced regions, marginalizing contexts in economically developing regions like Latin America due to imbalanced datasets. This paper examines AI representations of diverse Latin American contexts, revealing disparities between data from economically advanced and developing regions. We highlight how the dominance of […]

PoCo: Agentic Proof-of-Concept Exploit Generation for Smart Contracts

arXiv:2511.02780v2 Announce Type: replace-cross Abstract: Smart contracts operate in a highly adversarial environment, where vulnerabilities can lead to substantial financial losses. Thus, smart contracts are subject to security audits. In auditing, proof-of-concept (PoC) exploits play a critical role by demonstrating to the stakeholders that the reported vulnerabilities are genuine, reproducible, and actionable. However, manually creating […]

Learning to Navigate Socially Through Proactive Risk Perception

arXiv:2510.07871v2 Announce Type: replace-cross Abstract: In this report, we describe the technical details of our submission to the IROS 2025 RoboSense Challenge Social Navigation Track. This track focuses on developing RGBD-based perception and navigation systems that enable autonomous agents to navigate safely, efficiently, and socially compliantly in dynamic human-populated indoor environments. The challenge requires agents […]

MeAJOR Corpus: A Multi-Source Dataset for Phishing Email Detection

arXiv:2507.17978v2 Announce Type: replace-cross Abstract: Phishing emails continue to pose a significant threat to cybersecurity by exploiting human vulnerabilities through deceptive content and malicious payloads. While Machine Learning (ML) models are effective at detecting phishing threats, their performance largely relies on the quality and diversity of the training data. This paper presents MeAJOR (Merged email […]

Multimodal Cancer Modeling in the Age of Foundation Model Embeddings

arXiv:2505.07683v3 Announce Type: replace-cross Abstract: The Cancer Genome Atlas (TCGA) has enabled novel discoveries and served as a large-scale reference dataset in cancer through its harmonized genomics, clinical, and imaging data. Numerous prior studies have developed bespoke deep learning models over TCGA for tasks such as cancer survival prediction. A modern paradigm in biomedical deep […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registeration number 16808844