Reasoning Planning for Language Models

arXiv:2511.00521v1 Announce Type: cross Abstract: Selecting an appropriate reasoning method for a given query remains a key challenge in language model generation. Existing approaches typically generate multiple candidate responses and use an aggregation strategy to select the output answer, often assuming that more candidate answers yield higher accuracy. We revisit this assumption through a rigorous […]

Recognising, Anticipating, and Mitigating LLM Pollution of Online Behavioural Research

arXiv:2508.01390v2 Announce Type: replace-cross Abstract: Online behavioural research faces an emerging threat as participants increasingly turn to large language models (LLMs) for advice, translation, or task delegation: LLM Pollution. We identify three interacting variants through which LLM Pollution threatens the validity and integrity of online behavioural research. First, Partial LLM Mediation occurs when participants make […]

More Than A Shortcut: A Hyperbolic Approach To Early-Exit Networks

arXiv:2511.00641v1 Announce Type: cross Abstract: Deploying accurate event detection on resource-constrained devices is challenged by the trade-off between performance and computational cost. While Early-Exit (EE) networks offer a solution through adaptive computation, they often fail to enforce a coherent hierarchical structure, limiting the reliability of their early predictions. To address this, we propose Hyperbolic Early-Exit […]

Engineering.ai: A Platform for Teams of AI Engineers in Computational Design

arXiv:2511.00122v1 Announce Type: new Abstract: In modern engineering practice, human engineers collaborate in specialized teams to design complex products, with each expert completing their respective tasks while communicating and exchanging results and data with one another. While this division of expertise is essential for managing multidisciplinary complexity, it demands substantial development time and cost. Recently, […]

FedOnco-Bench: A Reproducible Benchmark for Privacy-Aware Federated Tumor Segmentation with Synthetic CT Data

arXiv:2511.00795v1 Announce Type: cross Abstract: Federated Learning (FL) allows multiple institutions to cooperatively train machine learning models while retaining sensitive data at the source, which has great utility in privacy-sensitive environments. However, FL systems remain vulnerable to membership-inference attacks and data heterogeneity. This paper presents FedOnco-Bench, a reproducible benchmark for privacy-aware FL using synthetic oncologic […]

Debiasing LLMs by Masking Unfairness-Driving Attention Heads

arXiv:2510.10142v3 Announce Type: replace-cross Abstract: Large language models (LLMs) increasingly mediate decisions in domains where unfair treatment of demographic groups is unacceptable. Existing work probes when biased outputs appear, but gives little insight into the mechanisms that generate them, leaving existing mitigations largely fragile. In this paper, we conduct a systematic investigation LLM unfairness and […]

KFCPO: Kronecker-Factored Approximated Constrained Policy Optimization

arXiv:2511.00880v1 Announce Type: cross Abstract: We propose KFCPO, a novel Safe Reinforcement Learning (Safe RL) algorithm that combines scalable Kronecker-Factored Approximate Curvature (K-FAC) based second-order policy optimization with safety-aware gradient manipulation. KFCPO leverages K-FAC to perform efficient and stable natural gradient updates by approximating the Fisher Information Matrix (FIM) in a layerwise, closed form manner, […]

ARC-GEN: A Mimetic Procedural Benchmark Generator for the Abstraction and Reasoning Corpus

arXiv:2511.00162v1 Announce Type: new Abstract: The Abstraction and Reasoning Corpus remains one of the most compelling and challenging benchmarks for tracking progress toward achieving Artificial General Intelligence. In contrast to other evaluation datasets designed to assess an agent’s task-specific skills or accumulated knowledge, the ARC-AGI suite is specifically targeted at measuring skill acquisition efficiency, a […]

HAFixAgent: History-Aware Automated Program Repair Agent

arXiv:2511.01047v1 Announce Type: cross Abstract: Automated program repair (APR) has recently shifted toward large language models and agent-based systems, yet most systems rely on local snapshot context, overlooking repository history. Prior work shows that repository history helps repair single-line bugs, since the last commit touching the buggy line is often the bug-introducing one. In this […]

Federated Cyber Defense: Privacy-Preserving Ransomware Detection Across Distributed Systems

arXiv:2511.01583v1 Announce Type: cross Abstract: Detecting malware, especially ransomware, is essential to securing today’s interconnected ecosystems, including cloud storage, enterprise file-sharing, and database services. Training high-performing artificial intelligence (AI) detectors requires diverse datasets, which are often distributed across multiple organizations, making centralization necessary. However, centralized learning is often impractical due to security, privacy regulations, data […]

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registeration number 16808844