Training for Technology: Adoption and Productive Use of Generative AI in Legal Analysis

arXiv:2603.04982v3 Announce Type: replace-cross Abstract: Can targeted user training unlock the productive potential of generative artificial intelligence in professional settings? We study this question using a randomized experiment in which 164 law students completed an issue-spotting examination under one of three conditions: no GenAI access, optional access to a large language model (LLM), or LLM […]

Attack Selection in Agentic AI Control Evaluations Meaningfully Decreases Safety

arXiv:2606.06529v1 Announce Type: new Abstract: An attacker that strategically chooses when to attack is much harder to catch than one that attacks indiscriminately. AI control is a safety framework for deploying capable but untrusted AI agents under the oversight of a weaker, trusted monitor and a limited human audit budget. Control evaluations stress-test these protocols […]

Automatic Causal Fairness Analysis with LLM-Generated Reporting

arXiv:2604.27011v2 Announce Type: replace-cross Abstract: AutoML, understood as the process of automating the application of machine learning to real-world problems, is a key step for AI popularisation. Most AutoML frameworks do not account for the potential lack of fairness in the training data and in the corresponding predictions. We introduce FairMind, a software prototype aiming […]

Lean4Agent: Formal Modeling and Verification for Agent Workflow and Trajectory

arXiv:2606.06523v1 Announce Type: new Abstract: Equipping Large Language Models (LLMs) to execute reliable multi-step workflows has become a central challenge in artificial intelligence. Despite recent advances in LLMs’ agentic capabilities, most agent systems still lack formal methods for specifying, verifying, and debugging their workflow and execution trajectories. This challenge mirrors a long-standing problem in mathematics, […]

Evolving Features vs Evolving Entire Trees with GP for Interpretable Survival Analysis

arXiv:2605.30119v2 Announce Type: replace-cross Abstract: Survival analysis concerns the task of predicting the time until an event occurs. Often used in the medical field, survival analysis deals with incomplete (i.e., censored) data, for instance, from patients who did not experience the event during the duration of the study. For practical use, both accuracy and interpretability […]

SafeGene: Reusable Adapters for Transferable Safety Alignment

arXiv:2606.06519v1 Announce Type: new Abstract: Open-weight LLMs are increasingly fine-tuned into customized assistants, but downstream fine-tuning can weaken safety alignment and make models more vulnerable to malicious prompts, even when the training data is not intentionally harmful. This creates a recurring safety recovery problem as target models are repeatedly updated with new task data or […]

Reinforcement Learning from Rich Feedback with Distributional DAgger

arXiv:2606.05152v2 Announce Type: replace-cross Abstract: Reasoning models have advanced rapidly, but the dominant reinforcement learning from verifiable rewards (RLVR) recipe remains surprisingly narrow: sample many responses and reward each with a single bit indicating whether the final answer is correct. Yet many settings provide rich feedback, including execution traces, tool outputs, expert corrections, and model […]

DiBS: Diffusion-Informed Branch Selection

arXiv:2606.06518v1 Announce Type: new Abstract: Sudoku is a representative constraint satisfaction problem that requires global structural reasoning under strict discrete constraints. Existing work on solving Sudoku mainly follows two dominant approaches: traditional heuristic solvers and deep learning solvers. However, they suffer from two complementary limitations: learning-based solvers lack hard correctness guarantees, while complete […]

Socratic-SWE: Self-Evolving Coding Agents via Trace-Derived Agent Skills

arXiv:2606.07412v1 Announce Type: cross Abstract: LLM-driven software engineering agents have become a central testbed for real-world language-model capability, yet their training remains limited by the availability of high-quality SWE tasks. Existing synthetic data methods typically create tasks through fixed mutation or bug-injection procedures, making the resulting distributions largely independent of the agent’s own weaknesses and […]

ScenicRules: An Autonomous Driving Benchmark with Multi-Objective Specifications and Abstract Scenarios

arXiv:2602.16073v2 Announce Type: replace-cross Abstract: Developing autonomous driving systems for complex traffic environments requires balancing multiple objectives, such as avoiding collisions, obeying traffic rules, and making efficient progress. In many situations, these objectives cannot be satisfied simultaneously, and explicit priority relations naturally arise. Also, driving rules require context, so it is important to formally model […]

Whisper Hallucination Detection and Mitigation via Hidden Representation Steering and Sparse AutoEncoders

arXiv:2606.07473v1 Announce Type: cross Abstract: Whisper, a widely adopted ASR model, is known to suffer from hallucinations – coherent transcriptions generated for non-speech audio entirely disconnected from the input. We investigate whether hallucinations can be detected and mitigated through Whisper’s internal representations. We extract audio encoder activations and evaluate two representation spaces: raw Whisper activations […]

DSU-Net: An Attention-Enhanced Dense Skip U-Net for Breast Lesion Segmentation in Mammographic Images

arXiv:2606.06537v1 Announce Type: new Abstract: Breast cancer remains one of the leading causes of cancer-related mortality among women worldwide, making early detection essential for effective treatment. Mammography is the primary screening modality; however, accurate delineation of suspicious lesions remains challenging and subject to inter-observer variability. Automated segmentation methods can assist radiologists by providing consistent and […]

Copyright 2025 dijee Intelligence Ltd.   dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844