Three immunoregulatory signatures define non-productive HIV infection in CD4+ T memory stem cells

The persistent HIV reservoir constitutes the main obstacle to curing HIV/AIDS disease. Our understanding of how non-productive HIV infections are established in primary human CD4+

Dispersal, adaptation and persistence of H5N1 in the sub-Antarctic and Antarctica

High pathogenicity avian influenza virus (HPAIV) H5N1 reached the sub-Antarctic and Antarctica in 2023, subsequently spreading to remote locations within this region where it had

ApeA cleaves genomic RNA to defend against RNA phage infection

To protect themselves against bacteriophage infection, bacteria encode a vast diversity of antiphage defense systems. However, the mechanisms of action of most of these systems

FASTERCC: Accelerating Flux Consistency Testing and Context-Specific Reconstruction for Large-Scale Metabolic Network Models

The increase in size of metabolic network models, especially with the advent of single-cell data, calls for scalable reconstruction and analysis tools. Such models, often

Acceleration and Velocity Dissociate Temporal Phases of Postural Control in Rhesus Macaques

Maintaining balance requires the nervous system to transform sensory signals about unexpected postural perturbations into precisely timed motor commands. Although human studies have established that

Detecting Data Poisoning in Code Generation LLMs via Black-Box, Vulnerability-Oriented Scanning

March 19, 2026

arXiv:2603.17174v1 Announce Type: cross
Abstract: Code generation large language models (LLMs) are increasingly integrated into modern software development workflows. Recent work has shown that these models are vulnerable to backdoor and poisoning attacks that induce the generation of insecure code, yet effective defenses remain limited. Existing scanning approaches rely on token-level generation consistency to invert attack targets, which is ineffective for source code where identical semantics can appear in diverse syntactic forms. We present CodeScan, which, to the best of our knowledge, is the first poisoning-scanning framework tailored to code generation models. CodeScan identifies attack targets by analyzing structural similarities across multiple generations conditioned on different clean prompts. It combines iterative divergence analysis with abstract syntax tree (AST)-based normalization to abstract away surface-level variation and unify semantically equivalent code, isolating structures that recur consistently across generations. CodeScan then applies LLM-based vulnerability analysis to determine whether the extracted structures contain security vulnerabilities and flags the model as compromised when such a structure is found. We evaluate CodeScan against four representative attacks under both backdoor and poisoning settings across three real-world vulnerability classes. Experiments on 108 models spanning three architectures and multiple model sizes demonstrate 97%+ detection accuracy with substantially lower false positives than prior methods.
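The AST-normalization step described in the abstract can be illustrated with a small sketch. This is not CodeScan's implementation: the names `normalized_statements` and `recurring_structures`, the `_v` placeholder, and the `min_fraction` threshold are all hypothetical, and the sketch assumes the generations are Python source. It only shows how renaming locally bound identifiers lets syntactically different generations expose a shared statement; the iterative divergence analysis and the LLM-based vulnerability check are not modeled.

```python
import ast
from collections import Counter


class _Normalize(ast.NodeTransformer):
    """Replace locally bound identifiers with a placeholder (hypothetical scheme)."""

    def __init__(self, bound):
        self.bound = bound

    def visit_Name(self, node):
        # Only rename names the snippet itself binds; keep globals such as
        # library names so that API calls stay recognizable.
        if node.id in self.bound:
            node.id = "_v"
        return node

    def visit_arg(self, node):
        node.arg = "_v"
        return node


def normalized_statements(source):
    """Return the set of normalized statement dumps found in one generation."""
    tree = ast.parse(source)
    bound = {n.id for n in ast.walk(tree)
             if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}
    bound |= {n.arg for n in ast.walk(tree) if isinstance(n, ast.arg)}
    tree = _Normalize(bound).visit(tree)
    # ast.dump omits line and column attributes by default, so layout
    # differences between generations do not affect the comparison.
    return {ast.dump(n) for n in ast.walk(tree) if isinstance(n, ast.stmt)}


def recurring_structures(generations, min_fraction=1.0):
    """Return statements present in at least min_fraction of the generations."""
    counts = Counter()
    for source in generations:
        counts.update(normalized_statements(source))
    threshold = min_fraction * len(generations)
    return sorted(s for s, c in counts.items() if c >= threshold)
```

Given three generations that each disable TLS verification under different function and variable names, `recurring_structures` would return the single normalized assignment they share, which a downstream analyzer could then inspect for a vulnerability.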

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844