Bacteriophage anti-CRISPR (Acr) proteins have the potential to reduce off-target effects of genome editing by inactivating the bacterial CRISPR-Cas defense system. The current challenge lies in their functional annotation: Acr proteins have high structural diversity and low sequence similarity, rendering common homology-based methods unfit. Recent solutions use deep learning models, such as graph convolutional networks, that take protein networks as input. To understand whether these new solutions are suited to niche, sparsely annotated proteins, we focus on three Acr proteins (AcrIF1, AcrIIA1, and AcrVIA1) as a case study. For each, we create protein contact networks (PCNs) and residue interaction graphs (RIGs) based on existing network theory and methodology. We characterize and analyze these protein networks by comparing how each network architecture affects small-worldness values. We reexamine a previous method that used node degree, closeness centrality, and residue solvent accessibility to predict functional residues within a protein via a jackknife technique. We discuss the implications of how these networks are constructed, given the way the structural information is acquired. We demonstrate that functional residues within small proteins cannot be reliably predicted with the jackknife technique, even when provided with a curated dataset containing representative standardized values for degree and closeness centrality. We show that functional residues within these small proteins have low degrees in both PCNs and RIGs, making them susceptible to the known bias of graph convolutional networks toward high-degree nodes. Finally, we discuss how a closer understanding of the data can further improve deep learning approaches for small proteins.
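
As a concrete illustration of the network construction described above, the sketch below builds a PCN from C-alpha coordinates and estimates small-worldness with the Humphries-Gurney sigma coefficient. This is a minimal re-creation under stated assumptions, not the paper's pipeline: the 8 angstrom contact cutoff, the chain selection, and the Erdos-Renyi random baseline are illustrative choices.

```python
# Minimal sketch: build a protein contact network (PCN) from C-alpha
# coordinates and estimate small-worldness. The 8 A cutoff and the
# random-graph baseline are assumptions, not values from the paper.
import numpy as np
import networkx as nx
from Bio.PDB import PDBParser

def build_pcn(pdb_path, chain_id="A", cutoff=8.0):
    """PCN: nodes are residues, edges join C-alpha pairs within `cutoff`."""
    structure = PDBParser(QUIET=True).get_structure("acr", pdb_path)
    ca = [(res.get_id()[1], res["CA"].coord)
          for res in structure[0][chain_id] if "CA" in res]
    g = nx.Graph()
    g.add_nodes_from(idx for idx, _ in ca)
    for i in range(len(ca)):
        for j in range(i + 1, len(ca)):
            if np.linalg.norm(ca[i][1] - ca[j][1]) <= cutoff:
                g.add_edge(ca[i][0], ca[j][0])
    return g

def small_worldness(g, n_random=20, seed=0):
    """Humphries-Gurney sigma = (C/C_rand) / (L/L_rand), averaged over
    Erdos-Renyi graphs matched in node and edge counts. Assumes g is
    connected (true for typical single-chain PCNs)."""
    rng = np.random.default_rng(seed)
    c = nx.average_clustering(g)
    l = nx.average_shortest_path_length(g)
    c_r, l_r = [], []
    for _ in range(n_random):
        r = nx.gnm_random_graph(g.number_of_nodes(), g.number_of_edges(),
                                seed=int(rng.integers(1 << 31)))
        if nx.is_connected(r):  # skip disconnected samples
            c_r.append(nx.average_clustering(r))
            l_r.append(nx.average_shortest_path_length(r))
    return (c / np.mean(c_r)) / (l / np.mean(l_r))
```

The jackknife-based centrality scoring can be sketched in the same spirit. The combination of standardized degree and closeness below, and the decision threshold, are assumptions for illustration rather than the exact procedure the paper reexamines; residue solvent accessibility is omitted for brevity.

```python
# Schematic jackknife-style scoring of residues by network centrality.
# The equal-weight z-score combination and the 1.0 threshold are
# illustrative assumptions, not the paper's evaluated settings.
import numpy as np
import networkx as nx

def jackknife_scores(g):
    """Z-score each residue's degree and closeness centrality against a
    leave-one-out (jackknife) baseline built from all other residues."""
    nodes = list(g.nodes())
    deg = np.array([g.degree(n) for n in nodes], dtype=float)
    clo = np.array([nx.closeness_centrality(g, u=n) for n in nodes])
    scores = {}
    for i, n in enumerate(nodes):
        rest_d, rest_c = np.delete(deg, i), np.delete(clo, i)
        z_d = (deg[i] - rest_d.mean()) / (rest_d.std() or 1.0)
        z_c = (clo[i] - rest_c.mean()) / (rest_c.std() or 1.0)
        scores[n] = 0.5 * (z_d + z_c)
    return scores

def predict_functional(g, threshold=1.0):
    """Flag residues whose combined centrality z-score exceeds the threshold."""
    return [n for n, s in jackknife_scores(g).items() if s >= threshold]
```

Because functional residues in these small Acr proteins sit at low degree in both PCNs and RIGs, their combined z-scores fall below any positive threshold in a scheme like this, which is consistent with the failure mode reported above.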