Machine learning model leveraging SMILES-derived NMR spectroscopy data to predict dopamine D1 receptor antagonists: a prospective framework for forecasting the impact of engineered nanoparticles on the functionalities of small biomolecules

March 23, 2026

arXiv:2501.14044v4
Abstract: The article proposes a conceptual approach for evaluating the impact of engineered nanoparticles (NPs) on the functionality of small biomolecules. The machine learning (ML) model is based on in-silico 13C NMR chemical shifts derived from the SMILES notations of small biomolecules. The rationale is that 13C NMR provides information about the atomic environment of each carbon atom. Decomposing small biomolecules into their fundamental 13C NMR spectral data, and classifying them by the count and position of chemical-shift peaks, therefore establishes a baseline for evaluating the impact of NPs on the functionality of small biomolecules, even though the ML model is not trained on nano data. The approach not only mitigates the scarcity of nano-bio data but also holds potential for building an NP portfolio from data collected in various in vitro, in situ, in vivo, and organ-on-a-chip environments across multiple timeframes. Such a framework enables predictive modeling based on these multi-environmental datasets, facilitating a deeper understanding of NP behaviour. The methodology was demonstrated using data from a PubChem bioassay focused on human dopamine D1 receptor antagonists. The model was trained on 26,766 samples and tested on 5,466 samples. The Support Vector classifier achieved an accuracy of 70.8%, precision of 74.3%, recall of 63.6%, F1-score of 68.5%, and ROC of 70.8%, with an Area Under the Curve (AUC) of 76% and a Matthews Correlation Coefficient (MCC) of 0.4204. A secondary, non-NP-related ML model was developed to complement the case study. It uses PubChem compound and substance identifiers (CIDs and SIDs) to predict whether pre-designed small biomolecules have the potential to be human dopamine D1 receptor antagonists.
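
To make the idea of classifying by "the count and position of chemical peaks" more concrete, the following is a minimal sketch of how a list of predicted 13C chemical shifts might be turned into a fixed-length feature vector. It is not the authors' pipeline: the function name `peaks_to_features`, the 0 to 220 ppm range, the 10 ppm bin width, and the example peak list are all assumptions chosen for illustration, and the abstract does not say how the shifts were actually encoded.

```python
from typing import List, Sequence


def peaks_to_features(
    shifts_ppm: Sequence[float],
    max_ppm: float = 220.0,   # assumed upper bound of the 13C shift range
    bin_width: float = 10.0,  # assumed bin width; not taken from the article
) -> List[int]:
    """Encode a 13C peak list as [total peak count, count per shift bin...].

    The first element is the number of peaks supplied (peak "count");
    the remaining elements are a histogram over chemical-shift bins
    (peak "position"). Peaks outside [0, max_ppm) are left out of the
    histogram but still included in the total.
    """
    n_bins = int(max_ppm // bin_width)
    histogram = [0] * n_bins
    for shift in shifts_ppm:
        if 0.0 <= shift < max_ppm:
            histogram[int(shift // bin_width)] += 1
    return [len(shifts_ppm)] + histogram


# Hypothetical peak list (ppm) for an illustrative molecule; these are
# made-up values, not data from the PubChem bioassay used in the article.
example_shifts = [14.1, 22.7, 128.5, 129.0, 170.3]
features = peaks_to_features(example_shifts)
print(len(features), features[0])
```

A vector of this kind could then be passed to any standard classifier, such as the Support Vector classifier mentioned in the abstract. Fixed-width binning is only one possible encoding, and it trades positional resolution for a compact, fixed-length input.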
