arXiv:2505.07140v3 Announce Type: replace
Abstract: In chemical safety assessment, validation studies rely on reference compound lists to evaluate the applicability of alternative methods prior to regulatory acceptance. These lists are expected to cover multiple aspects, including chemical structure, physicochemical properties, and toxicity profiles. In practice, however, trade-offs among these aspects are typically addressed implicitly through expert judgment, making them difficult to examine systematically. Here, we formulate reference compound selection for toxicity assay validation as an explicit multi-objective design problem. We define three interpretable objectives capturing structural, physicochemical, and toxicity diversity, and employ a genetic algorithm as an exploratory tool to examine the trade-off structure and resulting Pareto-optimal solutions. Rather than prescribing optimal or recommended compound sets, this formulation enables systematic exploration of designs and explicit comparison of their positions within a common design space. As an illustrative application, we link existing assay datasets with expert-curated validation lists by treating “selected as a reference compound” as an annotation on the underlying compound pool. We show that expert-selected, random, and algorithmically generated compound lists occupy distinct regions of the design space. Furthermore, under an illustrative fixed modeling setup, different regions of the design space were associated with different observed evaluation outcomes, supporting the view that reference compound selection constitutes a structured dimension of evaluation design. Together, these results provide a methodological perspective for treating reference compound selection as an analyzable design object, complementing established expert-driven practices.
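As a rough illustration of what "Pareto-optimal" means for three diversity objectives, the sketch below filters a handful of candidate compound lists down to the non-dominated ones. It is not the paper's genetic algorithm or its objective definitions; the function names and the score values are hypothetical, and all three objectives are assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores per candidate compound list, in the order
# (structural, physicochemical, toxicity) diversity. Made-up values.
candidate_scores = [
    (0.80, 0.40, 0.55),  # candidate 0
    (0.60, 0.70, 0.50),  # candidate 1
    (0.55, 0.35, 0.45),  # candidate 2
    (0.60, 0.70, 0.65),  # candidate 3
]

front = pareto_front(candidate_scores)
print(front)
```

Candidates left out of the front are worse than or equal to some other candidate on every objective, so the remaining ones represent the trade-offs that an expert-selected or random list could be compared against.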
Depression subtype classification from social media posts: few-shot prompting vs. fine-tuning of large language models
Background: Social media provides timely proxy signals of mental health, but reliable tweet-level classification of depression subtypes remains challenging due to short, noisy text, overlapping symptomatology,



