Dissociable contributions of cortical thickness and surface area to cognitive ageing: evidence from multiple longitudinal cohorts.

Cortical volume, a widely-used marker of brain ageing, is the product of two genetically and developmentally dissociable morphometric features: thickness and area. However, it remains

Animal collocation revisited: intercohort comparison and a case study comparing call combinations between sexes in common marmosets

Many animals communicate using sequences of signals, but identifying recurrent, non-random signal combinations remains methodologically challenging. Collocation analyses are increasingly popular approaches for detecting which

Helicase: Vectorized parsing and bitpacking of genomic sequences

Modern sequencing pipelines routinely produce billions of reads, yet the dominant storage formats (FASTQ and FASTA) are text-based and sequential, making high-throughput parsing a persistent

Ineffectual Genomic Error Correction Under Environmental Perturbation Dynamically Regulates Mutational Supply and Robustness

Adaptive evolution depends on the supply of heritable variation, yet excessive mutation threatens viability by degrading essential molecular functions. Here, we show that this trade-off

Three immunoregulatory signatures define non-productive HIV infection in CD4+ T memory stem cells

The persistent HIV reservoir constitutes the main obstacle to curing HIV/AIDS disease. Our understanding of how non-productive HIV infections are established in primary human CD4+

Dual-Metric Evaluation of Social Bias in Large Language Models: Evidence from an Underrepresented Nepali Cultural Context

March 10, 2026

arXiv:2603.07792v1 Announce Type: cross
Abstract: Large language models (LLMs) increasingly influence global digital ecosystems, yet their potential to perpetuate social and cultural biases remains poorly understood in underrepresented contexts. This study presents a systematic analysis of representational biases in seven state-of-the-art LLMs: GPT-4o-mini, Claude-3-Sonnet, Claude-4-Sonnet, Gemini-2.0-Flash, Gemini-2.0-Lite, Llama-3-70B, and Mistral-Nemo in the Nepali cultural context. Using Croissant-compliant dataset of 2400+ stereotypical and anti-stereotypical sentence pairs on gender roles across social domains, we implement an evaluation framework, Dual-Metric Bias Assessment (DMBA), combining two metrics: (1) agreement with biased statements and (2) stereotypical completion tendencies. Results show models exhibit measurable explicit agreement bias, with mean bias agreement ranging from 0.36 to 0.43 across decoding configurations, and an implicit completion bias rate of 0.740-0.755. Importantly, implicit completion bias follows a non-linear, U-shaped relationship with temperature, peaking at moderate stochasticity (T=0.3) and declining slightly at higher temperatures. Correlation analysis under different decoding settings revealed that explicit agreement strongly aligns with stereotypical sentence agreement but is a weak and often negative predictor of implicit completion bias, indicating generative bias is poorly captured by agreement metrics. Sensitivity analysis shows increasing top-p amplifies explicit bias, while implicit generative bias remains largely stable. Domain-level analysis shows implicit bias is strongest for race and sociocultural stereotypes, while explicit agreement bias is similar across gender and sociocultural categories, with race showing the lowest explicit agreement. These findings highlight the need for culturally grounded datasets and debiasing strategies for LLMs in underrepresented societies.

Subscribe for Updates

Copyright 2025 dijee Intelligence Ltd. dijee Intelligence Ltd. is a private limited company registered in England and Wales at Media House, Sopers Road, Cuffley, Hertfordshire, EN6 4RY, UK registration number 16808844