
SEEK-VEC: Robust Latent Structure Discovery via Ensemble Topic Modeling

Count data are ubiquitous in applications where the goal is to understand hidden patterns, or latent structure. Topic modeling is a powerful tool for detecting latent structure in count data. However, standard topic modeling methods are often constrained by restrictive assumptions, susceptible to noise, and sensitive to misspecification of the number of topics, a particular concern when analyzing non-text data. Here, we introduce SEEK-VEC (Spectral Ensembling of topic models with Eigenscore for K-agnostic Vocabulary Embedding and Classification), an ensemble framework for count data that integrates insights from multiple candidate topic models through a spectral ensembling procedure. This approach automatically reinforces signal and mitigates noise to generate a consensus low-dimensional embedding of the data. SEEK-VEC produces prioritization scores and grouping scores that enable variable classification, interactive pattern discovery, and model diagnostics. Through simulations, we demonstrate that SEEK-VEC is robust under realistic settings and outperforms state-of-the-art oracle methods, particularly when signal strength is weak. Applied to diverse real-world datasets, including self-reported psychopathology symptom data, food preference questionnaires, and single-cell transcriptomics, SEEK-VEC reveals latent structures that provide scientifically meaningful insights.
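
To make the idea of combining several candidate topic models into one consensus embedding concrete, the sketch below shows a generic form of spectral ensembling on toy count data. It is not the SEEK-VEC algorithm, whose details are not given here. It rests on several assumptions: a small multiplicative-update NMF stands in for the topic model, the candidate topic numbers `[2, 3, 4]` are arbitrary, and the helper names `nmf` and `consensus_embedding` are hypothetical. Each candidate model contributes its variable-by-topic loadings, and an SVD of the stacked loadings yields a shared low-dimensional embedding of the variables.

```python
import numpy as np

def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Plain multiplicative-update NMF (a stand-in for a topic model): X ~ W @ H."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, p)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def consensus_embedding(X, candidate_ks, dim):
    """Stack normalized variable loadings from several models, then take an SVD.

    Hypothetical helper: one generic way to ensemble spectrally,
    not the procedure used by SEEK-VEC.
    """
    blocks = []
    for i, k in enumerate(candidate_ks):
        _, H = nmf(X, k, seed=i)
        loadings = H.T  # variables x topics
        norms = np.linalg.norm(loadings, axis=0, keepdims=True)
        blocks.append(loadings / (norms + 1e-12))  # unit-norm topic columns
    stacked = np.hstack(blocks)  # variables x (sum of candidate ks)
    U, s, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :dim] * s[:dim], s

# Toy count data: 60 samples, 12 variables, two planted groups of variables.
rng = np.random.default_rng(42)
topics = np.zeros((2, 12))
topics[0, :6] = 1 / 6
topics[1, 6:] = 1 / 6
weights = rng.dirichlet([0.3, 0.3], size=60)
X = rng.poisson(50 * weights @ topics).astype(float)

emb, sv = consensus_embedding(X, candidate_ks=[2, 3, 4], dim=2)
print(emb.shape)  # one 2-dimensional coordinate per variable
```

Directions shared across the candidate models accumulate weight in the leading singular vectors, while model-specific noise is spread over the trailing ones. This is the intuition behind an ensemble that does not depend on choosing a single number of topics.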

