Count data are ubiquitous in applications where the goal is to understand hidden patterns, or latent structure. Topic modeling is a powerful tool for detecting latent structure in count data. However, standard topic modeling methods are often constrained by restrictive assumptions, susceptible to noise, and sensitive to misspecification of the number of topics, a particular concern when analyzing non-text data. Here, we introduce SEEK-VEC (Spectral Ensembling of topic models with Eigenscore for K-agnostic Vocabulary Embedding and Classification), an ensemble framework for count data that integrates insights from multiple candidate topic models through a spectral ensembling procedure. This approach automatically reinforces signal and mitigates noise to generate a consensus low-dimensional embedding of the data. SEEK-VEC produces prioritization scores and grouping scores that enable variable classification, interactive pattern discovery, and model diagnostics. Through simulations, we demonstrate that SEEK-VEC is robust under realistic settings and outperforms state-of-the-art oracle methods, particularly when signal strength is weak. Applied to diverse real-world datasets, including self-reported psychopathology symptom data, food preference questionnaires, and single-cell transcriptomics, SEEK-VEC reveals latent structures that provide scientifically meaningful insights.
China figured out how to sell EVs. Now it has to bury their batteries.
In August 2025, Wang Lei decided it was finally time to say goodbye to his electric vehicle. Wang, who is 39, had bought the car

