Convergence-divergence models: Generalizations of phylogenetic trees modeling gene flow over time

arXiv:2504.07384v2 Announce Type: replace
Abstract: Phylogenetic trees are simple models of evolutionary processes. They describe conditionally independent divergent evolution from common ancestors. However, they often lack the flexibility to represent processes like introgressive hybridization, which leads to gene flow between taxa. Phylogenetic networks generalize trees but typically assume that ancestral taxa merge instantaneously to form “hybrid” descendants. In contrast, convergence-divergence models retain a single underlying “principal tree” and permit gene flow over arbitrary time frames. They can also model other biological processes leading to taxa becoming more similar, such as replicated evolution. We present novel maximum likelihood algorithms to infer most aspects of $N$-taxon convergence-divergence models – many consistently – using a quartet-based approach. All algorithms use $4$-taxon convergence-divergence models, inferred from subsets of the $N$ taxa using a model selection criterion. The first algorithm infers an $N$-taxon principal tree; the second infers sets of converging taxa; and the third infers model parameters – root probabilities, edge lengths and convergence parameters. The algorithms can be applied to multiple sequence alignments restricted to genes or genomic windows or to gene presence/absence datasets. We demonstrate that convergence-divergence models can be accurately recovered from simulated data.
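
As a rough illustration of the quartet-based setup described above, the following minimal sketch enumerates the $4$-taxon subsets of an $N$-taxon set, each of which would be passed to a $4$-taxon model fit. The function name `quartets` and the taxon labels are hypothetical, and the abstract does not state whether all $\binom{N}{4}$ subsets or only a sample of them are used, so enumerating every subset is an assumption.

```python
from itertools import combinations
from math import comb


def quartets(taxa):
    """Return every 4-taxon subset of `taxa` as a sorted tuple.

    Hypothetical helper: it only enumerates subsets and does not fit
    any convergence-divergence model.
    """
    return [tuple(q) for q in combinations(sorted(taxa), 4)]


if __name__ == "__main__":
    taxa = ["A", "B", "C", "D", "E", "F"]  # hypothetical taxon labels
    qs = quartets(taxa)
    # The number of quartets equals the binomial coefficient C(N, 4).
    print(len(qs), comb(len(taxa), 4))
```

The number of subsets grows as $\binom{N}{4}$, so for large $N$ a practical implementation might sample quartets rather than enumerate them all.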
