[Paper Review] Bayesian Computation and Model Selection in Population Genetics
This paper introduces ABC-GLM, a Bayesian computation method that reformulates the regression adjustment of Approximate Bayesian Computation (ABC) as a General Linear Model (GLM), enabling model selection via Bayes factors. By modeling the relationship between summary statistics and parameters with a GLM, the approach stays consistent with the prior distribution and allows inference even when likelihoods are intractable. Applied to western chimpanzees, it detects population substructure with a Bayes factor >10^5.
Until recently, the use of Bayesian inference in population genetics was limited to a few cases because for many realistic population genetic models the likelihood function cannot be calculated analytically. The situation changed with the advent of likelihood-free inference algorithms, often subsumed under the term Approximate Bayesian Computation (ABC). A key innovation was the use of a post-sampling regression adjustment, allowing larger tolerance values and as such shifting computation time to realistic orders of magnitude (see Beaumont et al., 2002). Here we propose a reformulation of the regression adjustment in terms of a General Linear Model (GLM). This allows the integration into the framework of Bayesian statistics and the use of its methods, including model selection via Bayes factors. We then apply the proposed methodology to the question of population subdivision among western chimpanzees Pan troglodytes verus.
Motivation & Objective
- To address the challenge of Bayesian inference in population genetics where likelihood functions are analytically intractable.
- To develop a method that enables reliable model selection using Bayes factors in likelihood-free settings.
- To reformulate regression-adjusted ABC using GLM for improved theoretical consistency and computational robustness.
- To apply the method to test competing population structure models in western chimpanzees (Pan troglodytes verus).
- To demonstrate that ABC-GLM provides stable and accurate posterior approximations and model selection across varying tolerance levels.
Proposed method
- Reformulate the regression adjustment in ABC using a General Linear Model (GLM) to model the relationship between simulated summary statistics and model parameters.
- Use the GLM to estimate the likelihood function of the truncated model, ensuring consistency with the prior distribution.
- Apply the resulting posterior approximation to compute Bayes factors for model comparison.
- Perform stochastic simulations using SIMCOAL2 to generate data under different population models (island vs. panmictic).
- Use Arlequin 3.0 to compute two summary statistics: the average number of alleles per locus (K) and the fixation index FIS.
- Evaluate model performance across varying acceptance rates (Aε) to assess stability of Bayes factors.
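The steps above can be sketched on a toy problem. This is a minimal illustration, not the paper's implementation: the simulator, the two "models" (which differ only in a hypothetical slope), the observed statistics, and all function names are assumptions made for the example. It also simplifies the method in two ways that should be kept in mind: the retained parameter draws are used directly as a sample from the truncated prior (no kernel smoothing), and the marginal density of each model is taken as the acceptance rate times the GLM-based density of the truncated model at the observation, which is our reading of how Aε enters the Bayes factor.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(slope, n):
    """Hypothetical toy simulator: theta ~ U(0, 10), two noisy statistics."""
    theta = rng.uniform(0.0, 10.0, size=n)
    mean = np.column_stack([theta, slope * theta])
    return theta, mean + rng.normal(0.0, 0.5, size=(n, 2))

def log_marginal_density(theta, stats, s_obs, accept_rate):
    """Rough log marginal density of s_obs under one model (sketch only)."""
    # 1. Rejection step: keep the simulations closest to the observation.
    dist = np.linalg.norm(stats - s_obs, axis=1)
    n_keep = int(round(accept_rate * len(theta)))
    keep = np.argsort(dist)[:n_keep]
    th, st = theta[keep], stats[keep]

    # 2. Linear model s = c0 + C * theta + noise, fitted by least squares.
    design = np.column_stack([np.ones(n_keep), th])
    coef, *_ = np.linalg.lstsq(design, st, rcond=None)
    resid = st - design @ coef
    sigma = resid.T @ resid / (n_keep - 2)  # residual covariance

    # 3. Gaussian log-likelihood of s_obs at each retained theta.
    diff = s_obs - design @ coef
    maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(sigma), diff)
    _, logdet = np.linalg.slogdet(sigma)
    d = len(s_obs)
    log_lik = -0.5 * (maha + logdet + d * np.log(2.0 * np.pi))

    # 4. Average over the retained draws (log-mean-exp for stability),
    #    then rescale by the acceptance rate (assumption, see lead-in).
    top = log_lik.max()
    log_f_trunc = top + np.log(np.mean(np.exp(log_lik - top)))
    return np.log(accept_rate) + log_f_trunc

s_obs = np.array([5.0, 2.5])   # made-up observation, not chimpanzee data
accept_rate = 0.01             # fraction of simulations retained

log_m_a = log_marginal_density(*simulate(0.5, 20000), s_obs, accept_rate)
log_m_b = log_marginal_density(*simulate(0.2, 20000), s_obs, accept_rate)
log_bayes_factor = log_m_a - log_m_b
print(f"log Bayes factor (model A vs. model B): {log_bayes_factor:.2f}")
```

Because the observation was chosen to match model A, the log Bayes factor should come out positive. Rerunning with different values of `accept_rate` gives a rough feel for the kind of stability check described in the last bullet, although a toy with a crude Gaussian fit inside a truncated region should not be expected to reproduce the paper's behaviour.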
Experimental results
Research questions
- RQ1: Can a GLM-based regression adjustment in ABC provide a consistent and theoretically sound approximation to the posterior distribution in models with intractable likelihoods?
- RQ2: Does the ABC-GLM approach enable reliable model selection via Bayes factors in population genetic inference?
- RQ3: Is the Bayes factor for population structure models stable across different tolerance levels in ABC?
- RQ4: Can ABC-GLM detect population substructure in western chimpanzees when standard models assume panmixia?
- RQ5: To what extent do summary statistics like K and FIS reflect underlying population structure rather than inbreeding alone?
Key findings
- The ABC-GLM method produces stable Bayes factors across a wide range of tolerance values, with minimal fluctuations for acceptance rates ≥0.005.
- The Bayes factor strongly favors the island model over the panmictic model, with B ≈ e^12 > 10^5, indicating decisive evidence for population substructure.
- The observed FIS value of 2.6% in western chimpanzees is unlikely under a panmictic model but easily explained under an island model.
- The method enables reliable model selection even with limited simulations, provided tolerance levels are sufficiently large to ensure stable parameter estimation.
- The ABC-GLM framework is compatible with any ABC sampler and supports advanced modeling, including non-linear and heteroscedastic relationships via extended GLM forms.
- The approach successfully integrates into standard Bayesian inference, allowing use of Bayes factors and model averaging.
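As a quick check on the reported magnitude, the conversion from a natural-log Bayes factor of about 12 to orders of magnitude can be worked out directly. The snippet only restates the arithmetic behind "B ≈ e^12 > 10^5"; the variable names are ours. For context, on Jeffreys' conventional scale a Bayes factor above 100 is already labelled decisive.

```python
import math

log_bf = 12.0                      # natural-log Bayes factor quoted above
bayes_factor = math.exp(log_bf)    # roughly 1.6e5
log10_bf = log_bf / math.log(10)   # the same value in orders of magnitude

print(f"B = {bayes_factor:.3g}, log10(B) = {log10_bf:.2f}")
```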
This review was created by AI and reviewed by human editors.