Isaac Newton Institute for Mathematical Sciences

Mathematical and Statistical Aspects of Molecular Biology

Bayesian logistic regression using a perfect phylogeny

Authors: Taane G Clark (Department of Statistics, University of Oxford), Maria De Iorio (Department of Mathematics, Imperial College), Robert C Griffiths (Department of Statistics, University of Oxford)

Abstract

Haplotype data capture the genetic variation among individuals in a population and among populations. An understanding of this variation and the ancestral history of haplotypes is essential in genetic association studies of complex disease. We introduce a method for detecting associations between disease and haplotypes in a candidate gene region or candidate block with little or no recombination. In this setting, a perfect phylogeny constraint or the equivalent gene tree representation describes the evolutionary relationship between single-nucleotide polymorphisms (SNPs) in the haplotypes. Our approach extends the logic regression approach (Ruczinski et al., 2003) to a Bayesian framework, and constrains the model space to that of a gene tree or perfect phylogeny. The gene tree hypothesis imposes constraints on possible solutions to the association problem. Because the method is within a regression framework, environmental factors and SNP-environment interactions can be incorporated, and tools for model diagnostics can be implemented. We demonstrate our method on simulated data from a coalescent model, as well as data from a candidate gene study of smoking persistence.
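
To make the perfect phylogeny constraint concrete, the following minimal sketch applies the standard four-gamete test: a set of biallelic haplotypes with no recombination and no recurrent mutation admits a perfect phylogeny only if no pair of SNP sites shows all four combinations 00, 01, 10 and 11. This is a generic compatibility check and not the authors' Bayesian method; the function name, the 0/1 allele coding, and the toy haplotypes are assumptions made for illustration.

```python
from itertools import combinations


def admits_perfect_phylogeny(haplotypes):
    """Four-gamete test on haplotypes coded as equal-length 0/1 sequences.

    Returns True when no pair of SNP sites carries all four gametes
    (0,0), (0,1), (1,0), (1,1) across the sample.
    """
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        # Distinct allele pairs observed at sites i and j.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True


# Hypothetical toy data: rows are haplotypes, columns are three SNPs.
compatible = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 0, 1)]
# Adding (0, 1, 0) creates all four gametes at the first two sites.
incompatible = compatible + [(0, 1, 0)]

print(admits_perfect_phylogeny(compatible))
print(admits_perfect_phylogeny(incompatible))
```

A region that fails this check would likely show evidence of recombination or recurrent mutation, which is outside the low-recombination setting the abstract describes.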