An Isaac Newton Institute Workshop

Recent Advances in Statistical Genetics and Bioinformatics

EXPLORING THE ROLE OF NONCODING DNA IN THE FUNCTION OF THE HUMAN GENOME THROUGH VARIATION.

Authors: Christine P. Bird (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK), Webb Miller (Dept. of Computer Science & Engineering, Penn State University, PA, USA), Maureen Liu (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK), Daryl Thomas (Center for Biomolecular Science and Engineering, University of California Santa Cruz, CA, USA), Barbara E. Stranger (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK), Matthew Hurles (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK), Emmanouil T. Dermitzakis (The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK)

Abstract

The function of conserved non-coding sequences (CNCs) has been a subject of speculation since their discovery. We have begun to investigate it by using variation data to study the effect of genomic location on these sequences. We first used the phase II HapMap consortium SNP data to investigate the signature of selective constraint in non-coding regions. Our results show that new (derived) alleles of SNPs within CNCs are rarer than new alleles in nonconserved regions (P = 3 x 10^-18), indicating that selective pressure has suppressed derived allele frequencies in CNCs. We have used whole-genome alignments of the human, chimp and macaque genomes to identify 1356 non-coding sequences, conserved across multiple mammals, that show a significantly accelerated substitution rate in the human lineage, as indicated by a relative rate test on the human-chimp-macaque alignments. We then test which of these 1356 sequences result from relaxation of selective constraint and which from positive selection. Detectable segmental duplications are by their nature primate-specific events, so an intriguing question is whether these rapidly evolving CNCs are enriched within segmental duplications. The acceleration could be due to either a loss of selective constraint or positive selection, and either scenario could relate to differential gene expression patterns between the associated paralogous genes. We have identified an enrichment of accelerated CNCs in the most recently formed segmental duplications, and we are currently investigating the potential for reciprocal changes in duplicated CNCs. We have also recently identified a group of accelerated CNCs containing SNPs that contribute significantly to gene expression variation. We will present our current computational and functional analysis of the evolutionary properties of CNCs within and between species and of their functional consequences for gene expression.
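The abstract does not say which relative rate test was applied. As an illustration only, the sketch below implements Tajima's relative rate test, one common form, on a three-way alignment with macaque as the outgroup. The function name `tajima_relative_rate` and the toy sequences are hypothetical and are not taken from the authors' analysis.

```python
import math


def tajima_relative_rate(human, chimp, macaque):
    """Tajima's relative rate test with macaque as the outgroup.

    Illustrative sketch only; not necessarily the test used by the authors.
    Returns (m_human, m_chimp, chi2, p_value).
    """
    if not (len(human) == len(chimp) == len(macaque)):
        raise ValueError("aligned sequences must have equal length")

    valid = set("ACGT")
    m_human = 0  # sites where only the human base differs
    m_chimp = 0  # sites where only the chimp base differs

    for h, c, m in zip(human.upper(), chimp.upper(), macaque.upper()):
        # Skip gaps and ambiguous bases.
        if h not in valid or c not in valid or m not in valid:
            continue
        if h != c and c == m:
            m_human += 1
        elif h != c and h == m:
            m_chimp += 1

    total = m_human + m_chimp
    if total == 0:
        return m_human, m_chimp, 0.0, 1.0

    # Under equal rates, lineage-specific changes are equally likely on
    # either branch; the statistic is chi-square with 1 degree of freedom.
    chi2 = (m_human - m_chimp) ** 2 / total
    # Survival function of chi-square (1 df) via the complementary error function.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return m_human, m_chimp, chi2, p_value


if __name__ == "__main__":
    # Toy alignment: four human-specific changes, no chimp-specific ones.
    print(tajima_relative_rate("TTTTAAAA", "AAAAAAAA", "AAAAAAAA"))
```

The statistic is two-sided, so a small p-value indicates human-lineage acceleration only when `m_human` exceeds `m_chimp`. Separating relaxed constraint from positive selection, as the abstract describes, would need further analysis beyond this test.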