An Isaac Newton Institute Workshop

Recent Advances in Statistical Genetics and Bioinformatics

Modelling the Boundaries of Highly Conserved Non-Coding DNA Sequences in Vertebrates

Authors: Klaudia Walter (MRC Biostatistics Unit, University of Cambridge), Wally Gilks (University of Leeds), Lorenz Wernisch (Birkbeck College, University of London)


A comparison of the human and fish genomes identified more than 1000 highly conserved non-coding elements (CNEs): DNA sequences that show a remarkable degree of similarity between human and fish despite an evolutionary distance of about 900 million years.

The high sequence conservation suggests that these CNEs are functional, though neither their function nor the functional part of their sequence has yet been well defined. Since each CNE was defined by a pairwise sequence alignment, its boundaries might not be accurate enough for designing biological experiments to help identify its role in the genome. As a first step, an examination of the CNEs' nucleotide composition revealed a striking A+T pattern at the CNE boundaries in both fish and human.
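A boundary composition profile of this kind can be sketched as follows. This is a minimal illustration, not the authors' analysis: the function name `at_fraction_by_offset`, the window size, and the toy sequences are all assumptions made for the example. It tallies the A+T fraction at each position relative to an annotated boundary, pooled over several sequences.

```python
from collections import Counter

def at_fraction_by_offset(sequences, boundaries, window=5):
    """Return {offset: A+T fraction} for offsets in [-window, window).

    Offset 0 is the first base inside the element (hypothetical convention);
    negative offsets lie in the flanking sequence.
    """
    at_counts = Counter()
    totals = Counter()
    for seq, boundary in zip(sequences, boundaries):
        for offset in range(-window, window):
            pos = boundary + offset
            if 0 <= pos < len(seq):
                base = seq[pos].upper()
                if base in "ACGT":          # skip N and gap characters
                    totals[offset] += 1
                    if base in "AT":
                        at_counts[offset] += 1
    return {o: at_counts[o] / totals[o] for o in sorted(totals)}

# Toy data (invented): two sequences whose element starts at index 3.
toy_sequences = ["GGCATTACG", "CCGAATGCC"]
toy_boundaries = [3, 3]
profile = at_fraction_by_offset(toy_sequences, toy_boundaries, window=2)
print(profile)
```

Plotting such a profile over many elements, separately for each species, is one way a compositional shift at the boundary would become visible.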

As a further step, we propose a probabilistic model that takes into account not only nucleotide composition but also phylogenetic information, and that aims to delimit the functional part of each CNE using multiple sequence alignments of human, mouse, chicken, frog and fish.
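To give a feel for how a boundary can be inferred from a multiple alignment, here is a deliberately simplified sketch. It is not the model proposed in this abstract: it ignores both phylogeny and nucleotide composition, and instead fits a single change point to a per-column "all species identical" flag using a two-segment Bernoulli likelihood. All names and the toy alignment are hypothetical.

```python
import math

def column_is_conserved(column):
    """True when every species shows the same nucleotide (no gaps)."""
    return len(set(column)) == 1 and column[0] in "ACGT"

def max_bernoulli_loglik(k, n):
    """Maximised Bernoulli log-likelihood for k successes in n trials."""
    if n == 0 or k == 0 or k == n:
        return 0.0
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)

def best_boundary(alignment):
    """Return the column index that best splits flank from conserved core.

    Columns before the returned index form one segment, the rest the other;
    each segment gets its own conservation rate.
    """
    flags = [column_is_conserved(col) for col in zip(*alignment)]
    n, total = len(flags), sum(flags)
    best_index, best_ll = None, -math.inf
    left = 0
    for b in range(1, n):
        left += flags[b - 1]            # conserved columns in flags[:b]
        ll = (max_bernoulli_loglik(left, b)
              + max_bernoulli_loglik(total - left, n - b))
        if ll > best_ll:
            best_index, best_ll = b, ll
    return best_index

# Toy alignment (invented): 4 variable flank columns, then 6 identical ones.
toy_alignment = [
    "ACGTAATTGC",
    "CCTTAATTGC",
    "GAGAAATTGC",
]
boundary = best_boundary(toy_alignment)
print(boundary)
```

A model along the lines described above would replace the identity flag with a likelihood computed on the species tree and would add composition terms; the change-point search itself would look similar.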