David Haussler's genome bioinformatics team has been a major contributor to the public consortium efforts to produce, assemble, and annotate the first mammalian genomes. As collaborators in the Human Genome Project, they built the program that assembled the first working draft of the human genome sequence from information produced by sequencing centers worldwide and have participated in the informatics associated with the finishing effort.

The group also maintains UCSC's interactive Genome Browser for the human, mouse, rat, and other genomes, which is used by thousands of biomedical researchers from around the world every day. By integrating multiple sets of high-throughput genomics data, computational predictions, and curated genomic feature sets from dozens of laboratories, this browser provides a new kind of computational microscope for exploring genomes.

Work developing and annotating genomes for the browser has provided both a foundation and a Web-based forum for scientific efforts of the Haussler research group. These are directed at the large-scale discovery and characterization of the functional elements in mammalian genomes through comparative sequence analysis, the study of mammalian molecular evolution, and the integration of an increasing variety of high-throughput data sets provided by functional genomics efforts. An important recent finding made by Haussler and his colleagues involves the surprising discovery of "ultra-conserved" regions of various mammalian genomes.

Ultra-conserved Regions in Mammalian Genomes

Throughout the approximately 75 million years since the human species diverged from its common ancestor with the rat and mouse, these three genomes have independently accumulated many changes, leading to the three different species we see today. Reconstructing these changes through computational analysis has provided a new understanding of how the human genome contributes to who we are.
By comparing the human, mouse, and rat genomes, it has become clear that the rate of neutral substitution varies regionally along the chromosomes.
Haussler and his colleagues have determined that a core of about 40 percent of the human, rat, and mouse genome sequences derives from a common ancestor, and they have produced base-level alignments between the three genomes in these regions. This alignment, combined with characterization of neutral substitution rates, has led to an estimate that approximately 5 to 6 percent of the human genome is under purifying selection. Haussler believes that these conserved regions contain the most functionally important elements of the genome and point to areas where intensified study will lead to a better understanding of how the genome works. Interestingly, most of the well-conserved elements identified lie outside of known genes. Since only 1.5 percent of the genome is coding, this rough estimate, if it holds up, would imply that a substantial amount of functional noncoding DNA lies in the remaining 3.5 percent. Haussler is working with an international team of researchers to characterize this functional 5 to 6 percent over the next several years.

Recent examination of the remaining noncoding 3.5 percent of the human genome that is under purifying selection has led to the identification of 481 "ultra-conserved" regions of 200 or more DNA bases that are completely identical in the human, mouse, and rat genomes. Under a standard model of neutral evolution, where every base is equally likely to undergo independent change, the probability of finding even one such element in the 2.9 billion bases of the human genome is vanishingly small. Nearly all of these unchanged regions were also found in the dog and chicken genomes, and two-thirds of them were found in the fish genome, although they cannot be traced beyond the fish to sea squirt, fly, or worm.
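To give a feel for why a 200-base identical stretch is so unlikely under a neutral model, the following is a minimal back-of-the-envelope sketch, not the group's actual statistical analysis. The genome length and window size come from the text above. The per-base probability `P_IDENTICAL` is a purely hypothetical value chosen for illustration, and the function name is likewise an assumption. The sketch treats every base as conserved independently and works in log space so that the very small numbers stay readable.

```python
import math

def log10_expected_identical_windows(genome_length, window, p_identical):
    """Return log10 of the expected number of `window`-base stretches that are
    identical in all species, assuming each base is conserved independently
    with probability `p_identical` (a simplified neutral model)."""
    start_positions = genome_length - window + 1
    # Expected count = start_positions * p_identical ** window, taken in log10.
    return math.log10(start_positions) + window * math.log10(p_identical)

GENOME_LENGTH = 2_900_000_000  # bases in the human genome, from the text
WINDOW = 200                   # minimum ultra-conserved length, from the text
P_IDENTICAL = 0.7              # ASSUMPTION: illustrative per-base probability only

log10_expected = log10_expected_identical_windows(GENOME_LENGTH, WINDOW, P_IDENTICAL)
print(f"Expected identical {WINDOW}-base windows: about 10^{log10_expected:.1f}")
```

Under these assumed inputs the expected count is many orders of magnitude below one, which is the sense in which observing 481 such regions is surprising. The result is sensitive to the assumed per-base probability, so the specific exponent should not be read as a published estimate.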
These 481 ultra-conserved regions most often either overlap genes that are involved in RNA processing or reside in the noncoding portions of genes, or near genes, that are involved in regulating gene transcription or development.

In an attempt to build realistic and information-rich mathematical models of molecular evolution, Haussler's group has undertaken larger, multi-species comparisons in conjunction with the NIH Intramural Sequencing Center. Some of these models are tailored to specific kinds of functional elements, such as coding exons (in conjunction with the NIH Mammalian Gene Collection Project) and transcription factor binding sites (in conjunction with the NHGRI ENCODE project). These models should identify elements under purifying selection with higher sensitivity and specificity than has been possible with two-species comparisons. Ultimately, Haussler hopes to explore the full spectrum of events in mammalian molecular evolution, including insertions, deletions, duplications, inversions, and rearrangements. As the number of sequenced genomes grows, the group's goal is to produce increasingly accurate analyses of the evolutionary history of each base in the human genome as a basis for genome-wide functional analysis.

The Howard Hughes Medical Institute, the National Human Genome Research Institute, the National Cancer Institute, the National Science Foundation, and the California Institute for Quantitative Biomedical Research (QB3) provide funding for Prof. Haussler's research.