Human Medical Genomics
The completion of the Human Genome Sequencing Project promised to start a revolution in human medicine. Researchers at the Foundation recognize that this revolution will come, but only if the molecular structures that the project generated are paired with answers to "Why?" questions.
These answers will come through a historical analysis of genomes. Similarities and differences between genomes from different vertebrates identify where and how molecular structure has changed to support new function, or to conserve ancient function in a changing environment.
Evolutionary analyses help address some of the most important questions when developing diagnostics and therapeutics. What proteins should be targeted? What animal models should be used to develop a human medicine? What are the likely side effects of a drug? How will different patients react differently to a treatment?
The human body uses many enzymes to excrete foreign compounds, both toxins and pharmaceuticals. Scientists working at the Foundation have used an evolutionary analysis of one class of these, sulfotransferases, to better understand the differences between humans and our nearest primate relatives.
Phylogenomic approaches to common problems encountered in the analysis of low copy repeats: The sulfotransferase 1A gene family example
Bradley, ME; Benner, SA
BMC Evol. Biol.
5 22 (2005)
Background: Blocks of duplicated genomic DNA sequence longer than 1000 base pairs are known as low copy repeats (LCRs). Identified by their sequence similarity, LCRs are abundant in the human genome, and are interesting because they may represent recent adaptive events, or potential future adaptive opportunities within the human lineage. Sequence analysis tools are needed, however, to decide whether these interpretations are likely, whether a particular set of LCRs represents nearly neutral drift creating junk DNA, or whether the appearance of LCRs reflects assembly error. Here we investigate an LCR family containing the sulfotransferase (SULT) 1A genes involved in drug metabolism, cancer, hormone regulation, and neurotransmitter biology as a first step for defining the problems that those tools must manage.
Results: Sequence analysis here identified a fourth sulfotransferase gene, which may be transcriptionally active, located on human chromosome 16. Four regions of genomic sequence containing the four human SULT1A paralogs defined a new LCR family. The stem hominoid SULT1A progenitor locus was identified by comparative genomics involving complete human and rodent genomes, and a draft chimpanzee genome. SULT1A expansion in hominoid genomes was followed by positive selection acting on specific protein sites. This episode of adaptive evolution appears to be responsible for the dopamine sulfonation function of some SULT enzymes. Each conclusion that this bioinformatic analysis drew from data of uncertain reliability (such as that from the chimpanzee genome sequencing project) has been confirmed experimentally or by a "finished" chromosome 16 assembly, both of which were published after the submission of this manuscript.
Conclusion: SULT1A genes expanded from one to four copies in hominoids during intra-chromosomal LCR duplications, including (apparently) one after the divergence of chimpanzees and humans. Thus, LCRs may provide a means for amplifying genes (and other genetic elements) that are adaptively useful. Being located on and among LCRs, however, could make the human SULT1A genes susceptible to further duplications or deletions resulting in 'genomic diseases' for some individuals. Pharmacogenomic studies of SULT1A single nucleotide polymorphisms, therefore, should also consider examining SULT1A copy number variability when searching for genotype-phenotype associations. The latest duplication is, however, only a substantiated hypothesis; an alternative explanation, disfavored by the majority of evidence, is that the duplication is an artifact of incorrect genome assembly.
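The working definition in the Background (duplicated blocks longer than 1000 base pairs, recognized by sequence similarity) can be illustrated with a short Python sketch. This is not the method used in the paper. The function names and the 90% identity cutoff are assumptions made for illustration, and the sketch takes two blocks that have already been aligned to the same length.

```python
MIN_LCR_LENGTH = 1000   # from the abstract: blocks longer than 1000 base pairs
MIN_IDENTITY = 90.0     # hypothetical similarity cutoff, not taken from the paper


def percent_identity(a, b):
    """Percent of aligned (non-gap) positions at which two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip alignment gaps
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def looks_like_lcr_pair(a, b, min_identity=MIN_IDENTITY):
    """True if two aligned blocks are long enough and similar enough."""
    aligned_bases = sum(1 for x, y in zip(a, b) if x != "-" and y != "-")
    return aligned_bases > MIN_LCR_LENGTH and percent_identity(a, b) >= min_identity


# Toy data: a 1200 bp block and a copy with one substitution every 50 bases.
block_a = "ACGT" * 300
copy = list(block_a)
for i in range(0, len(copy), 50):
    copy[i] = "C" if copy[i] == "A" else "A"
block_b = "".join(copy)

identity = percent_identity(block_a, block_b)
is_lcr_candidate = looks_like_lcr_pair(block_a, block_b)
```

A check of this kind only flags candidates. As the abstract notes, deciding whether an apparent duplication is real or an assembly artifact requires further evidence.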
Leptin and obesity
Obesity and its consequent diseases, including diabetes, hypertension, and cardiovascular disease, have today reached epidemic proportions. The Foundation is working to understand some of the molecules that influence obesity, and how animal models should be designed to help develop treatments to manage the disease.
Natural history of prostate cancer
Fossils suggest that the prostate arose in mammals ca. 120 million years ago. A set of genes arose in our ancestors at approximately the same time. Scientists at the Foundation are working to exploit this connection to improve the management of prostate cancer.
Cystic fibrosis is the tragic consequence of mutation in the human germ line, and similar mutations cause a variety of human diseases. The Foundation has used an evolutionary analysis of these mutations to better understand deleterious polymorphisms in the human population.
Application of DETECTER, an Evolutionary Genomic Tool to Analyze Genetic Variation, to the Cystic Fibrosis Gene Family
Gaucher, EA; DeKee, DW; Benner, SA
7 44 (2006)
Background: The medical community requires computational tools that distinguish genetic differences having phenotypic impact within the vast number of mutations that do not. Tools that do this will become increasingly important for those seeking to use human genome sequence data to predict disease, make prognoses, and customize therapy to individual patients.
Results: An approach, termed DETECTER, is proposed to identify sites in a protein sequence where amino acid replacements are likely to have a significant effect on phenotype, including causing genetic disease. This approach uses a model-dependent tool to estimate the normalized replacement rate at individual sites in a protein sequence, based on a history of those sites extracted from an evolutionary analysis of the corresponding protein family. This tool identifies sites that have higher-than-average, average, or lower-than-average rates of change in the lineage leading to the sequence in the population of interest. The rates are then combined with sequence data to determine the likelihoods that particular amino acids were present at individual sites in the evolutionary history of the gene family. These likelihoods are used to predict whether any specific amino acid replacements, if introduced at the site in a modern human population, would have a significant impact on fitness. The DETECTER tool is used to analyze the cystic fibrosis transmembrane conductance regulator (CFTR) gene family.
Conclusions: In this system, DETECTER retrodicts amino acid replacements associated with cystic fibrosis with greater accuracy than alternative approaches. While this result validates the approach for this particular family of proteins only, the approach may be applicable to the analysis of polymorphisms generally, including SNPs in a human population.
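The idea of a normalized per-site replacement rate, sorted into lower-than-average, average, and higher-than-average classes, can be sketched in Python. This is a toy illustration only and is not DETECTER. It replaces the model-dependent, tree-based estimate described in the Results with a crude proxy (the number of distinct residues in an alignment column, minus one), and the function names, thresholds, and the short alignment are all hypothetical.

```python
def site_variability(alignment):
    """Crude per-column proxy for replacement count: distinct residues minus one."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for col in range(length):
        residues = {seq[col] for seq in alignment if seq[col] != "-"}
        scores.append(max(len(residues) - 1, 0))
    return scores


def normalized_rates(alignment):
    """Scale the per-site scores so that the average site has rate 1.0."""
    scores = site_variability(alignment)
    mean = sum(scores) / len(scores)
    if mean == 0:
        return [0.0] * len(scores)
    return [s / mean for s in scores]


def classify(rate, low=0.5, high=1.5):
    """Assign a site to one of three rate classes (thresholds are assumptions)."""
    if rate < low:
        return "lower-than-average"
    if rate > high:
        return "higher-than-average"
    return "average"


# Hypothetical five-column protein alignment, not real CFTR data.
toy_alignment = [
    "MKVLA",
    "MKILA",
    "MRVLG",
    "MKALS",
]

rates = normalized_rates(toy_alignment)
labels = [classify(r) for r in rates]
```

In this toy alignment the first and fourth columns never vary and fall in the lower-than-average class. In the spirit of the abstract, a replacement at such a site would be a stronger candidate for phenotypic impact than one at a freely varying site.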