About FfAME Publications

All publications produced by FfAME scientists can be found on this website. PDFs for most of them can be obtained by clicking on a publication's title. To view an individual scientist's publications, use the menu on the left. Below is a list of recent and notable FfAME publications.

Notable Publications

Catalytic Synthesis of Polyribonucleic Acid on Prebiotic Rock Glasses
Craig A. Jerome, Hyo-Joong Kim, Stephen J. Mojzsis, Steven A. Benner, and Elisa Biondi
Astrobiology (2022) http://doi.org/10.1089/ast.2022.0027
<Abstract>

Reported here are experiments that show that ribonucleoside triphosphates are converted to polyribonucleic acid when incubated with rock glasses similar to those likely present 4.3-4.4 billion years ago on the Hadean Earth surface, where they were formed by impacts and volcanism. This polyribonucleic acid averages 100-300 nucleotides in length, with a substantial fraction of 3',-5'-dinucleotide linkages. Chemical analyses, including classical methods that were used to prove the structure of natural RNA, establish a polyribonucleic acid structure for these products. The polyribonucleic acid accumulated and was stable for months, with a synthesis rate of 2 x 10^-3 pmoles of triphosphate polymerized each hour per gram of glass (25°C, pH 7.5). These results suggest that polyribonucleotides were available to Hadean environments if triphosphates were. As many proposals are emerging describing how triphosphates might have been made on the Hadean Earth, the process observed here offers an important missing step in models for the prebiotic synthesis of RNA.

Agnostic Life Finder (ALF) for Large-Scale Screening of Martian Life During In Situ Refueling
Spacek, J. & Benner, S.A.
Astrobiology (2022) 22, 8, DOI:10.1089/ast.2021.0070
<Abstract>

Before the first humans depart for Mars in the next decade, hundreds of tons of martian water-ice must be harvested to produce propellant for the return vehicle, a process known as in situ resource utilization (ISRU). We describe here an instrument, the Agnostic Life Finder (ALF), that is an inexpensive life-detection add-on to ISRU. ALF exploits a well-supported view that informational genetic biopolymers in life in water must have two structural features: (1) Informational biopolymers must carry a repeating charge; they must be polyelectrolytes. (2) Their building blocks must fit into an aperiodic crystal structure; the building blocks must be size-shape regular. ALF exploits the first structural feature to extract polyelectrolytes from ~10 cubic meters of mined martian water by applying a voltage gradient perpendicularly to the water's flow. This gradient diverts polyelectrolytes from the flow toward their respective electrodes (polyanions to the anode, polycations to the cathode), where they are captured in cartridges before they encounter the electrodes. There, they can later be released to analyze their building blocks, for example, by mass spectrometry or nanopore. Upstream, martian cells holding martian informational polyelectrolytes are disrupted by ultrasound. To manage the (unknown) conductivity of the water due to the presence of salts, the mined water is preconditioned by electrodialysis using porous membranes. ALF uses only resources and technology that must already be available for ISRU. Thus, life detection is easily and inexpensively integrated into SpaceX or NASA ISRU missions.

In vitro evolution of ribonucleases from expanded genetic alphabets
Jerome, C.A.; Hoshika, S.; Bradley, K.M.; Benner, S.A.; Biondi, E.
Proc. Natl. Acad. Sci. USA (2022) 119(44). DOI: 10.1073/pnas.2208261119
<Abstract>

The ability of nucleic acids to catalyze reactions (as well as store and transmit information) is important for both basic and applied science, the first in the context of molecular evolution and the origin of life and the second for biomedical applications. However, the catalytic power of standard nucleic acids (NAs) assembled from just four nucleotide building blocks is limited when compared with that of proteins. Here, we assess the evolutionary potential of libraries of nucleic acids with six nucleotide building blocks as reservoirs for catalysis. We compare the outcomes of in vitro selection experiments toward RNA-cleavage activity of two nucleic acid libraries: one built from the standard four independently replicable nucleotides and the other from six, with the two added nucleotides coming from an artificially expanded genetic information system (AEGIS). Results from comparative experiments suggest that DNA libraries with increased chemical diversity, higher information density, and larger searchable sequence spaces are one order of magnitude richer reservoirs of molecules that catalyze the cleavage of a phosphodiester bond in RNA than DNA libraries built from a standard four-nucleotide alphabet. Evolved AEGISzymes with nitro-carrying nucleobase Z appear to exploit a general acid–base catalytic mechanism to cleave that bond, analogous to the mechanism of the ribonuclease A family of protein enzymes and heavily modified DNAzymes. The AEGISzyme described here represents a new type of catalysts evolved from libraries built from expanded genetic alphabets.

Ultra-rapid detection of SARS-CoV-2 in public workspace environments
Yaren, O., McCarter, J., Phadke, N., Bradley, K. M., Overton, B., Yang, Z., Ranade, S., Patil, K., Bangale, R., Benner, S. A.
PLOS One, Public Library of Science (2021) DOI:10.1371/journal.pone.0240524, DOI:10.1101/2020.09.29.20204131
<Abstract>

Managing the pandemic caused by SARS-CoV-2 requires new capabilities in testing, including the possibility of identifying, in minutes, infected individuals as they enter spaces where they must congregate in a functioning society, including workspaces, schools, points of entry, and commercial business establishments. Here, the only useful tests (a) require no sample transport, (b) require minimal sample manipulation, (c) can be performed by unlicensed individuals, (d) return results on the spot in much less than one hour, and (e) cost no more than a few dollars. The sensitivity need not be as high as normally required by the FDA for screening asymptomatic carriers (as few as 10 virions per sample), as these viral loads are almost certainly not high enough for an individual to present a risk for forward infection. This allows tests specifically useful for this pandemic to trade-off unneeded sensitivity for necessary speed, simplicity, and frugality. In some studies, it was shown that viral load that creates forward-infection risk may exceed 10⁵ virions per milliliter, easily within the sensitivity of an RNA amplification architecture, but unattainable by antibody-based architectures that simply target viral antigens. Here, we describe such a test based on a displaceable probe loop amplification architecture.

Abiotic Synthesis of Nucleoside 5'-Triphosphates with Nickel Borate and Cyclic Trimetaphosphate (CTMP)
Kim, H.J., Benner, S.A.
Astrobiology (2021) 21(3), DOI:10.1089/ast.2020.2264
<Abstract>

While nucleoside 5'-triphosphates are precursors for RNA in modern biology, the presumed difficulty of making these triphosphates on Hadean Earth has caused many prebiotic researchers to consider other activated species for the prebiotic synthesis of RNA. We report here that nickel(II), in the presence of borate, gives substantial amounts (2–3%) of nucleoside 5'-triphosphates upon evaporative heating in the presence of urea, salts, and cyclic trimetaphosphate (CTMP). Also recovered are nucleoside 5'-diphosphates and nucleoside 5'-monophosphates, both likely arising from 5'-triphosphate intermediates. The total level of 5'-phosphorylation is typically 30%. Borate enhances the regiospecificity of phosphorylation, with increased amounts of other phosphorylated species seen in its absence. Experimentally supported paths are already available to make nucleosides in environments likely to have been present on Hadean Earth soon after a midsized 10^21 to 10^23 kg impactor, which would also have delivered nickel to the Hadean surface. Further, sources of prebiotic CTMP continue to be proposed. Thus, these results fill in one of the few remaining steps needed to demystify the prebiotic synthesis of RNA and support a continuous model from atmospheric components to oligomeric RNA that is lacking only a mechanism to obtain homochirality in the product RNA.

When Did Life Likely Emerge on Earth in an RNA-First Process?
S. A. Benner, E. A. Bell, E. Biondi, R. Brasser, T. Carell, H.-J. Kim, S. J. Mojzsis, A. Omran, M. A. Pasek, D. Trail
ChemSystemsChem 2, Chemistry Europe (2020) e1900035
<Abstract>

The widespread presence of ribonucleic acid (RNA) catalysts and cofactors in the Earth's biosphere today suggests that RNA was the first biopolymer to support Darwinian evolution. However, most "path-hypotheses" to generate building blocks for RNA require reduced nitrogen-containing compounds not made in useful amounts in the CO2-N2-H2O atmospheres of the Hadean. We review models for Earth's impact history that invoke a single ~10^23 kg impactor (Moneta) to account for measured amounts of platinum, gold, and other siderophilic ("iron-loving") elements on the Earth and Moon. If it were the last sterilizing impactor, by reducing the atmosphere but not the mantle, Moneta would have opened a "window of opportunity" for RNA synthesis, a period when RNA precursors rained from the atmosphere onto land holding oxidized minerals that stabilize advanced RNA precursors and RNA. Surprisingly, this combination of physics, geology, and chemistry suggests a time when RNA formation was most probable, ~120±100 million years after Moneta's impact, or ~4.36±0.1 billion years ago. Uncertainties in this time are driven by uncertainties in rates of productive atmosphere loss and amounts of sub-aerial land.

Eliminating Primer Dimers and Improving SNP detection using Self-Avoiding Molecular Recognition Systems (SAMRS)
Yang, Z., Le, J.T., Hutter, D., Bradley, K.M., Overton, B.R., McLendon, C., Benner, S.A.
Biol. Methods Protoc., Oxford Academic (2020) 5(1):bpaa004, DOI:10.1093/biomethods/bpaa004
<Abstract>

Despite its widespread value to molecular biology, the polymerase chain reaction (PCR) encounters modes that unproductively consume PCR resources and prevent clean signals, especially when high sensitivity, high SNP discrimination, and high multiplexing are sought. Here, we show how "self-avoiding molecular recognition systems" (SAMRS) manage such difficulties. SAMRS nucleobases pair with complementary nucleotides with strengths comparable to the A:T pair, but do not pair with other SAMRS nucleobases. This should allow primers holding SAMRS components to avoid primer-primer interactions, preventing primer dimers, allowing more sensitive SNP detection, and supporting higher levels of multiplex PCR. The experiments here examine the PCR performances of primers containing different numbers of SAMRS components placed strategically at different positions, and put these performances in the context of estimates of SAMRS:standard pairing strengths. The impact of these variables on primer dimer formation, the overall efficiency and sensitivity of SAMRS-based PCR, and the value of SAMRS primers when detecting single nucleotide polymorphisms (SNPs) are also evaluated. With appropriately chosen polymerases, SNP discrimination can be greater than the conventional allele-specific PCR, with the further benefit of avoiding primer dimer artifacts. General rules guiding the design of SAMRS-modified primers are offered to support medical research and clinical diagnostics products.

Electrochemical Reduction and Oxidation of Eight Unnatural 2'-Deoxynucleosides at a Pyrolytic Graphite Electrode
Spacek, J., Karalkar, N., Fojta, M., Wang, J., Benner, S. A.
Electrochimica Acta, International Society of Electrochemistry (2020) 362:137210, DOI:10.1016/j.electacta.2020.137210
<Abstract>

Recently we showed the reduction and oxidation of six natural 2'-deoxynucleosides in the presence of the ambient oxygen using the very broad potential window of a pyrolytic graphite electrode (PGE). Using the same procedure, 2'-deoxynucleoside analogs (dNs) that are parts of an artificially expanded genetic information system (AEGIS) were analyzed. Seven of the eight tested AEGIS dNs provided specific signals (voltammetric redox peaks). These signals, described here for the first time, will be used in future work to analyze DNA built from expanded genetic alphabets, helping to further develop AEGIS technology and its applications. Comparison of the electrochemical behavior of unnatural dNs with the previously documented behaviors of natural dNs also provides insights into the mechanisms of their respective redox processes.

Hachimoji DNA and RNA: A genetic system with eight building blocks
Hoshika S, Leal N, Kim MJ, Kim MS, Karalkar NB, Kim HJ, Bates AM, Watkins Jr. NE, SantaLucia HA, Meyer AJ, DasGupta S, Piccirilli JA, Ellington AD, SantaLucia Jr. J, Georgiadis MM, Benner SA
Science (2019) 22 Feb 2019: Vol. 363, Issue 6429, pp. 884-887. DOI: 10.1126/science.aat0971
<Abstract>

We report DNA- and RNA-like systems built from eight nucleotide "letters" (hence the name "hachimoji") that form four orthogonal pairs. These synthetic systems meet the structural requirements needed to support Darwinian evolution, including a polyelectrolyte backbone, predictable thermodynamic stability, and stereoregular building blocks that fit a Schrödinger aperiodic crystal. Measured thermodynamic parameters predict the stability of hachimoji duplexes, allowing hachimoji DNA to increase the information density of natural terran DNA. Three crystal structures show that the synthetic building blocks do not perturb the aperiodic crystal seen in the DNA double helix. Hachimoji DNA was then transcribed to give hachimoji RNA in the form of a functioning fluorescent hachimoji aptamer. These results expand the scope of molecular structures that might support life, including life throughout the cosmos.

Prebiotic Chemistry that Could Not Not Have Happened
Benner S.A., Kim H.-J., and Biondi E.
Life 9 (4), 84, MDPI (2019) https://doi.org/10.3390/life9040084
<Abstract>

We present a direct route by which RNA might have emerged in the Hadean from a fayalite-magnetite mantle, volcanic SO2 gas, and well-accepted processes that must have created substantial amounts of HCHO and catalytic amounts of glycolaldehyde in the Hadean atmosphere. In chemistry that could not not have happened, these would have generated stable bisulfite addition products that must have rained to the surface, where they unavoidably would have slowly released reactive species that generated higher carbohydrates. The formation of higher carbohydrates is self-limited by bisulfite formation, while borate minerals may have controlled aldol reactions that occurred on any semi-arid surface to capture that precipitation. All of these processes have well-studied laboratory correlates. Further, any semi-arid land with phosphate should have had phosphate anhydrides that, with NH3, gave carbohydrate derivatives that directly react with nucleobases to form the canonical nucleosides. These are phosphorylated by magnesium borophosphate minerals (e.g., luneburgite) and/or trimetaphosphate-borate with Ni2+ catalysis to give nucleoside 5'-diphosphates, which oligomerize to RNA via a variety of mechanisms. The reduced precursors that are required to form the nucleobases came, in this path-hypothesis, from one or more mid-sized (10^23-10^20 kg) impactors that almost certainly arrived after the Moon-forming event. Their iron metal content almost certainly generated ammonia, nucleobase precursors, and other reduced species in the Hadean atmosphere after it transiently placed the atmosphere out of redox equilibrium with the mantle. In addition to the inevitability of steps in this path-hypothesis on a Hadean Earth if it had semi-arid land, these processes may also have occurred on Mars.
Adapted from a lecture by the Corresponding Author at the All-Russia Science Festival at the Lomonosov Moscow State University on 12 October 2019; an outcome of a three-year project supported by the John Templeton Foundation and the NASA Astrobiology program. Dedicated to David Deamer on the occasion of his 80th birthday.

Multiplexed kit based on Luminex technology and achievements in synthetic biology discriminates Zika, chikungunya, and four serotypes of dengue viruses in mosquitoes.
Glushakova, L.G., Alto, B.W., Kim, M.-S., Hutter, D., Bradley, A., Bradley, K.M., Burkett-Cadena, N.D., Benner, S.A.
BMC Infect. Dis., BioMed Central Ltd. (2019) 19:418, DOI:10.1186/s12879-019-3998-z
<Abstract>

Background: The global expansion of dengue (DENV), chikungunya (CHIKV), and Zika viruses (ZIKV) is having a serious impact on public health. Because these arboviruses are transmitted by the same mosquito species and co-circulate in the same area, a sensitive diagnostic assay that detects them together, with discrimination, is needed.

Methods: We present here a diagnostics panel based on reverse transcription-PCR amplification of viral RNA and an xMap Luminex architecture involving direct hybridization of PCR amplicons and virus-specific probes. Two DNA innovations ("artificially expanded genetic information systems", AEGIS, and "self-avoiding molecular recognition systems", SAMRS) increase the hybridization sensitivity on Luminex microspheres and PCR specificity of the multiplex assay compared to the standard approach (standard nucleotides).

Results: The diagnostics panel detects, if they are present, these viruses with a resolution of 20 genome equivalents (DENV1), or 10 (DENV3-4, CHIKV) and 80 (DENV2, ZIKV) genome equivalents per assay. It identifies ZIKV, CHIKV and DENV RNAs in a single infected mosquito, in mosquito pools comprised of 5 to 50 individuals, and mosquito saliva (ZIKV, CHIKV, and DENV2). Infected mosquitoes and saliva were also collected on a cationic surface (Q-paper), which binds mosquito and viral nucleic acids electrostatically. All samples from infected mosquitoes displayed only target-specific signals; signals from non-infected samples were at background levels.

Conclusions: Our results provide an efficient and multiplex tool that may be used for surveillance of emerging mosquito-borne pathogens, which aids targeted mosquito control in areas at high risk for transmission.

Artificially Expanded Genetic Information Systems for New Aptamer Technologies
Elisa Biondi and Steven A. Benner
Biomedicines, MDPI (2018) 6, 53; doi:10.3390/biomedicines6020053
<Abstract>

Directed evolution was first applied to diverse libraries of DNA and RNA molecules a quarter century ago in the hope of gaining technology that would allow the creation of receptors, ligands, and catalysts on demand. Despite isolated successes, the outputs of this technology have been somewhat disappointing, perhaps because the four building blocks of standard DNA and RNA have too little functionality to have versatile binding properties, and offer too little information density to fold unambiguously. This review covers the recent literature that seeks to create an improved platform to support laboratory Darwinism, one based on an artificially expanded genetic information system (AEGIS) that adds independently replicating nucleotide "letters" to the evolving "alphabet".

"Skinny" and "Fat" DNA: Two New Double Helices
Hoshika S, Singh I, Switzer C, Molt RW Jr, Leal NA, Kim MJ, Kim MS, Kim HJ, Georgiadis MM, Benner SA
J. Am. Chem. Soc. (2018) Sep 19;140(37):11655-11660. doi: 10.1021/jacs.8b05042. Epub 2018 Sep 10
<Abstract>

According to the iconic model, the Watson-Crick double helix exploits nucleobase pairs that are both size complementary (big purines pair with small pyrimidines) and hydrogen bond complementary (hydrogen bond donors pair with hydrogen bond acceptors). Using a synthetic biology strategy, we report here the discovery of two new DNA-like systems that appear to support molecular recognition with the same proficiency as standard Watson-Crick DNA. However, these both violate size complementarity (big pairs with small), retaining hydrogen bond complementarity (donors pair with acceptors) as their only specificity principle. They exclude mismatches as well as standard Watson-Crick DNA excludes mismatches. In crystal structures, these "skinny" and "fat" systems form the expected hydrogen bonds, while conferring novel minor groove properties to the resultant duplex regions of the DNA oligonucleotides. Further, computational tools, previously tested primarily on natural DNA, appear to work well for these two new molecular recognition systems, offering a validation of the power of modern computational biology. These new molecular recognition systems may have application in materials science and synthetic biology, and in developing our understanding of alternative ways that genetic information might be stored and transmitted.

Multiplexed isothermal amplification based diagnostic platform to detect Zika, chikungunya, and dengue-1.
Yaren, O., Alto, B. W., Bradley, K. M., Moussatche, P., Benner, S. A.
J. Vis. Exp., JoVE (2018) 133: e57051, DOI:10.3791/57051
<Abstract>

Zika, dengue, and chikungunya viruses are transmitted by mosquitoes, causing diseases with similar patient symptoms. However, they have different downstream patient-to-patient transmission potentials, and require very different patient treatments. Thus, recent Zika outbreaks make it urgent to develop tools that rapidly discriminate these viruses in patients and trapped mosquitoes, to select the correct patient treatment, and to understand and manage their epidemiology in real time. Unfortunately, current diagnostic tests, including those receiving 2016 emergency use authorizations and fast-track status, detect viral RNA by reverse transcription polymerase chain reaction (RT-PCR), which requires instrumentation, trained users, and considerable sample preparation. Thus, they must be sent to "approved" reference laboratories, requiring time. Indeed, in August 2016, the Center for Disease Control (CDC) was asking pregnant women who had been bitten by a mosquito and developed a Zika-indicating rash to wait an unacceptable 2 to 4 weeks before learning whether they were infected. We very much need tests that can be done on site, with few resources, and by trained but not necessarily licensed personnel. This video demonstrates an assay that meets these specifications, working with urine or serum (for patients) or crushed mosquito carcasses (for environmental surveillance), all without much sample preparation. Mosquito carcasses are captured on paper carrying quaternary ammonium groups (Q-paper) followed by ammonia treatment to manage biohazards. These are then directly, without RNA isolation, put into assay tubes containing freeze-dried reagents that need no chain of refrigeration. A modified form of reverse transcription loop-mediated isothermal amplification with target-specific fluorescently tagged displaceable probes produces readout, in 30 min, as a three-color fluorescence signal. This is visualized with a handheld, battery-powered device with an orange filter. 
Forward contamination is prevented with sealed tubes, and the use of thermolabile uracil DNA glycosylase (UDG) in the presence of dUTP in the amplification mixture.

Nucleoside analogs to manage sequence divergence in nucleic acid amplification and SNP detection.
Yang, Z., Kim, H.-J., Le, J., McLendon, C., Bradley, K.M., Kim, M.-S., Hutter, D., Hoshika, S., Yaren, O., Benner, S.A.
Nucl. Acids Res. (2018) 46(12): 5902-10, DOI:10.1093/nar/gky392
<Abstract>

Described here are the synthesis, enzymology and some applications of a purine nucleoside analog (H) designed to have two tautomeric forms, one complementary to thymidine (T), the other complementary to cytidine (C). The performance of H is compared by various metrics to performances of other 'biversal' analogs that similarly rely on tautomerism to complement both pyrimidines. These include (i) the thermodynamic stability of duplexes that pair these biversals with various standard nucleotides, (ii) the ability of the biversals to support polymerase chain reaction (PCR), (iii) the ability of primers containing biversals to equally amplify targets having polymorphisms in the primer binding site, and (iv) the ability of ligation-based assays to exploit the biversals to detect medically relevant single nucleotide polymorphisms (SNPs) in sequences flanked by medically irrelevant polymorphisms. One advantage of H over the widely used inosine 'universal base' and 'mixed sequence' probes is seen in ligation-based assays to detect SNPs. The need to detect medically relevant SNPs within ambiguous sequences is especially important when probing RNA viruses, which rapidly mutate to create drug resistance, but also suffer neutral drift, the second obstructing simple methods to detect the first. Thus, H is being developed to detect variants of viruses that are rapidly mutating.

Detection of Mosquito-borne Arboviruses by twenty two multiplexed xMAP Luminex Arrays Panel
Glushakova, L.G., Alto, B.W., Bradley, A., Benner, S.A.
Res. & Rev. J. Microbiol. Biotechnol., Research & Reviews (2018) 7(2):36-40
<Abstract>

A 22-fold multiplexed Luminex panel to detect medically important arboviruses was previously reported targeting small RNAs produced in vitro. Two synthetic biology innovations made this level of multiplexing possible. We asked if multiplexing was robust when nine targets were added from the Flaviviridae, Togaviridae and Bunyaviridae families. Assay sensitivities were 10, 80, and 270 genome-equivalents for California encephalitis, chikungunya and Murray Valley encephalitis viruses, indicating the robustness of multiplexing using these innovations. Amounts of these viruses in a single infected mosquito are typically higher. Further, the panel identified Murray Valley encephalitis virus in a single infected-mosquito pooled with (5-50) uninfected mosquitoes.

Adsorption of RNA on mineral surfaces and mineral precipitates
Elisa Biondi, Yoshihiro Furukawa, Jun Kawai, and Steven A. Benner
Beilstein J. Org. Chem., Beilstein Institute (2017) 13, 393-404
<Abstract>

The prebiotic significance of laboratory experiments that study the interactions between oligomeric RNA and mineral species is difficult to know. Natural exemplars of specific minerals can differ widely depending on their provenance. While laboratory-generated samples of synthetic minerals can have controlled compositions, they are often viewed as "unnatural". Here, we show how trends in the interaction of RNA with natural mineral specimens, synthetic mineral specimens, and co-precipitated pairs of synthetic minerals, can make a persuasive case that the observed interactions reflect the composition of the minerals themselves, rather than their being simply examples of large molecules associating nonspecifically with large surfaces. Using this approach, we have discovered Periodic Table trends in the binding of oligomeric RNA to alkaline earth carbonate minerals and alkaline earth sulfate minerals, where those trends are the same when measured in natural and synthetic minerals. They are also validated by comparison of co-precipitated synthetic minerals. We also show differential binding of RNA to polymorphic forms of calcium carbonate, and the stabilization of bound RNA on aragonite. These have relevance to the prebiotic stabilization of RNA, where such carbonate minerals are expected to have been abundant, as they appear to be today on Mars.

Tautomeric equilibria of iso-guanine and related purine analogs
Nilesh B. Karalkar, Kshitij Khare, Robert Molt, and Steven A. Benner
Nucleosides Nucleotides Nucleic Acids, Taylor & Francis Group (2017) Apr 3;36(4):256-274. doi: 10.1080/15257770.2016.1268694
<Abstract>

Nucleobase pairs in DNA match hydrogen-bond donor and acceptor groups on the nucleobases. However, these can adopt more than one tautomeric form, and can consequently pair with nucleobases other than their canonical complements, possibly a source of natural mutation. These issues are now being revisited by synthetic biologists increasing the number of replicable pairs in DNA by exploiting unnatural hydrogen bonding patterns, where tautomerism can also create mutation. Here, we combine spectroscopic measurements on methylated analogs of isoguanine tautomers and tautomeric mixtures with statistical analyses to a set of isoguanine analogs, the complement of isocytosine, the 5th and 6th "letters" in DNA.

Prebiotic stereoselective synthesis of purine and noncanonical pyrimidine nucleotide from nucleobases and phosphorylated carbohydrates
Hyo-Joong Kim, Steven A. Benner
Proc. Natl. Acad. Sci. USA (2017) October, 114 (43) 11315-11320. https://doi.org/10.1073/pnas.1710778114
<Abstract>

According to a current "RNA first" model for the origin of life, RNA emerged in some form on early Earth to become the first biopolymer to support Darwinism here. Threose nucleic acid (TNA) and other polyelectrolytes are also considered as the possible first Darwinian biopolymer(s). This model is being developed by research pursuing a "Discontinuous Synthesis Model" (DSM) for the formation of RNA and/or TNA from precursor molecules that might have been available on early Earth from prebiotic reactions, with the goal of making the model less discontinuous. In general, this is done by examining the reactivity of isolated products from proposed steps that generate those products, with increasing complexity of the reaction mixtures in the proposed mineralogical environments. Here, we report that adenine, diaminopurine, and hypoxanthine nucleoside phosphates and a noncanonical pyrimidine nucleoside (zebularine) phosphate can be formed from the direct coupling reaction of cyclic carbohydrate phosphates with the free nucleobases. The reaction is stereoselective, giving only the β-anomer of the nucleotides within detectable limits. For purines, the coupling is also regioselective, giving the N-9 nucleotide for adenine as a major product. In the DSM, phosphorylated carbohydrates are presumed to have been available via reactions explored previously [Krishnamurthy R, Guntha S, Eschenmoser A (2000) Angew Chem Int Ed 39:2281-2285], while nucleobases are presumed to have been available from hydrogen cyanide and other nitrogenous species formed in Earth's primitive atmosphere.

Detecting Darwinism from Molecules in the Enceladus Plumes, Jupiter's Moons, and Other Planetary Water Lagoons
Steven A. Benner
Astrobiology 17 840-851 (2017) DOI: 10.1089/ast.2016.1611
<Abstract>

To the astrobiologist, Enceladus offers easy access to a potential subsurface biosphere via the intermediacy of a plume of water emerging directly into space. A direct question follows: If we were to collect a sample of this plume, what in that sample, through its presence or its absence, would suggest the presence and/or absence of life in this exotic locale? This question is, of course, relevant for life detection in any aqueous lagoon that we might be able to sample. This manuscript reviews physical chemical constraints that must be met by a genetic polymer for it to support Darwinism, a process believed to be required for a chemical system to generate properties that we value in biology. We propose that the most important of these is a repeating backbone charge; a Darwinian genetic biopolymer must be a "polyelectrolyte". Relevant to mission design, such biopolymers are especially easy to recover and concentrate from aqueous mixtures for detection, simply by washing the aqueous mixtures across a polycharged support. Several device architectures are described to ensure that, once captured, the biopolymer meets two other requirements for Darwinism, homochirality and a small building block "alphabet." This approach is compared and contrasted with alternative biomolecule detection approaches that seek homochirality and constrained alphabets in non-encoded biopolymers. This discussion is set within a model for the history of the terran biosphere, identifying points in that natural history where these alternative approaches would have failed to detect terran life. Key Words: Enceladus; Life detection; Europa; Icy moon; Biosignatures; Polyelectrolyte theory of the gene.

Expanded Genetic Alphabets. Managing Nucleotides that Lack Tautomeric, Protonated, or Deprotonated Versions Complementary to Natural Nucleotides
Winiger, C.B., Shaw, R.W., Kim, M.J., Moses, J.D., Matsuura, M.F. and Benner, S.A.
ACS Synthetic Biology, American Chemical Society (2017) 55(51):15816-20, DOI:10.1002/anie.201608001
<Abstract>

2,4-Diaminopyrimidine (trivially K) and imidazo[1,2-a]-1,3,5-triazine-2(8H)-4(3H)-dione (trivially X) form a nucleobase pair with Watson-Crick geometry as part of an artificially expanded genetic information system (AEGIS). Neither K nor X can form a Watson-Crick pair with any natural nucleobase. Further, neither K nor X has an accessible tautomeric form or a protonated/deprotonated state that can form a Watson-Crick pair with any natural nucleobase. In vitro experiments show how DNA polymerase I from E. coli manages replication of DNA templates with one K:X pair, but fails with templates containing two adjacent K:X pairs. In analogous in vivo experiments, E. coli lacking dKTP/dXTP cannot rescue chloramphenicol resistance from a plasmid containing two adjacent K:X pairs. These studies identify bacteria able to serve as selection environments for engineering cells that replicate AEGIS pairs that lack forms that are Watson-Crick complementary to any natural nucleobase.

Point of sampling detection of Zika virus within a multiplexed kit capable of detecting dengue and chikungunya
Yaren, O., Alto, B.W., Gangodkar, P.V., Ranade, S.R., Patil, K.N., Bradley, K.M., Yang, Z., Phadke, N., Benner, S.A.
BMC Infect. Dis., BioMed Central Ltd. (2017) 17(1):293, DOI:10.1186/s12879-017-2382-0
<Abstract>

Background: Zika, dengue, and chikungunya are three mosquito-borne viruses having overlapping transmission vectors. They cause diseases having similar symptoms in human patients, but requiring different immediate management steps. Therefore, rapid (< one hour) discrimination of these three viruses in patient samples and trapped mosquitoes is needed. The need for speed precludes any assay that requires complex up-front sample preparation, such as extraction of nucleic acids from the sample. Also precluded in robust point-of-sampling assays is downstream release of the amplicon mixture, as this risks contamination of future samples that will give false positives.

Methods: Procedures are reported that directly test urine and plasma (for patient diagnostics) or crushed mosquito carcasses (for environmental surveillance). Carcasses are captured on paper samples carrying quaternary ammonium groups (Q-paper), which may be directly introduced into the assay. To avoid the time and instrumentation requirements of PCR, the procedure uses loop-mediated isothermal amplification (LAMP). Downstream detection is done in sealed tubes, with dTTP-dUTP mixtures in the LAMP with a thermolabile uracil DNA glycosylase (UDG); this offers a second mechanism to prevent forward contamination. Reverse transcription LAMP (RT-LAMP) reagents are distributed dry without requiring a continuous chain of refrigeration.

Results: The tests detect viral RNA in unprocessed urine and other biological samples, distinguishing Zika, chikungunya, and dengue in urine and in mosquitoes infected with live Zika and chikungunya viruses. The limits of detection (LODs) are ~0.71 pfu equivalent viral RNAs for Zika, ~1.22 pfu equivalent viral RNAs for dengue, and ~38 copies of chikungunya viral RNA. A handheld, battery-powered device with an orange filter was constructed to visualize the output. Preliminary data showed that this architecture, working with pre-prepared tubes holding lyophilized reagent/enzyme mixtures and shipped without a chain of refrigeration, also worked with human plasma samples to detect chikungunya and dengue in Pune, India.

Conclusions: A kit, complete with a visualization device, is now available for point-of-sampling detection of Zika, chikungunya, and dengue. The assay output is read in ca. 30 min by visualizing (human eye) three-color coded fluorescence signals. Assay in dried format allows it to be run in low-resource environments.

Detection of chikungunya viral RNA in mosquito bodies on cationic (Q) paper based on innovations in synthetic biology
Glushakova, L.G., Alto, B.W., Kim, M.S., Bradley, A., Yaren, O., Benner, S.A.
J Virol Methods, Elsevier (2017) 246:104-11, DOI:10.1016/j.jviromet.2017.04.013
<Abstract>

Chikungunya virus (CHIKV) represents a growing and global concern for public health that needs inexpensive and convenient methods to collect mosquitoes as potential carriers so that they can be preserved, stored and transported for later and/or remote analysis. Reported here is a cellulose-based paper, derivatized with quaternary ammonium groups ("Q-paper") that meets these needs. In a series of tests, infected mosquito bodies were squashed directly on Q-paper. Aqueous ammonia was then added on the mosquito bodies to release viral RNA that adsorbed on the cationic surface via electrostatic interactions. The samples were then stored (frozen) or transported. For analysis, the CHIKV nucleic acids were eluted from the Q-paper and PCR amplified in a workflow, previously developed, that also exploited two nucleic acid innovations ("artificially expanded genetic information systems", AEGIS, and "self-avoiding molecular recognition systems", SAMRS). The amplicons were then analyzed by a Luminex hybridization assay. This procedure detected CHIKV RNA, if present, in each infected mosquito sample, but not in non-infected counterparts or ddH2O sample washes, with testing one week or ten months after sample collection.

Assays To Detect the Formation of Triphosphates of Unnatural Nucleotides: Application to Escherichia coli Nucleoside Diphosphate Kinase
Mariko F. Matsuura, Ryan W. Shaw, Jennifer D. Moses, Hyo-Joong Kim, Myong-Jung Kim, Myong-Sang Kim, Shuichi Hoshika, Nilesh Karalkar, and Steven A. Benner
ACS Synthetic Biology, American Chemical Society (2016) 5 (3), pp 234-240 DOI: 10.1021/acssynbio.5b00172
<Abstract>

One frontier in synthetic biology seeks to move artificially expanded genetic information systems (AEGIS) into natural living cells and to arrange the metabolism of those cells to allow them to replicate plasmids built from these unnatural genetic systems. In addition to requiring polymerases that replicate AEGIS oligonucleotides, such cells require metabolic pathways that biosynthesize the triphosphates of AEGIS nucleosides, the substrates for those polymerases. Such pathways generally require nucleoside and nucleotide kinases to phosphorylate AEGIS nucleosides and nucleotides on the path to these triphosphates. Thus, constructing such pathways focuses on engineering natural nucleoside and nucleotide kinases, which often do not accept the unnatural AEGIS biosynthetic intermediates. This, in turn, requires assays that allow the enzyme engineer to follow the kinase reaction, assays that are easily confused by ATPase and other spurious activities that might arise through "site-directed damage" of the natural kinases being engineered. This article introduces three assays that can detect the formation of both natural and unnatural deoxyribonucleoside triphosphates, assessing their value as polymerase substrates at the same time as monitoring the progress of kinase engineering. Here, we focus on two complementary AEGIS nucleoside diphosphates, 6-amino-5-nitro-3-(1'-β-D-2'-deoxyribofuranosyl)-2(1H)-pyridone and 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one. These assays provide new ways to detect the formation of unnatural deoxyribonucleoside triphosphates in vitro and to confirm their incorporation into DNA. Thus, these assays can be used with other unnatural nucleotides.

Synthesis and Enzymology of 2'-Deoxy-7-deazaisoguanosine Triphosphate and Its Complement: A Second Generation Pair in an Artificially Expanded Genetic Information System
Karalkar NB, Leal NA, Kim MS, Bradley KM, Benner SA
ACS Synthetic Biology, American Chemical Society (2016) doi: 10.1021/acssynbio.5b00276
<Abstract>

As with natural nucleic acids, pairing between artificial nucleotides can be influenced by tautomerism, with different placements of protons on the heterocyclic nucleobase changing patterns of hydrogen bonding that determine replication fidelity. For example, the major tautomer of isoguanine presents a hydrogen bonding donor-donor-acceptor pattern complementary to the acceptor-acceptor-donor pattern of 5-methylisocytosine. However, in its minor tautomer, isoguanine presents a hydrogen bond donor-acceptor-donor pattern complementary to thymine. Calculations, crystallography, and physical organic experiments suggest that this tautomeric ambiguity might be "fixed" by replacing the N-7 nitrogen of isoguanine by a CH unit. To test this hypothesis, we prepared the triphosphate of 2'-deoxy-7-deazaisoguanosine and used it in PCR to estimate an effective tautomeric ratio "seen" by Taq DNA polymerase. With 7-deazaisoguanine, fidelity-per-round was ~92%. The analogous PCR with isoguanine gave a lower fidelity-per-round of ~86%. These results confirm the hypothesis with polymerases, and deepen our understanding of the role of minor groove hydrogen bonding and proton tautomerism in both natural and expanded genetic "alphabets", major targets in synthetic biology.
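
As a rough illustration of what the two per-round fidelities imply, the sketch below compounds them over several PCR rounds. It assumes, as a simplification not stated in the abstract, that loss of the unnatural pair is independent and identical in every round, so the fraction retained after n rounds is f^n. The function name and the round counts are hypothetical.

```python
def retained_fraction(fidelity_per_round: float, rounds: int) -> float:
    """Fraction of unnatural pairs retained after `rounds` rounds.

    Assumes each round independently retains the pair with probability
    `fidelity_per_round` (a simplifying assumption for illustration).
    """
    if not 0.0 <= fidelity_per_round <= 1.0:
        raise ValueError("fidelity_per_round must lie in [0, 1]")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return fidelity_per_round ** rounds


if __name__ == "__main__":
    # Per-round fidelities quoted in the abstract; round counts are arbitrary.
    reported = {"7-deazaisoguanine": 0.92, "isoguanine": 0.86}
    for name, fidelity in reported.items():
        for n in (1, 5, 10):
            fraction = retained_fraction(fidelity, n)
            print(f"{name}: {fraction:.3f} retained after {n} round(s)")
```

Under this assumption, the gap between the two nucleobases widens with every round, which is why a difference of a few percentage points per round matters in practice.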

Standard and AEGIS nicking molecular beacons detect amplicons from the Middle East respiratory syndrome coronavirus
Ozlem Yaren, Lyudmyla G. Glushakova, Kevin M. Bradley, Shuichi Hoshika, Steven A. Benner
J Virol Methods (236), Elsevier 54-61 (2016) doi:10.1016/j.jviromet.2016.07.008
<Abstract>

This paper combines two advances to detect MERS-CoV, the causative agent of Middle East Respiratory Syndrome, that have emerged over the past few years from the new field of "synthetic biology". Both are based on an older concept, where molecular beacons are used as the downstream detection of viral RNA in biological mixtures followed by reverse transcription PCR amplification. The first advance exploits the artificially expanded genetic information systems (AEGIS). AEGIS adds nucleotides to the four found in standard DNA and RNA (xNA); AEGIS nucleotides pair orthogonally to the A:T and G:C pairs. Placing AEGIS components in the stems of molecular beacons is shown to lower noise by preventing unwanted stem invasion by adventitious natural xNA. This should improve the signal-to-noise ratio of molecular beacons operating in complex biological mixtures. The second advance introduces a nicking enzyme that allows a single target molecule to activate more than one beacon, allowing "signal amplification". Combining these technologies in primers with components of a self-avoiding molecular recognition system (SAMRS), we detect 50 copies of MERS-CoV RNA in a multiplexed respiratory virus panel by generating fluorescence signal visible to human eye and/or camera.

A norovirus detection architecture based on isothermal amplification and expanded genetic systems
Ozlem Yaren, Kevin M. Bradley, Patricia Moussatche, Shuichi Hoshika, Zunyi Yang, Shu Zhu, Stephanie M. Karst, Steven A. Benner
J Virol Methods (237), Elsevier 64-71 (2016) doi: 10.1016/j.jviromet.2016.08.012
<Abstract>

Noroviruses are the major cause of global viral gastroenteritis with short incubation times and small inoculums required for infection. This creates a need for a rapid molecular test for norovirus for early diagnosis, in the hope of preventing the spread of the disease. Non-chemists generally use off-the-shelf reagents and natural DNA to create such tests, suffering from background noise that comes from adventitious DNA and RNA (collectively xNA) that is abundant in real biological samples, especially feces, a common location for norovirus. Here, we create an assay that combines artificially expanded genetic information systems (AEGIS, which adds nucleotides to the four in standard xNA, pairing orthogonally to A:T and G:C) with loop-mediated isothermal amplification (LAMP) to amplify norovirus RNA at constant temperatures, without the power or instrument requirements of PCR cycling. This assay was then validated using feces contaminated with murine norovirus (MNV). Treating stool samples with ammonia extracts the MNV RNA, which is then amplified in an AEGIS-RT-LAMP where AEGIS segments are incorporated both into an internal LAMP primer and into a molecular beacon stem, the second lowering background signaling noise. This is coupled with RNase H nicking during sample amplification, allowing detection of as few as 10 copies of noroviral RNA in a stool sample, generating a fluorescent signal visible to human eye, all in a closed reaction vessel.

Polymerase Interactions with Wobble Mismatches in Synthetic Genetic Systems and Their Evolutionary Implications
Christian B. Winiger, Myong-Jung Kim, Shuichi Hoshika, Ryan W. Shaw, Jennifer D. Moses, Mariko F. Matsuura, Dietlind L. Gerloff, and Steven A. Benner
Biochemistry 55 (28) 3847-3850 (2016) DOI: 10.1021/acs.biochem.6b00533
<Abstract>

In addition to completing the Watson-Crick nucleobase matching "concept" (big pairs with small, hydrogen bond donors pair with hydrogen bond acceptors), artificially expanded genetic information systems (AEGIS) also challenge DNA polymerases with a complete set of mismatches, including wobble mismatches. Here, we explore wobble mismatches with AEGIS with DNA polymerase 1 from Escherichia coli. Remarkably, we find that the polymerase tolerates an AEGIS:standard wobble that has the same geometry as the G:T wobble that polymerases have evolved to exclude but excludes a wobble geometry that polymerases have never encountered in natural history. These results suggest certain limits to "structural analogy" and "evolutionary guidance" as tools to help synthetic biologists expand DNA alphabets.

Laboratory evolution of artificially expanded DNA gives redesignable aptamers that target the toxic form of anthrax protective antigen
Biondi E, Lane JD, Das D, Dasgupta S, Piccirilli JA, Hoshika S, Bradley KM, Krantz BA, Benner SA
Nucl. Acids Res. (2016) Oct 3. pii: gkw890. PubMed PMID: 27701076
<Abstract>

Reported here is a laboratory in vitro evolution (LIVE) experiment based on an artificially expanded genetic information system (AEGIS). This experiment delivers the first example of an AEGIS aptamer that binds to an isolated protein target, the first whose structural contact with its target has been outlined and the first to inhibit biologically important activities of its target, the protective antigen from Bacillus anthracis. We show how rational design based on secondary structure predictions can also direct the use of AEGIS to improve the stability and binding of the aptamer to its target. The final aptamer has a dissociation constant of ~35 nM. These results illustrate the value of AEGIS-LIVE for those seeking to obtain receptors and ligands without the complexities of medicinal chemistry, and also challenge the biophysical community to develop new tools to analyze the spectroscopic signatures of new DNA folds that will emerge in synthetic genetic systems replacing standard DNA and RNA as platforms for LIVE.

Crystal structures of deprotonated nucleobases from an expanded DNA alphabet
Mariko F. Matsuura, Hyo-Joong Kim, Daisuke Takahashi, Khalil A. Abboud and Steven A. Benner
Structural Chemistry, Acta Crystallographica (2016) C72, 952-959. doi: 10.1107/S2053229616017071
<Abstract>

Reported here is the crystal structure of a heterocycle that implements a donor–donor–acceptor hydrogen-bonding pattern, as found in the Z component [6-amino-5-nitropyridin-2(1H)-one] of an artificially expanded genetic information system (AEGIS). AEGIS is a new form of DNA from synthetic biology that has six replicable nucleotides, rather than the four found in natural DNA. Remarkably, Z crystallizes from water as a 1:1 complex of its neutral and deprotonated forms, and forms a ‘skinny’ pyrimidine–pyrimidine pair in this structure. The pair resembles the known intercalated cytosine pair. The formation of the same pair in two different salts, namely poly[[aqua(µ6-2-amino-6-oxo-3-nitro-1,6-dihydropyridin-1-ido)sodium]–6-amino-5-nitropyridin-2(1H)-one–water (1/1/1)], denoted Z-Sod, {[Na(C5H4N3O3)(H2O)]·C5H5N3O3·H2O}n, and ammonium 2-amino-6-oxo-3-nitro-1,6-dihydropyridin-1-ide–6-amino-5-nitropyridin-2(1H)-one–water (1/1/1), denoted Z-Am, NH4+·C5H4N3O3·C5H5N3O3·H2O, under two different crystallization conditions suggests that the pair is especially stable. Implications of this structure for the use of this heterocycle in artificial DNA are discussed.

A Single Deoxynucleoside Kinase Variant from Drosophila melanogaster Synthesizes Monophosphates of Nucleosides That Are Components of an Expanded Genetic System
Mariko F. Matsuura, Christian B. Winiger, Ryan W. Shaw, Myong-Jung Kim, Myong-Sang Kim, Ashley B. Daugherty, Fei Chen, Patricia Moussatche, Jennifer D. Moses, Stefan Lutz, and Steven A. Benner
ACS Synthetic Biology, American Chemical Society (2016) DOI: 10.1021/acssynbio.6b00228
<Abstract>

Deoxynucleoside kinase from D. melanogaster (DmdNK) has broad specificity; although it catalyzes the phosphorylation of natural pyrimidine more efficiently than natural purine nucleosides, it accepts all four 2'-deoxynucleosides and many analogues, using ATP as a phosphate donor to give the corresponding deoxynucleoside monophosphates. Here, we show that replacing a single amino acid (glutamine 81 by glutamate) in DmdNK creates a variant that also catalyzes the phosphorylation of nucleosides that form part of an artificially expanded genetic information system (AEGIS). By shuffling hydrogen bonding groups on the nucleobases, AEGIS adds potentially as many as four additional nucleobase pairs to the genetic "alphabet". Specifically, we show that DmdNK Q81E creates the monophosphates from the AEGIS nucleosides dP, dZ, dX, and dK (respectively 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one, dP; 6-amino-3-(1'-β-D-2'-deoxyribofuranosyl)-5-nitro-1H-pyridin-2-one, dZ; 8-(1'-β-D-2'-deoxyribofuranosyl)imidazo[1,2-a]-1,3,5-triazine-2(8H)-4(3H)-dione, dX; and 2,4-diamino-5-(1'-β-D-2'-deoxyribofuranosyl)-pyrimidine, dK). Using a coupled enzyme assay, in vitro kinetic parameters were obtained for three of these nucleosides (dP, dX, and dK; the UV absorbance of dZ made it impossible to get its precise kinetic parameters). Thus, DmdNK Q81E appears to be a suitable enzyme to catalyze the first step in the biosynthesis of AEGIS 2'-deoxynucleoside triphosphates in vitro and, perhaps, in vivo, in a cell able to manage plasmids containing AEGIS DNA.

Evaporite Borate-Containing Mineral Ensembles Make Phosphate Available and Regiospecifically Phosphorylate Ribonucleosides: Borate as a Multifaceted Problem Solver in Prebiotic Chemistry
Kim, H.J., Furukawa, Y., Kakegawa, T., Bita, A., Scorei, R. and Benner, S.A.
Angew. Chem. Int. Ed. (2016) 55(51):15816-20, DOI:10.1002/anie.201608001
<Abstract>

RNA is currently thought to have been the first biopolymer to support Darwinian natural selection on Earth. However, the phosphate esters in RNA and its precursors, and the many sites at which phosphorylation might occur in ribonucleosides under conditions that make it possible, challenge prebiotic chemists. Moreover, free inorganic phosphate may have been scarce on early Earth owing to its sequestration by calcium in the unreactive mineral hydroxyapatite. Herein, it is shown that these problems can be mitigated by a particular geological environment that contains borate, magnesium, sulfate, calcium, and phosphate in evaporite deposits. Actual geological environments, reproduced here, show that Mg²⁺ and borate sequester phosphate from calcium to form the mineral lüneburgite. Ribonucleosides stabilized by borate mobilize borate and phosphate from lüneburgite, and are then regiospecifically phosphorylated by the mineral. Thus, in addition to guiding carbohydrate pre-metabolism, borate minerals in evaporite geoorganic contexts offer a solution to the phosphate problem in the "RNA first" model for the origins of life.

Evolution of functional six-nucleotide DNA
Zhang, L., Yang, Z., Sefah, K., Bradley, K. M., Hoshika, S., Kim, M-J., Kim, H-J., Zhu, Jimenez, E., Cansiz, S., Teng, I-T., Champanhac, C., McLendon, C., Liu, C., Zhang, W., Gerloff, D. L., Huang, Z., Tan, W., Benner, S. A.
J. Am. Chem. Soc. (2015) DOI: 10.1021/jacs.5b02251
<Abstract>

Axiomatically, the density of information stored in DNA, with just four nucleotides (GACT), is higher than in a binary code, but less than it might be if synthetic biologists succeed in adding independently replicating nucleotides to genetic systems. Such addition could also add additional functional groups, not found in natural DNA but useful for molecular performance. Here, we consider two new nucleotides (Z and P, 6-amino-5-nitro-3-(1'-β-D-2'-deoxyribofuranosyl)-2(1H)-pyridone and 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one). These are designed to pair via strict Watson-Crick geometry. These were added to a laboratory in vitro evolution (LIVE) experiment; the GACTZP library was challenged to deliver molecules that bind selectively to liver cancer cells, but not to untransformed liver cells. Unlike in classical in vitro selection systems, low levels of mutation allow this system to evolve to create binding molecules not necessarily present in the original library. Over a dozen binding species were recovered. The best had Z and/or P in their sequences. Several had multiple, nearby, and adjacent Zs and Ps. Only the weaker binders contained no Z or P at all. This suggests that this system explored much of the sequence space available to this genetic system and that GACTZP libraries are richer reservoirs of functionality than standard libraries.
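
The opening comparison of information density can be made concrete with a short calculation. This is a minimal sketch that assumes an idealized code in which every letter is equally likely and chosen independently, so the maximum information per position is log2 of the alphabet size. The function name is hypothetical and the calculation is not taken from the paper.

```python
import math


def bits_per_position(alphabet_size: int) -> float:
    """Maximum information per position, in bits, for a uniform alphabet."""
    if alphabet_size < 1:
        raise ValueError("alphabet_size must be at least 1")
    return math.log2(alphabet_size)


if __name__ == "__main__":
    # Alphabet sizes for a binary code, standard DNA, and six-letter DNA.
    for label, size in (("binary", 2), ("GACT", 4), ("GACTZP", 6)):
        print(f"{label}: {bits_per_position(size):.3f} bits per position")
```

Under these assumptions a binary code carries 1 bit per position, GACT carries 2 bits, and GACTZP carries about 2.58 bits.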

Pooled assembly of marine metagenomic datasets: enriching annotation through chimerism
Jonathan D. Magasin and Dietlind L. Gerloff
Bioinformatics (2015) 31:311-317
<Abstract>

Motivation: Despite advances in high-throughput sequencing, marine metagenomic samples remain largely opaque. A typical sample contains billions of microbial organisms from thousands of genomes and quadrillions of DNA base pairs. Its derived metagenomic dataset underrepresents this complexity by orders of magnitude because of the sparseness and shortness of sequencing reads. Read shortness and sequencing errors pose a major challenge to accurate species and functional annotation. This includes distinguishing known from novel species. Often the majority of reads cannot be annotated and thus cannot help our interpretation of the sample.
Results: Here, we demonstrate quantitatively how careful assembly of marine metagenomic reads within, but also across, datasets can alleviate this problem. For 10 simulated datasets, each with species complexity modeled on a real counterpart, chimerism remained within the same species for most contigs (97%). For 42 real pyrosequencing ('454') datasets, assembly increased the proportion of annotated reads, and even more so when datasets were pooled, by on average 1.6% (max 6.6%) for species, 9.0% (max 28.7%) for Pfam protein domains and 9.4% (max 22.9%) for PANTHER gene families. Our results outline exciting prospects for data sharing in the metagenomics community. While chimeric sequences should be avoided in other areas of metagenomics (e.g. biodiversity analyses), conservative pooled assembly is advantageous for annotation specificity and sensitivity. Intriguingly, our experiment also found potentia

Detecting respiratory viral RNA using expanded genetic alphabets and self-avoiding DNA
Lyudmyla G. Glushakova, Nidhi Sharma, Shuichi Hoshika, Andrea C. Bradley, Kevin M. Bradley, Zunyi Yang, Steven A. Benner
Anal Biochem, Elsevier (2015) Nov 15;489:62-72. doi: 10.1016/j.ab.2015.08.015
<Abstract>

Nucleic acid (NA)-targeted tests detect and quantify viral DNA and RNA (collectively xNA) to support epidemiological surveillance and, in individual patients, to guide therapy. They commonly use polymerase chain reaction (PCR) and reverse transcription PCR. Although these all have rapid turnaround, they are expensive to run. Multiplexing would allow their cost to be spread over multiple targets, but often only with lower sensitivity and accuracy, noise, false positives, and false negatives; these arise by interactions between the multiple nucleic acid primers and probes in a multiplexed kit. Here we offer a multiplexed assay for a panel of respiratory viruses that mitigates these problems by combining several nucleic acid analogs from the emerging field of synthetic biology: (i) self-avoiding molecular recognition systems (SAMRSs), which facilitate multiplexing, and (ii) artificially expanded genetic information systems (AEGISs), which enable low-noise PCR. These are supplemented by "transliteration" technology, which converts standard nucleotides in a target to AEGIS nucleotides in a product, improving hybridization. The combination supports a multiplexed Luminex-based respiratory panel that potentially differentiates influenza viruses A and B, respiratory syncytial virus, severe acute respiratory syndrome coronavirus (SARS), and Middle East respiratory syndrome (MERS) coronavirus, detecting as few as 10 MERS virions in a 20-ml sample.

High-throughput multiplexed xMAP Luminex array panel for detection of twenty two medically important mosquito-borne arboviruses based on innovations in synthetic biology
Lyudmyla G. Glushakova, Andrea Bradley, Kevin M. Bradley, Barry W. Alto, Shuichi Hoshika, Daniel Hutter, Nidhi Sharma, Zunyi Yang, Myong-Jung Kim, Steven A. Benner
J Virol Methods 214, Elsevier 60-74 (2015) doi: 10.1016/j.jviromet.2015.01.003
<Abstract>

Mosquito-borne arboviruses are emerging world-wide as important human and animal pathogens. This makes assays for their accurate and rapid identification essential for public health, epidemiological, and ecological studies. Over the past decade, many mono- and multiplexed assays targeting arbovirus nucleic acids have been reported. None has become established for the routine identification of multiple viruses in a "single tube" setting. With increasing multiplexing, the detection of viral RNAs is complicated by noise, false positives and negatives. In this study, an assay was developed that avoids these problems by combining two new kinds of nucleic acids emerging from the field of synthetic biology. The first is a "self-avoiding molecular recognition system" (SAMRS), which enables high levels of multiplexing. The second is an "artificially expanded genetic information system" (AEGIS), which enables clean PCR amplification in nested PCR formats. A conversion technology was used to place AEGIS components into amplicons, improving their efficiency of hybridization on Luminex beads. When Luminex "liquid microarrays" are exploited for downstream detection, this combination supports single-tube PCR amplification assays that can identify 22 mosquito-borne RNA viruses from the genera Flavivirus, Alphavirus, and Orthobunyavirus. The assay differentiates between closely related viruses, such as dengue, West Nile, Japanese encephalitis, and the California serological group. The performance and the sensitivity of the assay were evaluated with dengue viruses and infected mosquitoes; as few as 6-10 dengue virions can be detected in a single mosquito.

Ribonucleosides for an Artificially Expanded Genetic Information System
Hyo-Joong Kim, Nicole A. Leal, Shuichi Hoshika, Steven A. Benner
J. Org. Chem. (2014) 79 (7), pp 3194-3199
<Abstract>

Rearranging hydrogen bonding groups adds nucleobases to an artificially expanded genetic information system (AEGIS), pairing orthogonally to standard nucleotides. We report here a large-scale synthesis of the AEGIS nucleotide carrying 2-amino-3-nitropyridin-6-one (trivially Z) via Heck coupling and a hydroboration/oxidation sequence. RiboZ is more stable against epimerization than its 2'-deoxyribo analogue. Further, T7 RNA polymerase incorporates ZTP opposite its Watson-Crick complement, imidazo[1,2-a]-1,3,5-triazin-4(8H)-one (trivially P), laying grounds for using this "second-generation" AEGIS Z:P pair to add amino acids encoded by mRNA.

Hominids adapted to metabolize ethanol long before human-directed fermentation
Matthew A. Carrigan, Oleg Uryasev, Carole B. Frye, Blair L. Eckman, Candace R. Myers, Thomas D. Hurley, and Steven A. Benner
Proc. Natl. Acad. Sci. USA 112 (2) 458-463 (2014) doi: 10.1073/pnas.1404167111
<Abstract>

Paleogenetics is an emerging field that resurrects ancestral proteins from now-extinct organisms to test, in the laboratory, models of protein function based on natural history and Darwinian evolution. Here, we resurrect digestive alcohol dehydrogenases (ADH4) from our primate ancestors to explore the history of primate–ethanol interactions. The evolving catalytic properties of these resurrected enzymes show that our ape ancestors gained a digestive dehydrogenase enzyme capable of metabolizing ethanol near the time that they began using the forest floor, about 10 million y ago. The ADH4 enzyme in our more ancient and arboreal ancestors did not efficiently oxidize ethanol. This change suggests that exposure to dietary sources of ethanol increased in hominids during the early stages of our adaptation to a terrestrial lifestyle. Because fruit collected from the forest floor is expected to contain higher concentrations of fermenting yeast and ethanol than similar fruits hanging on trees, this transition may also be the first time our ancestors were exposed to (and adapted to) substantial amounts of dietary ethanol.

MaGnET: Malaria Genome Exploration Tool
Sharman JL and Gerloff DL
Bioinformatics (2013) 2350-2352
<Abstract>

Summary: The Malaria Genome Exploration Tool (MaGnET) is a software tool enabling intuitive 'exploration-style' visualization of functional genomics data relating to the malaria parasite, Plasmodium falciparum. MaGnET provides innovative integrated graphic displays for different datasets, including genomic location of genes, mRNA expression data, protein–protein interactions and more. Any selection of genes to explore made by the user is easily carried over between the different viewers for different datasets, and can be changed interactively at any point (without returning to a search).

Availability and Implementation: Free online use (Java Web Start) or download (Java application archive and MySQL database; requires local MySQL installation) at http://malariagenomeexplorer.org
Contact: joanna.sharman@ed.ac.uk or dgerloff@ffame.org

Directed Evolution of Polymerases To Accept Nucleotides with Nonstandard Hydrogen Bond Patterns
Laos R, Shaw R, Leal NA, Gaucher E, Benner S.
Biochemistry (2013) 52, 5288-5294
<Abstract>

Artificial genetic systems have been developed by synthetic biologists over the past two decades to include additional nucleotides that form additional nucleobase pairs independent of the standard T:A and C:G pairs. Their use in various tools to detect and analyze DNA and RNA requires polymerases that synthesize duplex DNA containing unnatural base pairs. This is especially true for nested polymerase chain reaction (PCR), which has been shown to dramatically lower noise in multiplexed nested PCR if nonstandard nucleotides are used in their external primers. We report here the results of a directed evolution experiment seeking variants of Taq DNA polymerase that can support the nested PCR amplification with external primers containing two particular nonstandard nucleotides, 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)imidazo[1,2-a]-1,3,5-triazin-4(8H)-one (trivially called P) that pairs with 6-amino-5-nitro-3-(1'-β-D-2'-deoxyribofuranosyl)-2(1H)-pyridone (trivially called Z). Variants emerging from the directed evolution experiments were shown to pause less when challenged in vitro to incorporate dZTP opposite P in a template. Interestingly, several sites involved in the adaptation of Taq polymerases in the laboratory were also found to have displayed "heterotachy" (different rates of change) in their natural history, suggesting that these sites were involved in an adaptive change in natural polymerase evolution. Also remarkably, the polymerases evolved to be less able to incorporate dPTP opposite Z in the template, something that was not selected. In addition to being useful in certain assay architectures, this result underscores the general rule in directed evolution that "you get what you select for".

Conversion strategy using an expanded genetic alphabet to assay nucleic acids
Yang, Z., Durante, M., Glushakova, L., Sharma, N., Leal, N., Bradley, K., Chen, F., Benner, S. A.
Anal. Chem. (2013) 85(9):4705-12
<Abstract>

Methods to detect DNA and RNA (collectively xNA) are easily plagued by noise, false positives, and false negatives, especially with increasing levels of multiplexing in complex assay mixtures. Here, we describe assay architectures that mitigate these problems by converting standard xNA analyte sequences into sequences that incorporate nonstandard nucleotides (Z and P). Z and P are extra DNA building blocks that form tight nonstandard base pairs without cross-binding to natural oligonucleotides containing G, A, C, and T (GACT). The resulting improvements are assessed in an assay that inverts the standard Luminex xTAG architecture, placing a biotin on a primer (rather than on a triphosphate). This primer is extended on the target to create a standard GACT extension product that is captured by a CTGA oligonucleotide attached to a Luminex bead. By using conversion, a polymerase incorporates dZTP opposite template dG in the absence of dCTP. This creates a Z-containing extension product that is captured by a bead-bound oligonucleotide containing P, which binds selectively to Z. The assay with conversion produces higher signals than the assay without conversion, possibly because the Z/P pair is stronger than the C/G pair. These architectures improve the ability of the Luminex instruments to detect xNA analytes, producing higher signals without the possibility of competition from any natural oligonucleotides, even in complex biological samples.

Use of codon models in molecular dating and functional analysis
Steven A. Benner
Codon Evolution , ed. Gina M Cannarozzi, Adrian Schneider , Oxford University Press 133-144 (2012)

Synthesis and Properties of 5-Cyano-Substituted Nucleoside Analog with a Donor-Donor-Acceptor Hydrogen-Bonding Pattern
Hyo-Joong Kim, Fei Chen, and Steven A. Benner
J. Org. Chem. (2012)
<Abstract>

6-Aminopyridin-2-ones form Watson-Crick pairs with complementary purine analogues to add a third nucleobase pair to DNA and RNA, if an electron-withdrawing group at position 5 slows oxidation and epimerization. In previous work with a nucleoside analogue trivially named dZ, the electron withdrawing unit was a nitro group. Here, we describe an analogue of dZ (cyano-dZ) having a cyano group instead of a nitro group, including its synthesis, pKa, rates of acid-catalyzed epimerization, and enzymatic incorporation.

The Natural History of Class I Primate Alcohol Dehydrogenases Includes Gene Duplication, Gene Loss, and Gene Conversion
Matthew A. Carrigan, Oleg Uryasev, Ross P. Davis, LanMin Zhai, Thomas D. Hurley, Steven A. Benner
PLOS One 7 (7) , Public Library of Science (2012)

Recognition of an expanded genetic alphabet by type-II restriction endonucleases and their application to analyze polymerase fidelity.
Chen, F; Yang, ZY; Yan, M; Alvarado, JB; Wang, G; Benner, SA
Nucl. Acids Res. 39 (9) 3949-3961 (2011)
<Abstract>

To explore the possibility of using restriction enzymes in a synthetic biology based on artificially expanded genetic information systems (AEGIS), 24 type-II restriction endonucleases (REases) were challenged to digest DNA duplexes containing recognition sites where individual Cs and Gs were replaced by the AEGIS nucleotides Z and P [respectively, 6-amino-5-nitro-3-(1'-β-D-2'-deoxyribofuranosyl)-2(1H)-pyridone and 2-amino-8-(1'-β-D-2'-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one]. These AEGIS nucleotides implement complementary hydrogen bond donor-donor-acceptor and acceptor-acceptor-donor patterns. Results allowed us to classify type-II REases into five groups based on their performance, and to infer some specifics of their interactions with functional groups in the major and minor grooves of the target DNA. For three enzymes among these 24 where crystal structures are available (BcnI, EcoO109I and NotI), these interactions were modeled. Further, we applied a type-II REase to quantitate the fidelity of polymerases challenged to maintain in a DNA duplex C:G, T:A and Z:P pairs through repetitive PCR cycles. This work thus adds tools that are able to manipulate this expanded genetic alphabet in vitro, provides some structural insights into the working of restriction enzymes, and offers some preliminary data needed to take the next step in synthetic biology to use an artificial genetic system inside of living bacterial cells.

Synthetic Biology, Tinkering Biology, and Artificial Biology: A Perspective from Chemistry
Benner, SA; Chen, F; Yang, ZY
Chemical Synthetic Biology , ed. Pier Luigi Luisi and Cristiano Chiarabelli , Wiley 69-106 (2011)

Setting the Stage: The History, Chemistry, and Geobiology behind RNA
Benner, SA; Kim, HJ; Yang, ZY
RNA Worlds: From Life's Origins to Diversity in Gene Regulation , ed. John F. Atkins, Raymond F. Gesteland, Thomas R. Cech , Cold Spring Harbor Laboratory Press 7-19 (2011)

Amplification, Mutation, and Sequencing of a Six-Letter Synthetic Genetic System
Yang, Z; Chen, F; Alvarado, JB; Benner, SA
J. Am. Chem. Soc. 133 (38) 15105-15112 (2011) dx.doi.org/10.1021/ja204910n
<Abstract>

The next goals in the development of a synthetic biology that uses artificial genetic systems will require chemistry-biology combinations that allow the amplification of DNA containing any number of sequential and nonsequential nonstandard nucleotides. This amplification must ensure that the nonstandard nucleotides are not unidirectionally lost during PCR amplification (unidirectional loss would cause the artificial system to revert to an all-natural genetic system). Further, technology is needed to sequence artificial genetic DNA molecules. The work reported here meets all three of these goals for a six-letter artificially expanded genetic information system (AEGIS) that comprises four standard nucleotides (G, A, C, and T) and two additional nonstandard nucleotides (Z and P). We report polymerases and PCR conditions that amplify a wide range of GACTZP DNA sequences having multiple consecutive unnatural synthetic genetic components with low (0.2% per theoretical cycle) levels of mutation. We demonstrate that residual mutation processes both introduce and remove unnatural nucleotides, allowing the artificial genetic system to evolve as such, rather than revert to a wholly natural system. We then show that mechanisms for these residual mutation processes can be exploited in a strategy to sequence "six-letter" GACTZP DNA. These are all not yet reported for any other synthetic genetic system.

Labeled nucleoside triphosphates with reversibly terminating aminoalkoxyl groups
Hutter, D; Kim, MJ; Karalkar, N; Leal, NA; Chen, F; Guggenheim, E; Visalakshi, V; Olejnik, J; Gordon, S; Benner, SA
Nucleosides Nucleotides Nucleic Acids 29 (11) , Taylor & Francis Group 879-895 (2010)
<Abstract>

Nucleoside triphosphates having a 3'-ONH2 blocking group have been prepared with and without fluorescent tags on their nucleobases. DNA polymerases were identified that accepted these, adding a single nucleotide to the 3'-end of a primer in a template-directed extension reaction that then stops. Nitrite chemistry was developed to cleave the 3'-ONH2 group under mild conditions to allow continued primer extension. Extension-cleavage-extension cycles in solution were demonstrated with untagged nucleotides and mixtures of tagged and untagged nucleotides. Multiple extension-cleavage-extension cycles were demonstrated on an Intelligent Bio-Systems Sequencer, showing the potential of the 3'-ONH2 blocking group in "next generation sequencing."

Expanded Genetic Alphabets in the Polymerase Chain Reaction
Yang, ZY; Chen, F; Chamberlin, SG; Benner, SA
Angew. Chem. Int. Ed. 49 (1) 177-180 (2010)

Artificial Genetic Systems: Self-Avoiding DNA in PCR and Multiplexed PCR
Hoshika, S; Chen, F; Leal, NA; Benner, SA
Angew. Chem. Int. Ed. 49 (32) 5554-5557 (2010)

Reconstructed evolutionary adaptive paths give polymerases accepting reversible terminators for sequencing and SNP detection
Chen, F; Gaucher, EA; Leal, NA; Hutter, D; Havemann, SA; Govindarajan, S; Ortlund, EA; Benner, SA
Proc. Natl. Acad. Sci. USA 107 (5) 1948-1953 (2010)

Chemistry, Life, and the Search for Aliens
Benner, SA
Proc. SPIE 7819 (10) 1-12 (2010)
<Abstract>

While "life" may universally be a self-sustaining chemical system capable of Darwinian evolution, alien life may be quite different in its chemistry from the terran life that we know here on Earth. In this case, it will be difficult to recognize, especially if it has not advanced beyond the single cell life forms that have dominated much of the terran biosphere. This review summarizes what we might infer from general physical and chemical law about how such "weird" life might be structured, what solvents other than water it might inhabit, what genetic molecules it might contain, and what metabolism it might exploit.

Interview with Steven Benner
Impey, C; Benner, SA
Talking about Life: Conversations on Astrobiology , ed. Chris Impey , Cambridge University Press 58-68 (2010)

Q&A: Life, synthetic biology and risk
Benner, SA
BMC Biology 8 (77) (2010) doi:10.1186/1741-7007-8-77

2'-Deoxy-1-methylpseudocytidine, a stable analog of 2'-deoxy-5-methylisocytidine
Kim, HJ; Leal, NA; Benner, SA
Bioorg. Med. Chem. 17 (10) 3728-3732 (2009)
<Abstract>

2'-Deoxy-5-methylisocytidine is widely used in assays to personalize the care of patients infected with HIV, hepatitis C, and other infectious agents. However, oligonucleotides that incorporate 2'-deoxy-5-methylisocytidine are expensive, because of its intrinsic chemical instability. We report here a C-glycoside analog that is more stable and, in oligonucleotides, pairs with 2'-deoxyisoguanosine, contributing to duplex stability about as much as a standard 2'-deoxycytidine and 2'-deoxyguanosine pair.

A Convenient Synthesis of N,N'-dibenzyl-2,4-diaminopyrimidine-2'-deoxyribonucleoside and 1-Methyl-2'-Deoxypseudoisocytidine
Wellington, KW; Ooi, HC; Benner, SA
Nucleosides Nucleotides Nucleic Acids 28 (4) , Taylor & Francis Group 275-291 (2009)
<Abstract>

The syntheses of N,N'-dibenzyl-2,4-diaminopyrimidine-2'-deoxyribonucleoside and 1-methyl-2'-deoxypseudoisocytidine via Heck coupling are described. A survey of the attempts to use the Heck coupling to synthesize N,N'-dibenzyl-2,4-diaminopyrimidine-2'-deoxyribonucleoside is provided, indicating a remarkable diversity in outcome depending on the specific heterocyclic partner used.

Signatures of a Shadow Biosphere
Davies, PCW; Benner, SA; Cleland, CE; Lineweaver, CH; McKay, CP; Wolfe-Simon, F
Astrobiology 9 (2) 241-249 (2009)
<Abstract>

Astrobiologists are aware that extraterrestrial life might differ from known life, and considerable thought has been given to possible signatures associated with weird forms of life on other planets. So far, however, very little attention has been paid to the possibility that our own planet might also host communities of weird life. If life arises readily in Earth-like conditions, as many astrobiologists contend, then it may well have formed many times on Earth itself, which raises the question whether one or more shadow biospheres have existed in the past or still exist today. In this paper, we discuss possible signatures of weird life and outline some simple strategies for seeking evidence of a shadow biosphere.

The challenges of sequencing by synthesis
Fuller, CW; Middendorf, LR; Benner, SA; Church, GM; Harris, T; Huang, XH; Jovanovich, SB; Nelson, JR; Schloss, JA; Schwartz, DC; Vezenov, DV
Nat. Biotechnol. 27 (11) 1013-1023 (2009)
<Abstract>

DNA sequencing-by-synthesis (SBS) technology, using a polymerase or ligase enzyme as its core biochemistry, has already been incorporated in several second-generation DNA sequencing systems with significant performance. Notwithstanding the substantial success of these SBS platforms, challenges continue to limit the ability to reduce the cost of sequencing a human genome to $100,000 or less. Achieving dramatically reduced cost with enhanced throughput and quality will require the seamless integration of scientific and technological effort across disciplines within biochemistry, chemistry, physics and engineering. The challenges include sample preparation, surface chemistry, fluorescent labels, optimizing the enzyme-substrate system, optics, instrumentation, understanding tradeoffs of throughput versus accuracy, and read-length/phasing limitations. By framing these challenges in a manner accessible to a broad community of scientists and engineers, we hope to solicit input from the broader research community on means of accelerating the advancement of genome sequencing technology.

Design of a novel molecular beacon: modification of the stem with artificially genetic alphabet
Sheng, PP; Yang, ZY; Kim, YM; Wu, YR; Tan, WH; Benner, SA
Chem. Comm. (41) 5128-5130 (2008)
<Abstract>

A molecular beacon that incorporates components of an artificially expanded genetic information system (AEGIS) in its stem is shown not to be opened by unwanted stem invasion by adventitious standard DNA; this should improve the "darkness" of the beacon in real-world applications.

The planetary biology of ascorbate and uric acid and their relationship with the epidemic of obesity and cardiovascular disease
Johnson, RJ; Gaucher, EA; Sautin, YY; Henderson, GN; Angerhofer, AJ; Benner, SA
Medical Hypotheses 71 (1) 22-31 (2008)
<Abstract>

Humans have relatively low plasma ascorbate levels and high serum uric acid levels compared to most mammals due to the presence of genetic mutations in L-gulonolactone oxidase and uricase, respectively. We review the major hypotheses for why these mutations may have occurred. In particular, we suggest that both mutations may have provided a survival advantage to early primates by helping maintain blood pressure during periods of dietary change and environmental stress. We further propose that these mutations have the inadvertent disadvantage of increasing our risk for hypertension and cardiovascular disease in today's society characterized by Western diet and increasing physical inactivity. Finally, we suggest that a "planetary biology" approach in which genetic changes are analyzed in relation to their biological action and historical context may provide the ideal approach towards understanding the biology of the past, present and future.

Self-Avoiding Molecular Recognition Systems (SAMRS)
Hoshika, S; Chen, F; Leal, NA; Benner, SA
Nucleic Acids Symp. Ser. 52 (1) 129-130 (2008)

Incorporation of Multiple Sequential Pseudothymidines by DNA Polymerases and Their Impact on DNA Duplex Structure
Havemann, SA; Hoshika, S; Hutter, D; Benner, SA
Nucleosides Nucleotides Nucleic Acids 27 (3) , Taylor & Francis Group 261-278 (2008)
<Abstract>

In this article, we focus on the synthesis of aryl C-glycosides via Heck coupling. It is organized based on the type of structures used in the assembly of the C-glycosides (also called C-nucleosides) with the following subsections: pyrimidine C-nucleosides, purine C-nucleosides, and monocyclic, bicyclic, and tetracyclic C-nucleosides. The reagents and conditions used for conducting the Heck coupling reactions are discussed. The subsequent conversion of the Heck products to the corresponding target molecules and the application of the target molecules are also described.

Synthesis of pyrophosphates for in vitro selection of catalytic RNA molecules
Kim, HJ; Kim, MJ; Karalkar, N; Hutter, D; Benner, SA
Nucleosides Nucleotides Nucleic Acids 27 , Taylor & Francis Group 43-56 (2008)

Synthetic Biology for Improved Personalized Medicine
Benner, SA; Hoshika, S; Sukeda, M; Hutter, D; Leal, NA; Yang, ZY; Chen, F
Nucleic Acids Symp. Ser. 52 (1) 243-244 (2008) doi: 10.1093/nass/nrn123
<Abstract>

Tools to re-sequence the genomes of individual patients having well described medical histories are the first step required to connect genetic information to diagnosis, prognosis, and treatment. There is little doubt that in the future, genomics will influence the choice of therapies for individual patients based on their specific genetic inheritance, as well as the genetic defects that led to disease. Cost is the principal obstacle preventing the realization of this vision. Unless the interesting parts of a patient genome can be resequenced for less than $10,000 (as opposed to $100,000 or more), it will be difficult to start the discovery process that will enable this vision. While instrumentation and biology are important to reducing costs, the key element to cost-effective personalized genomic sequencing will be new chemical reagents that deliver capabilities that are not available from standard DNA. Scientists at the Foundation for Applied Molecular Evolution and the Westheimer Institute have developed several of these, which will be the topic of this talk.

Computational reconstruction of ancestral genomic regions from evolutionarily conserved gene clusters
Danchin, EGJ; Gaucher, EA; Pontarotti, P
Ancestral Sequence Reconstruction , ed. David A. Liberles , Oxford University Press 139-150 (2007)

Experimental resurrection of ancient biomolecules: gene synthesis, heterologous protein expression, and functional assays
Gaucher, EA
Ancestral Sequence Reconstruction , ed. David A. Liberles , Oxford University Press 153-163 (2007)

Ancestral sequence reconstruction as a tool to understand natural history and guide synthetic biology: realizing and extending the vision of Zuckerkandl and Pauling
Gaucher, EA
Ancestral Sequence Reconstruction , ed. David A. Liberles , Oxford University Press 20-33 (2007)

A thermophilic last universal ancestor inferred from its estimated amino acid composition
Brooks, DJ; Gaucher, EA
Ancestral Sequence Reconstruction , ed. David A. Liberles , Oxford University Press 200-207 (2007)

The resurrection of ribonucleases from mammals: from ecology to medicine
Sassi, SO; Benner, SA
Ancestral Sequence Reconstruction , ed. David A. Liberles , Oxford University Press 208-224 (2007)

The early days of paleogenetics: connecting molecules to the planet
Benner, SA
Ancestral Sequence Reconstruction , ed. David A. Liberles , Oxford University Press 3-19 (2007)

Leishmania promastigotes activate PI3K/Akt signalling to confer host cell resistance to apoptosis
Ruhland, A; Leal, N; Kima, PE
Cell Microbiol. 9 (1) 84-96 (2007)
<Abstract>

Previous reports have shown that cells infected with promastigotes of some Leishmania species are resistant to the induction of apoptosis. This would suggest that either parasites elaborate factors that block signalling from apoptosis inducers or that parasites engage endogenous host signalling pathways that block apoptosis. To investigate the latter scenario, we determined whether Leishmania infection results in the activation of signalling pathways that have been shown to mediate resistance to apoptosis in other infection models. First, we showed that infection with the promastigote form of Leishmania major, Leishmania pifanoi and Leishmania amazonensis activates signalling through p38 mitogen-activated protein kinase (MAPK), NF kappa B and PI3K/Akt. Then we found that inhibition of signalling through the PI3K/Akt pathway with LY294002 and Akt IV inhibitor reversed resistance of infected bone marrow-derived macrophages and RAW 264.7 macrophages to potent inducers of apoptosis. Moreover, reduction of Akt levels with small interfering RNAs to Akt resulted in the inability of infected macrophages to resist apoptosis. Further evidence of the role of PI3K/Akt signalling in the promotion of cell survival by infected cells was obtained with the finding that Bad, which is a substrate of Akt, becomes phosphorylated during the course of infection. In contrast to the observations with PI3K/Akt signalling, inhibition of p38 MAPK signalling with SB202190 or NF kappa B signalling with wedelolactone had limited effect on parasite-induced resistance to apoptosis. We conclude that Leishmania promastigotes engage PI3K/Akt signalling, which confers to the infected cell, the capacity to resist death from activators of apoptosis.

PduL is an evolutionarily distinct phosphotransacylase involved in B12-dependent 1,2-propanediol degradation by Salmonella enterica serovar Typhimurium LT2
Liu, Y; Leal, NA; Sampson, EM; Johnson, CLV; Havemann, GD; Bobik, TA
J. Bacteriol. 189 (5) 1589-1596 (2007)
<Abstract>

Salmonella enterica degrades 1,2-propanediol (1,2-PD) in a coenzyme B12-dependent manner. Previous enzymatic assays of crude cell extracts indicated that a phosphotransacylase (PTAC) was needed for this process, but the enzyme involved was not identified. Here, we show that the pduL gene encodes an evolutionarily distinct PTAC used for 1,2-PD degradation. Growth tests showed that pduL mutants were unable to ferment 1,2-PD and were also impaired for aerobic growth on this compound. Enzyme assays showed that cell extracts from a pduL mutant lacked measurable PTAC activity in a background that also carried a pta mutation (the pta gene was previously shown to encode a PTAC enzyme). Ectopic expression of pduL corrected the growth defects of a pta mutant. PduL fused to eight C-terminal histidine residues (PduL-His8) was purified, and its kinetic constants were determined: the Vmax was 51.7 ± 7.6 μmol min^-1 mg^-1, and the Km values for propionyl-PO4^2- and acetyl-PO4^2- were 0.61 and 0.97 mM, respectively. Sequence analyses showed that PduL is unrelated in amino acid sequence to known PTAC enzymes and that PduL homologues are distributed among at least 49 bacterial species but are absent from the Archaea and Eukarya.

In vivo expression of human ATP:cob(I)alamin adenosyltransferase (ATR) using recombinant adeno-associated virus (rAAV) serotypes 2 and 8
Erger, KE; Conlon, TJ; Leal, NA; Zori, R; Bobik, TA; Flotte, TR
J. Gene Med. 9 (6) 462-469 (2007)
<Abstract>

Background: Methylmalonic aciduria (MMA) is an autosomal recessive disease with symptoms that include ketoacidosis, lethargy, recurrent vomiting, dehydration, respiratory distress, muscular hypotonia and death due to methylmalonic acid levels that are up to 1000-fold greater than normal. CblB MMA, a subset of the mutations leading to MMA, is caused by a deficiency in the enzyme cob(I)alamin adenosyltransferase (ATR). No animal model currently exists for this disease. ATR functions within the mitochondrial matrix in the final conversion of cobalamin into coenzyme B12, adenosylcobalamin (AdoCbl). AdoCbl is a required coenzyme for the mitochondrial enzyme methylmalonyl-CoA mutase (MCM).
Methods: The human ATR cDNA was cloned into a recombinant adeno-associated virus (rAAV) vector and packaged into AAV 2 or 8 capsids and delivered by portal vein injection to C57/Bl6 mice at a dose of 1 x 10^10 and 1 x 10^11 particles. Eight weeks post-injection, RNA, genomic DNA and protein were extracted and analyzed.
Results: Using primer pairs specific to the cytomegalovirus (CMV) enhancer/chicken β-actin (CBAT) promoter within the rAAV vectors, genome copy numbers were found to be 0.03, 2.03 and 0.10 per cell in liver for the rAAV8 low dose, rAAV8 high dose and rAAV2 high dose, respectively. Western blotting performed on mitochondrial protein extracts demonstrated protein levels were comparable to control levels in the rAAV8 low dose and rAAV2 high dose animals and 3- to 5-fold higher than control levels were observed in high dose animals. Immunostaining demonstrated enhanced transduction efficiency of hepatocytes to over 40% in the rAAV8 high dose animals, compared to 9% and 5% transduction in rAAV2 high dose and rAAV8 low dose animals, respectively.
Conclusions: These data demonstrate the feasibility of efficient ATR gene transfer to the liver as a prelude to future gene therapy experiments.

The evolution of seminal ribonuclease: Pseudogene reactivation or multiple gene inactivation events?
Sassi, SO; Braun, EL; Benner, SA
Mol. Biol. Evol. 24 (4) 1012-1024 (2007)
<Abstract>

Two approaches, one novel, are applied to analyze the divergent evolution of ruminant seminal ribonucleases (RNases), paralogs of the well-known pancreatic RNases of mammals. Here, the goal was to identify periods of divergence of seminal RNase under functional constraints, periods of divergence as a pseudogene, and periods of divergence driven by positive selection pressures. The classical approach involves the analysis of nonsynonymous to synonymous replacements ratios (omega) for the branches of the seminal RNase evolutionary tree. The novel approach coupled these analyses with the mapping of substitutions on the folded structure of the protein. These analyses suggest that seminal RNase diverged during much of its history after divergence from pancreatic RNase as a functioning protein, followed by homoplastic inactivations to create pseudogenes in multiple ruminant lineages. Further, they are consistent with adaptive evolution only in the most recent episode leading to the gene in modern oxen. These conclusions contrast sharply with the view, cited widely in the literature, that seminal RNase decayed after its formation by gene duplication into an inactive pseudogene, whose lesions were repaired in a reactivation event. Further, the 2 approaches, omega estimation and mapping of replacements on the protein structure, were compared by examining their utility for establishing the functional status of the seminal RNase genes in 2 deer species. Hog and roe deer share common lesions, which strongly suggests that the gene was inactive in their last common ancestor. In this specific example, the crystallographic approach made the correct implication more strongly than the omega approach. Studies of this type should contribute to an integrated framework of tools to assign functional and nonfunctional episodes to recently created gene duplicates and to understand more broadly how gene duplication leads to the emergence of proteins with novel functions.
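The omega ratio mentioned in the abstract above can be illustrated with a very small calculation. The following is only a rough sketch, not the method used in the paper: it assumes that substitution and site counts for a branch are already available, applies no correction for multiple substitutions at the same site, and uses a hypothetical tolerance to label the result. Real analyses estimate omega with codon-substitution likelihood models. All function names, inputs and thresholds are illustrative assumptions.

```python
def omega(nonsyn_subs: float, nonsyn_sites: float,
          syn_subs: float, syn_sites: float) -> float:
    """Crude omega = dN/dS from raw counts (no multiple-hit correction)."""
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    if syn_subs <= 0:
        raise ValueError("omega is undefined without synonymous substitutions")
    dn = nonsyn_subs / nonsyn_sites  # nonsynonymous substitutions per site
    ds = syn_subs / syn_sites        # synonymous substitutions per site
    return dn / ds


def interpret(value: float, tolerance: float = 0.2) -> str:
    """Label a branch; `tolerance` is a hypothetical cutoff, not from the paper."""
    if value > 1.0 + tolerance:
        return "consistent with positive selection"
    if value < 1.0 - tolerance:
        return "consistent with functional constraint"
    return "consistent with neutral drift (e.g. a pseudogene)"


if __name__ == "__main__":
    # Made-up counts for a single branch, for illustration only.
    w = omega(nonsyn_subs=5, nonsyn_sites=100, syn_subs=20, syn_sites=100)
    print(f"omega = {w:.2f}: {interpret(w)}")
```

A ratio well below 1 suggests that amino acid replacements were being removed by selection, a ratio near 1 is what a pseudogene would be expected to show, and a ratio well above 1 suggests adaptive change.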

Nucleoside alpha-thiotriphosphates, polymerases and the exonuclease III analysis of oligonucleotides containing phosphorothioate linkages
Yang, ZY; Sismour, AM; Benner, SA
Nucl. Acids Res. 35 (9) 3118-3127 (2007)
<Abstract>

The use of DNA polymerases to incorporate phosphorothioate linkages into DNA, and the use of exonuclease III to determine where those linkages have been incorporated, are re-examined in this work. The results presented here show that exonuclease III degrades single-stranded DNA as a substrate and digests through phosphorothioate linkages having one absolute stereochemistry, assigned (assuming inversion in the polymerase reaction) as S, but not the other absolute stereochemistry. This contrasts with a general view in the literature that exonuclease III favors double-stranded nucleic acid as a substrate and stops completely at phosphorothioate linkages. Furthermore, not all DNA polymerases appear to accept exclusively the (R) stereoisomer of nucleoside alpha-thiotriphosphates [and not the (S) diastereomer], a conclusion inferred two decades ago by examination of five Family-A polymerases and a reverse transcriptase. This suggests that caution is appropriate when extrapolating the detailed behavior of one polymerase from the behaviors of other polymerases. Furthermore, these results provide constraints on how exonuclease III-thiotriphosphate-polymerase combinations can be used to analyze the behavior of the components of a synthetic biology.

Enzymatic incorporation of a third nucleobase pair
Yang, ZY; Sismour, AM; Sheng, PP; Puskar, NL; Benner, SA
Nucl. Acids Res. 35 (13) 4238-4249 (2007)
<Abstract>

DNA polymerases are identified that copy a nonstandard nucleotide pair joined by a hydrogen bonding pattern different from the patterns joining the dA:T and dG:dC pairs. 6-Amino-5-nitro-3-(1'-β-D-2'-deoxyribofuranosyl)-2(1H)-pyridone (dZ) implements the non-standard 'small' donor-donor-acceptor (pyDDA) hydrogen bonding pattern. 2-Amino-8-(1'-β-D-2'-deoxyribofuranosyl)imidazo[1,2-a]-1,3,5-triazin-4(8H)-one (dP) implements the 'large' acceptor-acceptor-donor (puAAD) pattern. These nucleobases were designed to present electron density to the minor groove, density hypothesized to help determine specificity for polymerases. Consistent with this hypothesis, both dZTP and dPTP are accepted by many polymerases from both Families A and B. Further, the dZ:dP pair participates in PCR reactions catalyzed by Taq, Vent (exo-) and Deep Vent (exo-) polymerases, with 94.4%, 97.5% and 97.5%, respectively, retention per round. The dZ:dP pair appears to be lost principally via transition to a dC:dG pair. This is consistent with a mechanistic hypothesis that deprotonated dZ (presenting a pyDAA pattern) complements dG (presenting a puADD pattern), while protonated dC (presenting a pyDDA pattern) complements dP (presenting a puAAD pattern). This hypothesis, grounded in the Watson-Crick model for nucleobase pairing, was confirmed by studies of the pH-dependence of mismatching. The dZ:dP pair and these polymerases should be useful in dynamic architectures for sequencing, molecular-, systems- and synthetic-biology.
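To give a feel for what the per-round retention figures quoted above imply over many PCR rounds, the short sketch below compounds them. It assumes, as a simplification that the abstract does not state, that the same fraction of dZ:dP pairs is retained independently in every round; the function name and the choice of round counts are illustrative only.

```python
# Per-round retention values as quoted in the abstract above.
RETENTION_PER_ROUND = {
    "Taq": 0.944,
    "Vent (exo-)": 0.975,
    "Deep Vent (exo-)": 0.975,
}


def fraction_retained(per_round: float, rounds: int) -> float:
    """Fraction of dZ:dP pairs left after `rounds` rounds.

    Assumes constant, independent loss per round (a simplification).
    """
    if not 0.0 <= per_round <= 1.0:
        raise ValueError("per_round must lie between 0 and 1")
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    return per_round ** rounds


if __name__ == "__main__":
    for name, per_round in RETENTION_PER_ROUND.items():
        for rounds in (10, 20, 30):  # illustrative round counts
            kept = fraction_retained(per_round, rounds)
            print(f"{name:17s} {rounds:2d} rounds: {kept:.1%} retained")
```

Under this assumption, a difference of a few percentage points per round becomes a large difference in how much of the nonstandard pair survives a full amplification.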

The origin of proteins and nucleic acids
Ricardo, A; Benner, SA
Planets and Life: The Emerging Science of Astrobiology , ed. Woodruff T. Sullivan and John A. Baross , Cambridge University Press 154-173 (2007)

Alien biochemistries
Ward, PD; Benner, SA
Planets and Life: The Emerging Science of Astrobiology , ed. Woodruff T. Sullivan and John A. Baross , Cambridge University Press 537-544 (2007)

Inferred thermophily of the Last Universal Ancestor based on estimated amino acid composition
Brooks, DJ; Gaucher, EA
Ancestral Sequence Reconstruction , ed. David A. Liberles , Oxford University Press 200-207 (2007)
<Abstract>

The environmental temperature of the last universal ancestor (LUA) of all extant organisms is the subject of heated debate. Because the amino acid composition of proteins differs between mesophiles and thermophiles, the inferred amino acid composition of proteins in the LUA could be used to classify it as one or the other. We applied expectation maximization (EM) to estimate the amino acid composition of a set of thirty-one proteins in the LUA based on alignments of their modern day descendants, a phylogenetic tree relating those descendants and a model of evolution. Separate estimates of amino acid composition in LUA proteins were derived using modern day sequences of eight mesophilic species, eight thermophilic species or the sixteen species combined. We show that the relative mean Euclidean distance between the amino acid composition in one species and that of a set of mesophiles or thermophiles can be employed as a classifier with 100% accuracy. Applying this classifier to the estimated amino acid composition of the ancestral protein set in the LUA, we find it to be classified as a thermophile even when only the proteins of mesophilic species are used to derive the estimate. Based on the estimated amino acid composition of proteins in the LUA, we infer that it was a thermophile. We discuss our findings in the context of previous data pertaining to the optimal growth temperature (OGT) of the LUA, particularly the inferred G + C content of its rRNA. We conclude that the gathering evidence strongly supports a thermophilic LUA.
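The distance-based classifier described in the abstract above can be sketched roughly as follows. This is one plausible reading, not the authors' implementation: it compares the mean Euclidean distance from a query composition to a set of mesophile compositions with the mean distance to a set of thermophile compositions, and assigns the class with the smaller mean. The exact definition of "relative" distance used in the paper is not given here, and the three-component vectors below are made-up toy data standing in for 20-component amino acid compositions.

```python
import math
from typing import Sequence

Composition = Sequence[float]


def euclidean(a: Composition, b: Composition) -> float:
    """Euclidean distance between two composition vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("compositions must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_distance(query: Composition, group: Sequence[Composition]) -> float:
    """Mean Euclidean distance from `query` to every member of `group`."""
    if not group:
        raise ValueError("group must not be empty")
    return sum(euclidean(query, member) for member in group) / len(group)


def classify(query: Composition,
             mesophiles: Sequence[Composition],
             thermophiles: Sequence[Composition]) -> str:
    """Assign the class whose reference set is closer on average (assumed rule)."""
    d_meso = mean_distance(query, mesophiles)
    d_thermo = mean_distance(query, thermophiles)
    return "thermophile" if d_thermo < d_meso else "mesophile"


# Hypothetical toy compositions (fractions summing to 1), for illustration only.
TOY_MESOPHILES = [(0.50, 0.30, 0.20), (0.48, 0.32, 0.20)]
TOY_THERMOPHILES = [(0.30, 0.30, 0.40), (0.32, 0.28, 0.40)]

if __name__ == "__main__":
    ancestor = (0.31, 0.30, 0.39)  # made-up "estimated ancestral" composition
    print(classify(ancestor, TOY_MESOPHILES, TOY_THERMOPHILES))
```

In the study itself the query would be the EM-estimated composition of the ancestral protein set, and the reference sets would be the compositions of the eight mesophilic and eight thermophilic species.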

Molecular Paleoscience: Systems Biology from the Past
Benner, SA; Sassi, SO; Gaucher, EA
Adv. Enzymol. Relat. Areas Mol. Biol. 75 (2006)

Integrating protein structures and precomputed genealogies in the Magnum database: Examples with cellular retinoid binding proteins
Bradley, ME; Benner, SA
BMC Bioinformatics 7 89 (2006)
<Abstract>

Background: When accurate models for the divergent evolution of protein sequences are integrated with complementary biological information, such as folded protein structures, analyses of the combined data often lead to new hypotheses about molecular physiology. This represents an excellent example of how bioinformatics can be used to guide experimental research. However, progress in this direction has been slowed by the lack of a publicly available resource suitable for general use.
Results: The precomputed Magnum database offers a solution to this problem for ca. 1,800 full-length protein families with at least one crystal structure. The Magnum deliverables include 1) multiple sequence alignments, 2) mapping of alignment sites to crystal structure sites, 3) phylogenetic trees, 4) inferred ancestral sequences at internal tree nodes, and 5) amino acid replacements along tree branches. Comprehensive evaluations revealed that the automated procedures used to construct Magnum produced accurate models of how proteins divergently evolve, or genealogies, and correctly integrated these with the structural data. To demonstrate Magnum's capabilities, we asked for amino acid replacements requiring three nucleotide substitutions, located at internal protein structure sites, and occurring on short phylogenetic tree branches. In the cellular retinoid binding protein family a site that potentially modulates ligand binding affinity was discovered. Recruitment of cellular retinol binding protein to function as a lens crystallin in the diurnal gecko afforded another opportunity to showcase the predictive value of a browsable database containing branch replacement patterns integrated with protein structures.
Conclusion: We integrated two areas of protein science, evolution and structure, on a large scale and created a precomputed database, known as Magnum, which is the first freely available resource of its kind. Magnum provides evolutionary and structural bioinformatics resources that are useful for identifying experimentally testable hypotheses about the molecular basis of protein behaviors and functions, as illustrated with the examples from the cellular retinoid binding proteins.

Application of DETECTER, an Evolutionary Genomic Tool to Analyze Genetic Variation, to the Cystic Fibrosis Gene Family
Gaucher, EA; DeKee, DW; Benner, SA
BMC Genomics 7 44 (2006)
<Abstract>

Background: The medical community requires computational tools that distinguish genetic differences having phenotypic impact within the vast number of mutations that do not. Tools that do this will become increasingly important for those seeking to use human genome sequence data to predict disease, make prognoses, and customize therapy to individual patients.
Results: An approach, termed DETECTER, is proposed to identify sites in a protein sequence where amino acid replacements are likely to have a significant effect on phenotype, including causing genetic disease. This approach uses a model-dependent tool to estimate the normalized replacement rate at individual sites in a protein sequence, based on a history of those sites extracted from an evolutionary analysis of the corresponding protein family. This tool identifies sites that have higher-than-average, average, or lower-than-average rates of change in the lineage leading to the sequence in the population of interest. The rates are then combined with sequence data to determine the likelihoods that particular amino acids were present at individual sites in the evolutionary history of the gene family. These likelihoods are used to predict whether any specific amino acid replacements, if introduced at the site in a modern human population, would have a significant impact on fitness. The DETECTER tool is used to analyze the cystic fibrosis transmembrane conductance regulator (CFTR) gene family.
Conclusions: In this system, DETECTER retrodicts amino acid replacements associated with the cystic fibrosis disease with greater accuracy than alternative approaches. While this result validates this approach for this particular family of proteins only, the approach may be applicable to the analysis of polymorphisms generally, including SNPs in a human population.

The diverse biological functions of phosphatidylinositol transfer proteins in eukaryotes
Phillips, SE; Vincent, P; Rizzieri, KE; Schaaf, G; Bankaitis, VA; Gaucher, EA
Crit. Rev. Biochem. Mol. Biol. 41 (1) 21-49 (2006)
<Abstract>

Phosphatidylinositol/phosphatidylcholine transfer proteins (PITPs) remain largely functionally uncharacterized, despite the fact that they are highly conserved and are found in all eukaryotic cells thus far examined by biochemical or sequence analysis approaches. The available data indicate a role for PITPs in regulating specific interfaces between lipid-signaling and cellular function. In this regard, a role for PITPs in controlling specific membrane trafficking events is emerging as a common functional theme. However, the mechanisms by which PITPs regulate lipid-signaling and membrane-trafficking functions remain unresolved. Specific PITP dysfunctions are now linked to neurodegenerative and intestinal malabsorption diseases in mammals, to stress response and developmental regulation in higher plants, and to previously uncharacterized pathways for regulating membrane trafficking in yeast and higher eukaryotes, making it clear that PITPs are integral parts of a highly conserved signal transduction strategy in eukaryotes. Herein, we review recent progress in deciphering the biological functions of PITPs, and discuss some of the open questions that remain.

2-Hydroxymethylboronate as a Reagent To Detect Carbohydrates: Application to the Analysis of the Formose Reaction
Ricardo, A; Frye, F; Carrigan, MA; Tipton, JD; Powell, DH; Benner, SA
J. Org. Chem. 71 (25) 9503-9505 (2006)
<Abstract>

2-Hydroxymethylphenylboronate is described as a reagent that converts neutral 1,2-diols, as found in simple carbohydrates, into 1:1 anionic complexes that are easily detected by Fourier transform ion cyclotron resonance mass spectrometry. The value of this reagent was demonstrated through its application to analyze complex mixtures of carbohydrates formed in the formose process, often cited as a way that biologically significant carbohydrates might have been generated from formaldehyde under prebiotic conditions. Coupled with isotope studies, the reagent shows that the simplest autocatalytic cycle for the consumption of formaldehyde in this process cannot account for the bulk consumption of formaldehyde.

Dynamic assembly of primers on nucleic acid templates
Leal, NA; Sukeda, M; Benner, SA
Nucl. Acids Res. 34 4702-4710 (2006)
<Abstract>

A strategy is presented that uses dynamic equilibria to assemble in situ composite DNA polymerase primers, having lengths of 14 or 16 nt, from DNA fragments that are 6 or 8 nt in length. In this implementation, the fragments are transiently joined under conditions of dynamic equilibrium by an imine linker, which has a dissociation constant of 1 µM. If a polymerase is able to extend the composite, but not the fragments, it is possible to prime the synthesis of a target DNA molecule under conditions where two useful specificities are combined: (i) single nucleotide discrimination that is characteristic of short oligonucleotide duplexes (four to six nucleobase pairs in length), which effectively excludes single mismatches, and (ii) an overall specificity of priming that is characteristic of long (14 to 16mers) oligonucleotides, potentially unique within a genome. We report here the screening of a series of polymerases that combine an ability not to accept short primer fragments with an ability to accept the long composite primer held together by an unnatural imine linkage. Several polymerases were found that achieve this combination, permitting the implementation of the dynamic combinatorial chemical strategy.

Artificially expanded genetic information system: a new base pair with an alternative hydrogen bonding pattern
Yang, ZY; Hutter, D; Sheng, PP; Sismour, AM; Benner, SA
Nucl. Acids Res. 34 (21) 6095-6101 (2006)
<Abstract>

To support efforts to develop a 'synthetic biology' based on an artificially expanded genetic information system (AEGIS), we have developed a route to two components of a non-standard nucleobase pair, the pyrimidine analog 6-amino-5-nitro-3-(1'-beta-D-2'-deoxyribofuranosyl)-2(1H)-pyridone (dZ) and its Watson-Crick complement, the purine analog 2-amino-8-(1'-beta-D-2'-deoxyribofuranosyl)-imidazo[1,2-a]-1,3,5-triazin-4(8H)-one (dP). These implement the pyDDA:puAAD hydrogen bonding pattern (where 'py' indicates a pyrimidine analog and 'pu' indicates a purine analog, while A and D indicate the hydrogen bonding patterns of acceptor and donor groups presented to the complementary nucleobases, from the major to the minor groove). Also described is the synthesis of the triphosphates and protected phosphoramidites of these two nucleosides. We also describe the use of the protected phosphoramidites to synthesize DNA oligonucleotides containing these AEGIS components, verify the absence of epimerization of dZ in those oligonucleotides, and report some hybridization properties of the dZ:dP nucleobase pair, which is rather strong, and the ability of each to effectively discriminate against mismatches in short duplex DNA.

A review: Synthesis of aryl C-glycosides via the Heck coupling reaction
Wellington, KW; Benner, SA
Nucleosides Nucleotides Nucleic Acids 25 (12), Taylor & Francis Group 1309-1333 (2006)
<Abstract>

Desorption/ionization on porous silicon mass spectrometry studies on pentose-borate complexes
Li, Q; Ricardo, A; Benner, SA; Winefordner, JD; Powell, DH
Anal. Chem. 77 (14) 4503-4508 (2005)
<Abstract>

Desorption/ionization on porous silicon mass spectrometry (DIOS-MS) was used to investigate the binding affinities between aldopentose isomers and boron. Boron has been recognized for its importance in pentose synthesis and stabilization in prebiotic conditions. Boron may also account for the fact that ribose, among other aldopentoses, is the favored building block in RNA synthesis. This research started with the detection of aldopentoses in the positive mode through cationization and the aldopentose-borate complexes in the negative mode. Then two competition schemes, one using a pentose structure analogue and the other using C-13-labeled ribose, were designed to compare the relative binding affinities of four aldopentoses (xylose, lyxose, arabinose, and ribose) to boron. Both approaches determined the binding preference to be ribose > lyxose > arabinose > xylose. This work illustrates the potential of DIOS-MS in the analyses of nonvolatile, small molecules in delicate chemical equilibria. Without externally introduced matrices, background signals are not a limiting factor. Furthermore, the possible dramatic change of pH associated with the matrix introduction, which may disturb the equilibria of interest, is avoided.

Synthetic biology
Sismour, AM; Benner, SA
Expert Opin. Biol. Ther. 5 (11) 1409-1414 (2005)
<Abstract>

Chemistry is a broadly powerful discipline in contemporary science because it has the ability to create new forms of the matter that it studies. By doing so, chemistry can test models that connect molecular structure to behaviour without having to rely on what nature has provided. This creation, known as 'synthesis', began to be applied to living systems in the 1980s as recombinant DNA technologies allowed biologists to deliberately change the molecular structure of the microbes that they studied, and automated chemical synthesis of DNA became widely available to support these activities. The impact of the information that has emerged has made biologists aware of a truism that has long been known in chemistry: synthesis drives discovery and understanding in ways that analysis cannot. Synthetic biology is now setting an ambitious goal: to recreate in artificial systems the emergent properties found in natural biology. By doing so, it is advancing our understanding of the molecular basis of genetics in ways that analysis alone cannot. More practically, it has yielded artificial genetic systems that improve the healthcare of some 400,000 Americans annually. Synthetic biology is now set to take the next step, to create artificial Darwinian systems by direct construction. Supported by the National Science Foundation as part of its Chemical Bonding program, this work cannot help but generate clarity in our understanding of how biological systems work.

A call for likelihood phylogenetics even when the process of sequence evolution is heterogeneous
Gaucher, EA; Miyamoto, MM
Mol. Phylogenet. Evol. 37 (3) 928-931 (2005)
<Abstract>

All methods of phylogenetic inference make assumptions about the underlying evolutionary process of their characters and it is these assumptions that determine their relative successes and failures in the estimation of the true phylogeny for a group. This dependency of phylogenetic accuracy and robustness on evolutionary assumptions has been most extensively studied for the classic case of Felsenstein (1978) and its four-taxon phylogeny with two long, unrelated, terminal branches interspersed with two short ones. Given this model phylogeny, "long branch attraction" can occur and thereby lead to the convergence of a phylogenetic method onto an incorrect tree with the two long and two short terminal branches directly connected rather than interspersed. The extent to which a particular phylogenetic method is susceptible to this problem depends on what assumptions it makes about the evolution of the characters and data themselves.

The use of thymidine analogs to improve the replication of an extra DNA base pair: a synthetic biological system
Sismour, AM; Benner, SA
Nucl. Acids Res. 33 5640-5646 (2005)
<Abstract>

Synthetic biology based on a six-letter genetic alphabet that includes the two non-standard nucleobases isoguanine (isoG) and isocytosine (isoC), as well as the standard A, T, G and C, is known to suffer as a consequence of a minor tautomeric form of isoguanine that pairs with thymine, and therefore leads to infidelity during repeated cycles of the PCR. Reported here is a solution to this problem. The solution replaces thymidine triphosphate by 2-thiothymidine triphosphate (2-thioTTP). Because of the bulk and hydrogen bonding properties of the thione unit in 2-thioT, 2-thioT does not mispair effectively with the minor tautomer of isoG. To test whether this might allow PCR amplification of a six-letter artificially expanded genetic information system, we examined the relative rates of misincorporation of 2-thioTTP and TTP opposite isoG using affinity electrophoresis. The concentrations of isoCTP and 2-thioTTP were optimal to best support PCR amplification using thermostable polymerases of a six-letter alphabet that includes the isoC-isoG pair. The fidelity-per-round of amplification was found to be approximately 98% in trial PCRs with this six-letter DNA alphabet. The analogous PCR employing TTP had a fidelity-per-round of only approximately 93%. Thus, the A, 2-thioT, G, C, isoC, isoG alphabet is an artificial genetic system capable of Darwinian evolution.
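
To put the two fidelity-per-round figures in the abstract above side by side, the sketch below compounds them over several PCR cycles. It assumes, as a simplification not stated in the abstract, that losses are independent from round to round, so the fraction of product still carrying the isoC-isoG pair after n rounds is roughly the fidelity raised to the power n. The function name and the choice of 20 rounds are illustrative only.

```python
def retained_fraction(fidelity_per_round: float, rounds: int) -> float:
    """Approximate fraction of product that still carries the extra base
    pair after `rounds` cycles, assuming independent per-round loss."""
    return fidelity_per_round ** rounds


# Compare the two per-round fidelities reported in the abstract
# (approximately 98% with 2-thioTTP, approximately 93% with TTP).
for label, fidelity in (("2-thioTTP", 0.98), ("TTP", 0.93)):
    fraction = retained_fraction(fidelity, rounds=20)
    print(f"{label}: {fraction:.2f} retained after 20 rounds")
```

Under this simple model, a difference of a few percentage points per round compounds into a large difference over a full amplification.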

Resurrecting ancestral alcohol dehydrogenases from yeast
Thomson, JM; Gaucher, EA; Burgan, MF; De Kee, DW; Li, T; Aris, JP; Benner, SA
Nature Genet. 37 (6) 630-635 (2005)
<Abstract>

Modern yeast living in fleshy fruits rapidly convert sugars into bulk ethanol through pyruvate. Pyruvate loses carbon dioxide to become acetaldehyde, which is reduced by alcohol dehydrogenase 1 (Adh1) to ethanol, which accumulates. Yeast later consumes the accumulated ethanol, exploiting Adh2, an Adh1 homolog differing by 24 (of 348) amino acids. Because many microorganisms cannot grow in ethanol, accumulated ethanol may help yeast defend resources in the fruit. We report here the reconstruction of the last common ancestor of Adh1 and Adh2, called AdhA. The kinetic behavior of AdhA suggests that it was optimized to make (not consume) ethanol. This is consistent with the hypothesis that before the Adh1-Adh2 duplication, yeast did not accumulate ethanol for later consumption but rather used AdhA to recycle NADH generated in the glycolytic pathway. Silent nucleotide dating suggests that the Adh1-Adh2 duplication occurred near the time of duplication of several other proteins involved in the accumulation of ethanol, possibly in the Cretaceous age when fleshy fruits arose. These results help to connect the chemical behavior of these enzymes through systems analysis to a time of global ecosystem change, a small but useful step towards a planetary systems biology.

Synthetic Biology
Sismour, AM; Benner, SA
Nat. Rev. Genet. 6 533-543 (2005)
<Abstract>

Synthetic biologists come in two broad classes. One uses unnatural molecules to reproduce emergent behaviours from natural biology, with the goal of creating artificial life. The other seeks interchangeable parts from natural biology to assemble into systems that function unnaturally. Either way, a synthetic goal forces scientists to cross uncharted ground to encounter and solve problems that are not easily encountered through analysis. This drives the emergence of new paradigms in ways that analysis cannot easily do. Synthetic biology has generated diagnostic tools that improve the care of patients with infectious diseases, as well as devices that oscillate, creep and play tic-tac-toe.

Cytoplasmic glycosylation of protein-hydroxyproline and its relationship to other glycosylation pathways
West, CM; van der Wel, H; Sassi, S; Gaucher, EA
Biochim. Biophys. Acta 1673 29-44 (2004)
<Abstract>

The Skp1 protein, best known as a subunit of E3(SCF)-ubiquitin ligases, is subject to complex glycosylation in the cytoplasm of the cellular slime mold Dictyostelium. Pro143 of this protein is sequentially modified by a prolyl hydroxylase and five soluble glycosyltransferases (GT), to yield the structure Galalpha1,Galalpha1,3Fucalpha1,2Galbeta1,3GlcNAcalpha1-HyPro143. These enzymes are unusual in that they are expressed in the cytoplasmic compartment of the cell, rather than the secretory pathway where complex glycosylation of proteins usually occurs. The first enzyme in the pathway appears to be related to the soluble animal prolyl 4-hydroxylases (P4H), which modify the transcriptional factor subunit HIF-1alpha in the cytoplasm, and more distantly to the P4Hs that modify collagen and other proteins in the rER, based on biochemical and informatics analyses. The soluble alphaGlcNAc-transferase acting on Skp1 has been cloned and is distantly related to the mucin-type polypeptide N-acetyl-alpha-galactosaminyltransferase in the Golgi of animals. Its characterization has led to the discovery of a family of related polypeptide N-acetyl-alpha-glucosaminyltransferases in the Golgi of selected lower eukaryotes. The Skp1 GlcNAc is extended by a bifunctional diglycosyltransferase that sequentially and apparently processively adds beta1,3Gal and alpha1,2Fuc. Though this structure is also formed in the animal secretory pathway, the GTs involved are dissimilar. Conceptual translation of available genomes suggests the existence of this kind of complex cytoplasmic glycosylation in other eukaryotic microorganisms, including diatoms, oomycetes, and possibly Chlamydomonas and Toxoplasma, and an evolutionary precursor of this pathway may also occur in prokaryotes.

The planetary biology of cytochrome P450 aromatases
Gaucher, EA; Graddy, LG; Li, T; Simmen, RC; Simmen, FA; Schreiber, DR; Liberles, DA; Janis, CM; Benner, SA
BMC Biology 2 (1) 19 (2004)
<Abstract>

BACKGROUND: Joining a model for the molecular evolution of a protein family to the paleontological and geological records (geobiology), and then to the chemical structures of substrates, products, and protein folds, is emerging as a broad strategy for generating hypotheses concerning function in a post-genomic world. This strategy expands systems biology to a planetary context, necessary for a notion of fitness to underlie (as it must) any discussion of function within a biomolecular system.
RESULTS: Here, we report an example of such an expansion, where tools from planetary biology were used to analyze three genes from the pig Sus scrofa that encode cytochrome P450 aromatases, enzymes that convert androgens into estrogens. The evolutionary history of the vertebrate aromatase gene family was reconstructed. Transition redundant exchange silent substitution metrics were used to interpolate dates for the divergence of family members, the paleontological record was consulted to identify changes in physiology that correlated in time with the change in molecular behavior, and new aromatase sequences from peccary were obtained. Metrics that detect changing function in proteins were then applied, including KA/KS values and those that exploit structural biology. These identified specific amino acid replacements that were associated with changing substrate and product specificity during the time of presumed adaptive change. The combined analysis suggests that aromatase paralogs arose in pigs as a result of selection for Suoidea with larger litters than their ancestors, and permitted the Suoidea to survive the global climatic trauma that began in the Eocene.
CONCLUSIONS: This combination of bioinformatics analysis, molecular evolution, paleontology, cladistics, global climatology, structural biology, and organic chemistry serves as a paradigm in planetary biology. As the geological, paleontological, and genomic records improve, this approach should become widely useful to make systems biology statements about high-level function for biomolecular systems.

Significance of cytoplasmic prolyl hydroxylation and complex glycosylation in the cellular slime mold Dictyostelium
West, CM; van der Wel, H; Sassi, S; Gaucher, E; Ercan, A
Glycobiology 14 (11) 1063-1063 (2004)

Empirical analysis of protein insertions and deletions determining parameters for the correct placement of gaps in protein sequence alignments
Chang, MSS; Benner, SA
J. Mol. Biol. 341 (2) 617-631 (2004)
<Abstract>

To understand how protein segments are inserted and deleted during divergent evolution, a set of pairwise alignments containing exactly one gap, and therefore arising from the first insertion-deletion (indel) event in the time separating the homologs, was examined. The alignments showed that "structure breaking" amino acids (PGDNS) were preferred within and flanking gapped regions, as were two residues with hydrophilic side-chains (QE) that frequently occur at the surface of protein folds. Conversely, hydrophobic residues (FMILYVW) occur infrequently within and flanking the gapped region. These preferences are modestly different in protein pairs separated by an episode of adaptive evolution than in pairs diverging under strong functional constraints. Surprisingly, regions near an indel have not evolved more rapidly than the sequence pair overall, showing no evidence that an indel event must be compensated by local amino acid replacement. The gap-lengths are best approximated by a Zipfian distribution, with the probability of a gap of length L decreasing as a function of L^-1.8. These features are largely independent of the length of the gap and the extent of divergence (measured by both silent and non-silent sequence changes) separating the two proteins. Surprisingly, amino acid repeats were discovered in more than a third of the polypeptide segments in and around the gap. These correspond to repeats in the DNA sequence. This suggests that a signature of the mechanism by which indels occur in the DNA sequence remains in the encoded protein sequences. These data suggest specific tools to score gap placement in an alignment. They also suggest tools that distinguish true indels from gaps created by mistaken gene finding, including under-predicted and over-predicted introns. By providing mechanisms to identify errors, the tools will enhance the value of genome sequence databases in support of integrated paleogenomics strategies used to extract functional information in a post-genomic environment.
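
The Zipfian gap-length distribution mentioned in the abstract above can be illustrated with a short sketch. It assumes the probability of a gap of length L is proportional to L raised to the power -1.8 and normalizes over a finite range of lengths. The cutoff of 50 residues and the function name are assumptions made here for illustration; they are not taken from the paper.

```python
def gap_length_probabilities(max_length: int = 50, exponent: float = 1.8) -> list[float]:
    """Return P(L) for L = 1..max_length, with P(L) proportional to L**(-exponent).

    The finite cutoff `max_length` is an illustrative assumption so that
    the probabilities can be normalized to sum to 1.
    """
    weights = [length ** -exponent for length in range(1, max_length + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


probabilities = gap_length_probabilities()
# Show how quickly the probability falls off for the shortest gaps.
for length in (1, 2, 5, 10):
    print(f"L = {length:2d}: P = {probabilities[length - 1]:.3f}")
```

Under this form, a gap of length 1 is 2^1.8 (about 3.5) times as likely as a gap of length 2, whatever cutoff is chosen.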

Evolutionary, structural and biochemical evidence for a new interaction site of the leptin obesity protein
Gaucher, EA; Miyamoto, MM; Benner, SA
Genetics 163 (4) 1549-1553 (2003)
<Abstract>

The Leptin protein is central to the regulation of energy metabolism in mammals. By integrating evolutionary, structural, and biochemical information, a surface segment, outside of its known receptor contacts, is predicted as a second interaction site that may help to further define its roles in energy balance and its functional differences between humans and other mammals.

Initiation of mucin-type O-glycosylation in lower eukaryotes (O-alpha-GlcNAc-type) and higher eukaryotes (O-alpha-GalNAc-type) is homologous
West, CM; Wang, F; van der Wel, H; Gaucher, E; Sassi, S; Metcalf, T; Heise, N; Mendonca-Previato, L; Previato, JO
Glycobiology 13 (11) 875-876 (2003)

Inferring the palaeoenvironment of ancient bacteria on the basis of resurrected proteins
Gaucher, EA; Thomson, JM; Burgan, MF; Benner, SA
Nature 425 (6955) 285-288 (2003)
<Abstract>

Features of the physical environment surrounding an ancestral organism can be inferred by reconstructing sequences(1-9) of ancient proteins made by those organisms, resurrecting these proteins in the laboratory, and measuring their properties. Here, we resurrect candidate sequences for elongation factors of the Tu family (EF-Tu) found at ancient nodes in the bacterial evolutionary tree, and measure their activities as a function of temperature. The ancient EF-Tu proteins have temperature optima of 55-65°C. This value seems to be robust with respect to uncertainties in the ancestral reconstruction. This suggests that the ancient bacteria that hosted these particular genes were thermophiles, and neither hyperthermophiles nor mesophiles. This conclusion can be compared and contrasted with inferences drawn from an analysis of the lengths of branches in trees joining proteins from contemporary bacteria(10), the distribution of thermophily in derived bacterial lineages(11), the inferred G+C content of ancient ribosomal RNA(12), and the geological record combined with assumptions concerning molecular clocks(13). The study illustrates the use of experimental palaeobiochemistry and assumptions about deep phylogenetic relationships between bacteria to explore the character of ancient life.

Complex glycosylation of Skp1 in Dictyostelium: implications for the modification of other eukaryotic cytoplasmic and nuclear proteins
West, CM; van der Wel, H; Gaucher, EA
Glycobiology 12 (2) (2002)
<Abstract>

Recently, complex O-glycosylation of the cytoplasmic/nuclear protein Skp1 has been characterized in the eukaryotic microorganism Dictyostelium. Skp1's glycosylation is mediated by the sequential action of a prolyl hydroxylase and five conventional sugar nucleotide-dependent glycosyltransferase activities that reside in the cytoplasm rather than the secretory compartment. The Skp1-HyPro GlcNAc-Transferase, which adds the first sugar, appears to be related to a lineage of enzymes that originated in the prokaryotic cytoplasm and initiates mucin-type O-linked glycosylation in the lumen of the eukaryotic Golgi apparatus. GlcNAc is extended by a bifunctional glycosyltransferase that mediates the ordered addition of beta1,3-linked Gal and alpha1,2-linked Fuc. The architecture of this enzyme resembles that of certain two-domain prokaryotic glycosyltransferases. The catalytic domains are related to those of a large family of prokaryotic and eukaryotic, cytoplasmic, membrane-bound, inverting glycosyltransferases that modify glycolipids and polysaccharides prior to their translocation across membranes toward the secretory pathway or the cell exterior. The existence of these enzymes in the eukaryotic cytoplasm away from membranes and their ability to modify protein acceptors expose a new set of cytoplasmic and nuclear proteins to potential prolyl hydroxylation and complex O-linked glycosylation.

Identification of a Golgi-associated UDP-GlcNAc : polypeptide mucin-type alpha-N-acetylglucosaminyltransferase that modifies cell surface proteins in Dictyostelium
West, CM; van der Wel, H; Metcalf, T; Kaplan, L; Gaucher, EA
Glycobiology 12 (10) 697-697 (2002)

The crystal structure of eEF1A refines the functional predictions of an evolutionary analysis of rate changes among elongation factors
Gaucher, EA; Das, UK; Miyamoto, MM; Benner, SA
Mol. Biol. Evol. 19 (4) 569-573 (2002)

Evolution - Planetary biology - Paleontological, geological, and molecular histories of life
Benner, SA; Caraco, MD; Thomson, JM; Gaucher, EA
Science 296 (5569) 864-868 (2002)
<Abstract>

The history of life on Earth is chronicled in the geological strata, the fossil record, and the genomes of contemporary organisms. When examined together, these records help identify metabolic and regulatory pathways, annotate protein sequences, and identify animal models to develop new drugs, among other features of scientific and biomedical interest. Together, planetary analysis of genome and proteome databases is providing an enhanced understanding of how life interacts with the biosphere and adapts to global change.

Predicting functional divergence in protein evolution by site-specific rate shifts
Gaucher, EA; Gu, X; Miyamoto, MM; Benner, SA
Trends Biochem. Sci. 27 (6) 315-321 (2002)
<Abstract>

Most modern tools that analyze protein evolution allow individual sites to mutate at constant rates over the history of the protein family. However, Walter Fitch observed in the 1970s that, if a protein changes its function, the mutability of individual sites might also change. This observation is captured in the 'non-homogeneous gamma model', which extracts functional information from gene families by examining the different rates at which individual sites evolve. This model has recently been coupled with structural and molecular biology to identify sites that are likely to be involved in changing function within the gene family. Applying this to multiple gene families highlights the widespread divergence of functional behavior among proteins to generate paralogs and orthologs.

A bifunctional diglycosyltransferase forms the Fucalpha1,2Galbeta1,3-disaccharide on Skp1 in the cytoplasm of Dictyostelium
van der Wel, H; Fisher, SZ; Gaucher, EA; West, CM
Glycobiology 11 (10) 884-884 (2001)

Function-structure analysis of proteins using covarion-based evolutionary approaches: Elongation factors
Gaucher, EA; Miyamoto, MM; Benner, SA
Proc. Natl. Acad. Sci. USA 98 (2) 548-552 (2001)
<Abstract>

The divergent evolution of protein sequences from genomic databases can be analyzed by the use of different mathematical models. The most common treat all sites in a protein sequence as equally variable. More sophisticated models acknowledge the fact that purifying selection generally tolerates variable amounts of amino acid replacement at different positions in a protein sequence. In their "stationary" versions, such models assume that the replacement rate at individual positions remains constant throughout evolutionary history. "Nonstationary" covarion versions, however, allow the replacement rate at a position to vary in different branches of the evolutionary tree. Recently, statistical methods have been developed that highlight this type of variation in replacement rates. Here, we show how positions that have variable rates of divergence in different regions of a tree ("covarion behavior"), coupled with analyses of experimental three-dimensional structures, can provide experimentally testable hypotheses that relate individual amino acid residues to specific functional differences in those branches. We illustrate this in the elongation factor family of proteins as a paradigm for applications of this type of analysis in functional genomics generally.

Evolution, language and analogy in functional genomics
Benner, SA; Gaucher, EA
Trends in Genetics 17 (7) 414-418 (2001)
<Abstract>

Almost a century ago, Wittgenstein pointed out that theory in science is intricately connected to language. This connection is not a frequent topic in the genomics literature. But a case can be made that functional genomics is today hindered by the paradoxes that Wittgenstein identified. If this is true, until these paradoxes are recognized and addressed, functional genomics will continue to be limited in its ability to extrapolate information from genomic sequences.

Functional inferences from reconstructed evolutionary biology involving rectified databases. An evolutionarily-grounded approach to functional genomics.
Benner, SA; Chamberlin, SG; Liberles, DA; Govindarajan, S; Knecht, L
Res. Microbiol. 151 (2) 97-106 (2000)
<Abstract>

If bioinformatics tools are constructed to reproduce the natural, evolutionary history of the biosphere, they offer powerful approaches to some of the most difficult tasks in genomics, including the organization and retrieval of sequence data, the updating of massive genomic databases, the detection of database error, the assignment of introns, the prediction of protein conformation from protein sequences, the detection of distant homologs, the assignment of function to open reading frames, the identification of biochemical pathways from genomic data, and the construction of a comprehensive model correlating the history of biomolecules with the history of planet Earth.

Chance and Necessity in Biomolecular Chemistry - Is Life as We Know It Universal?
Benner, SA; Switzer, CY
Simplicity and Complexity in Proteins and Nucleic Acids, ed. H. Frauenfelder, J. Deisenhofer, P.G. Wolynes, Dahlem University Press 339-363 (1999)

How small can a microorganism be?
Benner, SA
Size Limits of Very Small Microorganisms: Proceedings of a Workshop, Steering Group on Astrobiology of the Space Studies Board, National Research Council 126-135 (1999)