Research Highlights of the Aravind group: Origins of cyclic- and oligo-nucleotides in biological conflicts from microbes to vertebrate interferons: The “CRISPR polymerase” finds a function

Recent research has pointed to fundamental ties between the emergence of conflict and the fixation of diverse polymerase activities from distinct protein folds (see earlier post). One byproduct of the emergence of several stable polymerase-based replicative systems, which depend on nucleotide transfer during nucleic acid polymer elongation, was a concomitant increase in the diversity of homologous nucleotidyltransferases (NTases) that produce cyclic nucleotides or oligonucleotides. The appearance of these enzymes in biological conflict contexts likely led to their selection as core components of conflict pathways, with their nucleotide products serving as signaling molecules during conflict and environmental stress responses.

A recent study from our group identified a wealth of previously unidentified systems, distributed across a broad swath of prokaryotic phyla, centering on just such second messenger nucleotide-generating NTase enzymes and their corresponding nucleotide sensor domains (click to read). In addition to this NTase-sensor pair, these conflict systems invariably contain an effector domain that is predicted either to attack a non-self entity or to initiate cell suicide. The most frequently observed NTase embedded in these systems is a representative of the SMODS family, which is typically coupled to one of two novel sensor domains, the SAVED or AGS-C domains.

The SMODS family belongs to the DNA polβ fold and includes the experimentally characterized Vibrio cholerae DncV protein, which is known to generate cyclic GMP-AMP (cGAMP). Within the DNA polβ fold, the SMODS family forms a higher-order assemblage with both the eukaryotic cGAS (cGAMP synthase) and OAS (2’-5’ oligoadenylate synthase) enzymes. While the SMODS NTase was likely the “founder” NTase for these newly identified, nucleotide-dependent conflict systems, several systems display clear evidence of displacement of the core nucleotide-generating NTase component.

In a subset of systems, we identified a previously unidentified enzyme occupying the typical SMODS position, suggesting displacement by a novel, uncharacterized NTase domain. Careful analysis identified this domain as a new member of the RRM-like fold containing a “palm” domain, with a surprisingly close relationship to the catalytic domain of the CRISPR polymerases (frequently referred to as Cmr2 or Cas10). As this novel enzyme conserved the structure and sequence features necessary for NTase activity, yet lacked the N-terminal fusion to the HD phosphoesterase domain observed in the CRISPR polymerase domains, we named it the mCpol (minimal CRISPR polymerase) domain (click to read). Together, the CRISPR polymerase NTase domain and the mCpol domain form a higher-order assemblage of palm domain NTases with the GGDEF family of cyclic di-GMP synthetases, to the exclusion of all other families.

Strikingly, mCpol domains were consistently linked to the CARF sensor domain in nucleotide-dependent conflict system contexts. This represents a further parallel between the CRISPR polymerase and mCpol domains; previous research from our group has described enrichment of CARF domains specifically in the so-called “Type III” CRISPR systems, which harbor active CRISPR polymerase and HD domain fusion proteins (click to read). mCpol- and CRISPR polymerase-centered systems can thus each be thought of as containing three core components: 1) the nucleotide synthetase component, 2) the CARF sensor component, and 3) effector domain components. In mCpol systems, effectors include various pore-forming domains and the HEPN RNase domain. In CRISPR systems, the effector takes the form of the HEPN RNase domain found C-terminally fused to the CARF domain, and might also thematically extend to include other CRISPR effectors, including interference caused by the cascade complexes.
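The three-component view described above can be written down as a small data structure. The following is only an illustrative sketch: the class name, field names, and the `shared_components` helper are hypothetical and not drawn from the study, and the entries simply restate the domain pairings named in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConflictSystem:
    """Hypothetical record for a nucleotide-dependent conflict system."""
    synthetase: str    # 1) nucleotide synthetase component
    sensors: tuple     # 2) nucleotide sensor component(s)
    effectors: tuple   # 3) effector domain component(s)


# Entries restate the pairings described in the text; nothing here is new data.
MCPOL = ConflictSystem(
    synthetase="mCpol",
    sensors=("CARF",),
    effectors=("pore-forming domain", "HEPN RNase"),
)
CRISPR_POL = ConflictSystem(
    synthetase="CRISPR polymerase (Cmr2/Cas10)",
    sensors=("CARF",),
    effectors=("HEPN RNase",),
)


def shared_components(a: ConflictSystem, b: ConflictSystem) -> dict:
    """Return the sensor and effector domains common to two systems."""
    return {
        "sensors": set(a.sensors) & set(b.sensors),
        "effectors": set(a.effectors) & set(b.effectors),
    }


if __name__ == "__main__":
    print(shared_components(MCPOL, CRISPR_POL))
```

Comparing the two records in this way picks out the CARF sensor and the HEPN RNase effector as the components the two kinds of system have in common, which is the parallel drawn in the text.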

The discovery of the mCpol domain and its placement within a larger context of nucleotide-dependent conflict systems therefore offers substantive insight into the evolution and function of CRISPR systems containing the CRISPR polymerase. In evolutionary terms, it appears likely that certain CRISPR systems, including the classical Type I and III systems, emerged through the combination of the more minimal mCpol-CARF (and potentially HEPN) units with other mobile elements, including the RAMPs and the Cas1-Cas2 dyad.

In functional terms, the CRISPR polymerase has long remained an enigmatic domain with regard to its possible role in CRISPR systems, with speculation at various points ranging from a crRNA-amplifying polymerase to a template-independent terminal transferase to a cyclic nucleotide synthetase. Based on its relationship to the GGDEF synthetases, and now on the evolutionary parallels to our newly described, nucleotide intermediate-dependent conflict systems, we can say that these polymerases are likely generating (cyclic) nucleotides, which are in turn sensed by the accompanying CARF domains. Again by parallel to the many other described nucleotide-dependent conflict systems (click to read), this nucleotide signal is likely to activate the HEPN effector in these CRISPR systems. In light of this, the HD phosphoesterase domain N-terminally fused to the CRISPR polymerase domain could provide a means of terminating this effector-activating signal by hydrolyzing the nucleotide, an action comparable to that of the cNMP phosphodiesterases with HD domains in classical cNMP signaling.

The discovery of these evolutionary connections and their resulting functional inferences will undoubtedly deepen experimental understanding of the endogenous regulation of different classes of CRISPR systems. Additionally, there is potential scope for these discoveries to bring improvements to biotechnological application of CRISPR systems in the lab.

Wednesday, December 7, 2016
