Recent research
has pointed to fundamental ties between the emergence of conflict and the
fixation of diverse polymerase activities from distinct protein folds (see earlier post). One byproduct of the emergence of several stable
polymerase-based replicative systems, which depend on nucleotide transfer
during nucleic acid polymer elongation, was a concomitant increase in the
diversity of homologous nucleotidyltransferases (NTases), which function in the
production of cyclic nucleotides or oligonucleotides. The appearance of these enzymes in
biological conflict contexts likely led to their selection as core components
of pathways involved in conflict, with their nucleotide products serving
as signaling molecules during conflict and environmental stress responses.
A recent
study from our group identified a wealth of previously unidentified systems,
distributed across a broad swath of prokaryotic phyla, centering on just
such second messenger nucleotide-generating NTase enzymes and their corresponding
nucleotide sensor domains (click to read). In addition to this NTase-sensor
pair, these conflict systems invariably contain an effector domain which is
predicted to either attack a non-self entity or initiate cell suicide. The most
frequently observed NTase embedded in these systems is a representative of the
SMODS family, which is typically coupled to one of two novel sensor domains,
the SAVED or AGS-C domains.
The SMODS family belongs to the DNA polβ fold,
which includes the experimentally characterized Vibrio cholerae DncV protein, known to generate cyclic
GMP-AMP (cGAMP). Within the DNA polβ fold, the SMODS family forms a
higher-order assemblage with both the eukaryotic cGAS (cGAMP synthase) and OAS
(2’-5’ oligoadenylate synthase) enzymes. While the SMODS NTase was likely the “founder”
NTase for these newly identified, nucleotide-dependent conflict systems,
several systems display clear evidence of displacement of the core
nucleotide-generating NTase component.
In a subset
of systems, we found a previously unidentified enzyme occupying the
typical SMODS position, suggesting displacement by a novel, uncharacterized
NTase domain. Careful analysis of this domain identified it as a new member of
the RRM-like fold containing a “palm” domain, with a surprising, close
relationship to the catalytic domain of the CRISPR polymerases (frequently
referred to as Cmr2 or Cas10). As this novel enzyme conserved the structure and
sequence features necessary for NTase activity, yet lacked the N-terminal
fusion to the HD phosphoesterase domain observed in the CRISPR polymerase
domains, we named it the mCpol (minimal CRISPR polymerase) domain (click to read).
Together, the CRISPR polymerase NTase domain and the mCpol domain form a
higher-order assemblage of palm domain NTases with the GGDEF family of
cyclic di-GMP synthetases, to the exclusion of all other families.
Strikingly,
mCpol domains were consistently linked in nucleotide-dependent conflict system
contexts to the CARF sensor domain. This represents a further parallel between
the CRISPR polymerase and mCpol domains; previous research from our group has
described enrichment of CARF domains specifically in the so-called “Type III”
CRISPR systems which harbor active CRISPR polymerase and HD domain fusion
proteins (click to read). mCpol- and
CRISPR polymerase-centered systems can thus each be thought of as
containing three core components: 1) the nucleotide synthetase component, 2)
the CARF sensor component, and 3) effector domain components. In mCpol systems,
effectors include various pore-forming domains and the HEPN RNase domain. In
CRISPR systems, the effector takes the form of the HEPN RNase domain found C-terminally
fused to the CARF domain, and might also thematically extend to other
CRISPR effectors,
including interference mediated by the Cascade complexes.
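The three-component view above can be made concrete with a small sketch. It is only an illustration, not a detection method from our study: the domain names are those mentioned in this post, while the role table and the helper names (`core_components`, `is_complete`) are hypothetical and greatly simplified.

```python
# Illustrative only: a toy role table built from domain names mentioned in
# this post. Real systems are identified by sequence and gene-neighborhood
# analysis, not by name matching.
SYNTHETASES = {"SMODS", "mCpol", "CRISPR polymerase"}
SENSORS = {"SAVED", "AGS-C", "CARF"}
EFFECTORS = {"HEPN", "pore-forming"}


def core_components(domains):
    """Sort a list of domain names into the three core roles.

    Domains outside the toy table (for example the HD domain) are ignored.
    """
    roles = {"synthetase": [], "sensor": [], "effector": []}
    for name in domains:
        if name in SYNTHETASES:
            roles["synthetase"].append(name)
        elif name in SENSORS:
            roles["sensor"].append(name)
        elif name in EFFECTORS:
            roles["effector"].append(name)
    return roles


def is_complete(domains):
    """Return True if every one of the three core roles is filled."""
    return all(core_components(domains).values())


if __name__ == "__main__":
    mcpol_system = ["mCpol", "CARF", "HEPN"]
    crispr_system = ["HD", "CRISPR polymerase", "CARF", "HEPN"]
    print(core_components(mcpol_system))
    print(is_complete(crispr_system))
```

In this toy model an mCpol system and a CRISPR polymerase-centered system both fill all three roles, which is the parallel drawn above.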
The
discovery of the mCpol domain and its placement within a larger context of
nucleotide-dependent conflict systems therefore offers substantive insight into
the evolution and function of CRISPR systems containing the CRISPR polymerase.
In evolutionary terms, it appears likely that certain CRISPR systems, including
the classical Type I and III systems, emerged through the combination of the more
minimal mCpol-CARF (and potentially HEPN) units with other mobile elements,
including the RAMPs and the Cas1-Cas2 dyad.
In
functional terms, the CRISPR polymerase itself has long remained an enigmatic
domain with regard to its possible role in the CRISPR systems, with
speculation at various points ranging from a crRNA-amplifying
polymerase to a template-independent terminal transferase or a cyclic nucleotide
synthetase. Based on its relationship to the GGDEF synthetases and now on the
evolutionary parallels to our newly described, nucleotide
intermediate-dependent conflict systems, we infer that these polymerases are
likely generating (cyclic) nucleotides, which are in turn sensed by accompanying
CARF domains. Again by analogy with the many other described
nucleotide-dependent conflict systems (click to read), this nucleotide signal
is likely to activate the HEPN effector in these CRISPR systems. In light of
this, the HD-phosphoesterase domain N-terminally fused to the CRISPR polymerase
domain could provide a means of terminating this effector-activating signal by
hydrolyzing the nucleotide, an action comparable to that of the HD domain-containing
cNMP phosphodiesterases in classical cNMP signaling.
The
discovery of these evolutionary connections and their resulting functional
inferences should deepen experimental understanding of the endogenous
regulation of different classes of CRISPR systems. These discoveries may also
improve biotechnological
applications of CRISPR systems in the lab.