Back in the early days of computers you could own (aka personal computers), it wasn’t point and click but hunt and peck: commands in the early operating systems (DOS, etc.) had to be typed on the command line using a keyboard. The interfaces were far from intuitive, to say the least, and the unexpected was always expected. When things went south, software designers quickly learned to say, “That’s not a bug, that’s a feature!”
Essentially the same thing has happened to the latest and greatest tool in genetic engineering, the CRISPR system. It’s fascinating that it has been hiding in plain sight for four decades. In med school in the mid-’60s, the basic book about heredity and DNA was “Sexuality and the Genetics of Bacteria” (1961) by Francois Jacob. No one had any idea that DNA would be sequenced. Viruses that infect bacteria were studied (they were called bacteriophages back then).
No one had any idea that bacteria could defend themselves against viruses, but defend they do, with their CRISPR system. It’s only been known for a decade; earlier papers on the subject by three different authors (Mojica, Gilles Vergnaud, and Alexander Bolotin) were rejected before eventual publication.
Briefly, when a bacterium is infected by a virus, it copies fragments of the viral DNA and pastes them into its own genome. On subsequent invasions, it uses the stored DNA copy to make RNA, which, along with a complex enzyme, binds to the genome of the invader and destroys it.
It turns out that a PAM (Protospacer Adjacent Motif) is crucial for the whole system to work. The copy stored in the bacterial DNA isn’t flanked by such a sequence, so the system searches for it in the invader. The PAM isn’t large (just 3 nucleotides in a row), and the system looks for it in invading viral DNA double helices.
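To make the logic concrete, here is a toy sketch in Python of PAM-gated recognition. It is only an illustration under loud assumptions: DNA is treated as a plain string, the stored fragment must match exactly, and the PAM is assumed to sit immediately before the matching stretch on the strand as written. The sequences and the function name find_targets are made up for the example; ATG is used as the motif only because it is the PAM named in the paper cited further down.

```python
PAM = "ATG"  # example 3-nucleotide motif; its placement here is an assumption


def find_targets(dna: str, spacer: str, pam: str = PAM) -> list[int]:
    """Return start positions where `spacer` occurs immediately after `pam`."""
    hits = []
    start = dna.find(spacer)
    while start != -1:
        # A match only counts if the PAM sits directly upstream of it.
        if start >= len(pam) and dna[start - len(pam):start] == pam:
            hits.append(start)
        start = dna.find(spacer, start + 1)
    return hits


spacer = "GATTACAGATTACA"                         # hypothetical stored fragment
virus = "CCCTT" + "ATG" + spacer + "GGTTAA"       # invader: PAM beside the match
own_genome = "CCCTT" + "GCC" + spacer + "GGTTAA"  # stored copy: no PAM beside it

print(find_targets(virus, spacer))       # [8]
print(find_targets(own_genome, spacer))  # []
```

The same fragment is present in both strings, but only the copy with a PAM next to it is flagged, which is how the toy bacterium avoids attacking its own stored copy.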
But where does it look? On the side of the double helix with the least information: the minor groove.
Look at the following http://pharmafactz.com/wp/wp-content/uploads/2014/11/watson-crick-base-pairing.jpg
It shows classic Watson-Crick base pairing. The major groove is a lot bigger than the minor groove, taking up 210 degrees (hardly a groove), and carries more chemical information. So binding to the major groove is likely to be far more accurate (as well as easier, because it’s a larger space).
So why does E. coli read the PAM from the minor groove? Because different viruses contain different PAM sequences. [ Nature vol. 530 pp. 499 – 503 ’16 ] reports the crystal structure of the E. coli Cascade complex (the business end of CRISPR) bound to a foreign double-stranded DNA target. The 5′ ATG PAM is recognized in duplex form, from the minor groove side, by 3 structural features in the Cse1 subunit of Cascade. The promiscuity inherent to minor groove DNA recognition explains how a single Cascade complex can respond to several distinct PAM sequences. This is a feature, not a bug.
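The payoff of that promiscuity can be sketched in a few lines of Python. This is a toy comparison, not a model of Cascade: a "strict" reader accepts one PAM, a "promiscuous" reader accepts a small set, and each is tried against three made-up viruses. Only ATG comes from the paper above; the second motif in the set (AAG), the sequences, and the function name is_targeted are placeholders chosen for illustration.

```python
def is_targeted(dna: str, spacer: str, accepted_pams: set[str]) -> bool:
    """True if `spacer` occurs in `dna` directly after any accepted PAM."""
    start = dna.find(spacer)
    while start != -1:
        # Assumes every accepted PAM is exactly 3 nucleotides long.
        if dna[max(start - 3, 0):start] in accepted_pams:
            return True
        start = dna.find(spacer, start + 1)
    return False


spacer = "GATTACAGATTACA"  # hypothetical stored fragment
viruses = {
    "virus_a": "CC" + "ATG" + spacer + "TT",
    "virus_b": "CC" + "AAG" + spacer + "TT",
    "virus_c": "CC" + "CCC" + spacer + "TT",
}

strict = {"ATG"}
promiscuous = {"ATG", "AAG"}  # AAG is a placeholder, not taken from the paper

for name, dna in viruses.items():
    print(name,
          is_targeted(dna, spacer, strict),
          is_targeted(dna, spacer, promiscuous))
```

The strict reader catches only virus_a, while the promiscuous one also catches virus_b. Neither catches virus_c, so tolerance for several PAMs widens the net without accepting just anything.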