We have applied NMR and molecular dynamics computations, including intensity-based refinement, to define the structure of the d(G-G-G-C-T4-G-G-G-C) dodecanucleotide in 100 mM NaCl solution. The G-G-G-C sequence is of interest since it has been found as tandem repeats in the DNA sequence of human chromosome 19. The same G-G-G-C sequence is also seen as islands in adeno-associated virus, a human parvovirus, which is unique amongst eukaryotic DNA viruses in its ability to integrate site-specifically into a defined region of human chromosome 19. The d(G-G-G-C-T4-G-G-G-C) sequence forms a quadruplex in Na cation-containing solution through head-to-tail dimerization of two symmetry-related stem-hairpin loops, with adjacent strands antiparallel to each other around the quadruplex. The connecting T4 loops are of the lateral type, resulting in a quadruplex structure containing two internal G·G·G·G tetrads flanked by G·C·G·C tetrads. The G(anti)·G(syn)·G(anti)·G(syn) tetrads are formed through dimerization-associated hydrogen bonding alignments of a pair of Hoogsteen G(anti)·G(syn) mismatch pairs, while the G(anti)·C(anti)·G(anti)·C(anti) tetrads are formed through dimerization-associated bifurcated hydrogen bonding alignments involving the major groove edges of a pair of Watson-Crick G·C base-pairs. The quadruplex contains two distinct narrow and two symmetric wide grooves, with extensive stacking between adjacent tetrad planes. The structure of the quadruplex contains internal cavities that can potentially accommodate Na cations positioned between adjacent tetrad planes. Three such Na cations have been modeled into the structure of the d(G-G-G-C-T4-G-G-G-C) quadruplex. Finally, we speculate on the potential role of quadruplex formation involving G·G·G·G and G·C·G·C tetrads during the integration of the adeno-associated parvovirus into its target on human chromosome 19, both of which involve stretches of G-G-G-C sequence elements.
All Science Journal Classification (ASJC) codes
- Structural Biology
- Molecular Biology
Keywords
- G-G-G-C repeat-containing DNA quadruplexes
- G·C·G·C tetrads
- G·G·G·G tetrads
- Potential Na cation coordination sites