In many single-stranded (ss) RNA viruses, the cis-acting packaging signal that confers selectivity genome packaging usually encompasses short structured RNA repeats. These structural units, termed repetitive structural motifs (RSMs), potentially mediate capsid assembly by specific RNA-protein interactions. However, general knowledge of the conservation and/or the diversity of RSMs in the positive-sense ssRNA coronaviruses (CoVs) is limited. By performing structural phylogenetic analysis, we identified a variety of RSMs in nearly all CoV genomic RNAs, which are exclusively located in the 5′-untranslated regions (UTRs) and/or in the inter-domain regions of poly-protein 1ab coding sequences in a lineage-specific manner. In all alpha- and beta-CoVs, except for Embecovirus spp, two to four copies of 5′-gUUYCGUc-3′ RSMs displaying conserved hexa-loop sequences were generally identified in Stem-loop 5 (SL5) located in the 5′-UTRs of genomic RNAs. In Embecovirus spp., however, two to eight copies of 5′-agc-3′/guAAu RSMs were found in the coding regions of non-structural protein (NSP) 3 and/or NSP15 in open reading frame (ORF) 1ab. In gamma- and delta-CoVs, other types of RSMs were found in several clustered structural elements in 5′-UTRs and/or ORF1ab. The identification of RSM-encompassing structural elements in all CoVs suggests that these RNA elements play fundamental roles in the life cycle of CoVs. In the recently emerged SARS-CoV-2, beta-CoV-specific RSMs are also found in its SL5, displaying two copies of 5′-gUUUCGUc-3′ motifs. However, multiple sequence alignment reveals that the majority of SARS-CoV-2 possesses a variant RSM harboring SL5b C241U, and intriguingly, several variations in the coding sequences of viral proteins, such as Nsp12 P323L, S protein D614G, and N protein R203K-G204R, are concurrently found with such variant RSM. In conclusion, the comprehensive exploration for RSMs reveals phylogenetic insights into the RNA structural elements in CoVs as a whole and provides a new perspective on variations currently found in SARS-CoV-2.
All Science Journal Classification (ASJC) codes