TY - JOUR
T1 - Structural phylogenetic analysis reveals lineage-specific RNA repetitive structural motifs in all coronaviruses and associated variations in SARS-CoV-2
AU - Chen, Shih Cheng
AU - Olsthoorn, René C.L.
AU - Yu, Chien Hung
N1 - Publisher Copyright:
© The Author(s) 2021. Published by Oxford University Press.
PY - 2021/1/1
Y1 - 2021/1/1
N2 - In many single-stranded (ss) RNA viruses, the cis-acting packaging signal that confers selectivity in genome packaging usually encompasses short structured RNA repeats. These structural units, termed repetitive structural motifs (RSMs), potentially mediate capsid assembly through specific RNA-protein interactions. However, general knowledge of the conservation and/or diversity of RSMs in the positive-sense ssRNA coronaviruses (CoVs) is limited. By performing structural phylogenetic analysis, we identified a variety of RSMs in nearly all CoV genomic RNAs, which are exclusively located in the 5′-untranslated regions (UTRs) and/or in the inter-domain regions of polyprotein 1ab coding sequences in a lineage-specific manner. In all alpha- and beta-CoVs, except for Embecovirus spp., two to four copies of 5′-gUUYCGUc-3′ RSMs displaying conserved hexa-loop sequences were generally identified in Stem-loop 5 (SL5) located in the 5′-UTRs of genomic RNAs. In Embecovirus spp., however, two to eight copies of 5′-agc-3′/guAAu RSMs were found in the coding regions of non-structural protein (NSP) 3 and/or NSP15 in open reading frame (ORF) 1ab. In gamma- and delta-CoVs, other types of RSMs were found in several clustered structural elements in 5′-UTRs and/or ORF1ab. The identification of RSM-encompassing structural elements in all CoVs suggests that these RNA elements play fundamental roles in the life cycle of CoVs. In the recently emerged SARS-CoV-2, beta-CoV-specific RSMs are also found in its SL5, displaying two copies of 5′-gUUUCGUc-3′ motifs. However, multiple sequence alignment reveals that the majority of SARS-CoV-2 sequences possess a variant RSM harboring SL5b C241U, and intriguingly, several variations in the coding sequences of viral proteins, such as Nsp12 P323L, S protein D614G, and N protein R203K-G204R, are found concurrently with this variant RSM. In conclusion, the comprehensive exploration of RSMs reveals phylogenetic insights into the RNA structural elements in CoVs as a whole and provides a new perspective on variations currently found in SARS-CoV-2.
AB - In many single-stranded (ss) RNA viruses, the cis-acting packaging signal that confers selectivity in genome packaging usually encompasses short structured RNA repeats. These structural units, termed repetitive structural motifs (RSMs), potentially mediate capsid assembly through specific RNA-protein interactions. However, general knowledge of the conservation and/or diversity of RSMs in the positive-sense ssRNA coronaviruses (CoVs) is limited. By performing structural phylogenetic analysis, we identified a variety of RSMs in nearly all CoV genomic RNAs, which are exclusively located in the 5′-untranslated regions (UTRs) and/or in the inter-domain regions of polyprotein 1ab coding sequences in a lineage-specific manner. In all alpha- and beta-CoVs, except for Embecovirus spp., two to four copies of 5′-gUUYCGUc-3′ RSMs displaying conserved hexa-loop sequences were generally identified in Stem-loop 5 (SL5) located in the 5′-UTRs of genomic RNAs. In Embecovirus spp., however, two to eight copies of 5′-agc-3′/guAAu RSMs were found in the coding regions of non-structural protein (NSP) 3 and/or NSP15 in open reading frame (ORF) 1ab. In gamma- and delta-CoVs, other types of RSMs were found in several clustered structural elements in 5′-UTRs and/or ORF1ab. The identification of RSM-encompassing structural elements in all CoVs suggests that these RNA elements play fundamental roles in the life cycle of CoVs. In the recently emerged SARS-CoV-2, beta-CoV-specific RSMs are also found in its SL5, displaying two copies of 5′-gUUUCGUc-3′ motifs. However, multiple sequence alignment reveals that the majority of SARS-CoV-2 sequences possess a variant RSM harboring SL5b C241U, and intriguingly, several variations in the coding sequences of viral proteins, such as Nsp12 P323L, S protein D614G, and N protein R203K-G204R, are found concurrently with this variant RSM. In conclusion, the comprehensive exploration of RSMs reveals phylogenetic insights into the RNA structural elements in CoVs as a whole and provides a new perspective on variations currently found in SARS-CoV-2.
UR - http://www.scopus.com/inward/record.url?scp=85110264150&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85110264150&partnerID=8YFLogxK
U2 - 10.1093/ve/veab021
DO - 10.1093/ve/veab021
M3 - Article
AN - SCOPUS:85110264150
VL - 7
JO - Virus Evolution
JF - Virus Evolution
SN - 2057-1577
IS - 1
M1 - veab021
ER -