ISSN: 2322-0066
Tahoora Mousavi1*, Monireh Golpour1, Reza Valadan1,2, Reza Alizadeh Navaei3, Mehryar Zargari4, Mehrdad Gholami5, Mohammadreza Haghshenas6
1Department of Molecular and Cell Biology, Mazandaran University of Medical Sciences,Sari, Iran
2Department of Immunology, Mazandaran University of Medical Sciences, Sari, Iran
3Department of Gastro Intestinal Cancer, Mazandaran University of Medical Sciences, Sari, Iran
4Department of Biochemistry, Genetic, Molecular and Cell Biology, Mazandaran University of Medical Sciences, Sari, Iran
5Department of Microbiology and Virology, Mazandaran University of Medical Sciences, Sari, Iran
6Department of Microbiology, Molecular and Cell Biology, Mazandaran University of Medical Sciences, Sari, Iran
Received: 21-Jan-2023, Manuscript No. JOB-23-87643; Editor assigned: 24-Jan-2023, PreQC No. JOB-23-87643 (PQ); Reviewed: 07-Feb-2023, QC No. JOB-23-87643; Revised: 21-Mar-2023, Manuscript No. JOB-23-87643; Published: 30-Mar-2023, DOI: 10.4172/2322-0066.11.1.006
Visit for more related articles at Research & Reviews: Research Journal of Biology
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is the causative agent of Coronavirus disease 2019 (COVID-19). The high mutation rate of RNA viruses drives genetic variation and virus evolution and serves as a strategy to escape the immune system. In the present study, all research and evidence were extracted from the available online databases. Two researchers randomly evaluated the sensitivity of the search. After quality assessment and application of specific inclusion and exclusion criteria, the eligible articles were entered into the meta-analysis. Heterogeneity between the results of the studies was measured using the Cochran's Q test statistic and the I² index. Forest plots illustrated the point and pooled estimates with 95% confidence intervals (crossed lines). All statistical analyses were performed using Comprehensive Meta-Analysis V.2 software. This meta-analysis included 13 primary studies investigating SARS-CoV-2 genetic variations and mutations in COVID-19 genomic sequences. According to the pooled prevalence (95% confidence interval) of mutations, spike gene variations showed the highest non-synonymous mutation frequency (16.4%, CI: 13.6, 16.6) and the Non-Structural Protein (NSP) genes had the highest mutation frequency among total mutations (31.6%, CI: 21, 44.6). Genomic mutation analysis of SARS-CoV-2 strains may provide knowledge about infrequent mutations and their relationship to viral transmission, pathogenicity, infectivity, and fatality rates.
Genetic Variation; SARS-CoV-2; Mutation; COVID-19 sequences
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019 (COVID-19), poses a major challenge to public health. Since the first appearance of SARS-CoV-2 in late December 2019 in Wuhan, Hubei province, central China, a high dissemination rate has been observed worldwide [1]. Based on information released by the World Health Organization (WHO) on 29 December 2020, the COVID-19 pandemic had caused nearly 79 million confirmed cases and over 1.7 million deaths worldwide. SARS-CoV-2 is classified in the order Nidovirales, the family Coronaviridae and the genus Betacoronavirus. Similar to other coronaviruses, the genome of SARS-CoV-2 consists of specific genes encoding structural and non-structural proteins. The mutation level among RNA viruses is notably high, and this phenomenon is essential for viral adaptation [2]. Coronaviruses, however, have been reported to possess proofreading systems, so nucleotide sequence variation in SARS-CoV-2 has been observed at a very low level. One study reported 13 variation sites in the Open Reading Frames (ORF) of SARS-CoV-2 in the 1a, 1b, S, 3a, M, 8 and N regions, among which positions nt28144 in ORF 8 and nt8782 in ORF 1a showed mutation rates of 30.53% and 29.47%, respectively [3]. In addition, based on evidence from a study of 48,635 SARS-CoV-2 sequences, 353,341 mutations have been detected throughout the world. Among them, the D614G mutation in the C-terminal of the spike protein (aspartate to glycine substitution at position 614) is one such evolutionary alteration detected in SARS-CoV-2 and has become the most common type reported in many regions of the world, such as Europe, Oceania, South America and Africa. The present study aimed to assess the prevalence of SARS-CoV-2 genetic variation and mutation in COVID-19 sequences [4].
Genetic diversity and mutations of the COVID-19
There are several reports of unusual public health events due to variants of SARS-CoV-2 that change transmissibility, clinical features, and severity. Table 1 lists the significant mutations in the world.
Name of variant or mutation | Time of mutation | Area | Location | Outcomes |
---|---|---|---|---|
D614G | Early February 2020 | China | Spike protein | D614G indicates greater transmissibility in humans rather than greater pathogenicity. D614G produces higher viral loads. The D614G variant is more susceptible to neutralizing antibodies and does not cause serious disease or alter the efficacy of vaccines. |
SARS-CoV-2 VOC 202012/01 or B.1.1.7 or 20B/501Y.V1 | December 2020 | UK and 31 other countries | RBD | Increased transmissibility. No change in disease severity. No evidence that this variant has any impact on the vaccine. The N501Y mutation is detected in B.1.1.7. |
B.1.351 or 501Y.V2 | 18 December 2020, 30 December 2020 | South Africa, other countries | RBD | Higher viral load. Increased transmissibility. Not associated with more severe disease or worse outcomes. The K417N mutation affects monoclonal and polyclonal antibodies. The N501Y, E484K and K417N mutations are detected in B.1.351. E484K makes the vaccine less effective against it. |
B.1.1.248 (P.1) or 501Y.V3 | January 2021 | Tokyo and 3 other countries (Brazilian) | RBD | This variant contains N501Y (more transmission), E484K (antibody escape) and K417N. Affects antibody production, vaccination or virus neutralization. |
N439K | March 2020, October 2020 | Wuhan, Europe, 12 countries | RBD | Escape from the immune system. Binds to human ACE2 more strongly than the original strain. N439K escapes from polyclonal and neutralizing antibody responses. |
A.EU1 | June 2020 | Spain, UK and 12 countries | Spike protein | It is not clear whether it increases the transmissibility of the virus. The A222V and A220V mutations are detected in A.EU1. The spike mutation A222V had a functional effect on the spike's ability to mediate cell entry. Less effective against vaccine. |
Cluster 5 | August and September 2020 | Denmark | Spike protein | Decreases the duration of immune protection following natural infection or vaccination. The Cluster 5 variant was identified in only 12 human cases and does not spread widely. Might affect vaccine development. |
RBD: Receptor Binding Domain
Table 1. The list of significant mutations in the world.
Search strategy
In the present study, the search was carried out in the available online databases, including ISI, ScienceDirect, Scopus, PubMed, Wiley and Google Scholar, for the period between December 2019 and March 2021. The search was based on the keywords SARS-CoV-2, variation, mutation and COVID-19 sequences, which were combined with the operators AND, OR and NOT to identify and screen articles [5]. In addition, the reference lists of the published studies were examined to improve the sensitivity of the search. The search was randomly evaluated by two researchers, who confirmed that all suitable studies had been detected [6,7].
Study selection
First, all research articles, evidence and reports were extracted from the electronic databases. Duplicate articles were then identified and removed. Next, irrelevant articles were excluded by reviewing the title, abstract, and full text. Finally, the remaining articles were screened for eligibility, and review articles and articles published in other languages were excluded from this study.
Quality assessment
The PRISMA checklist was used to evaluate the quality of the related studies and to select studies based on title and content [8]. The checklist consists of 27 items covering different aspects of research methodology, such as protocol and registration, eligibility criteria, search, study selection, definition of variables, method of data collection, risk of bias in individual studies, presentation of results and statistical tests. Each item was assigned one point [9].
Inclusion/Exclusion criteria
All articles approved by the above assessment phases were considered eligible for final meta-analysis:
•All English studies.
•Studies reporting the prevalence of SARS-CoV-2 genetic variation among total mutations.
•Studies reporting the prevalence of SARS-CoV-2 genetic variation among non-synonymous mutations.
The following studies were ruled out:
•Duplicated studies.
•Non-relevant articles.
•Articles without full-length sequences.
•Abstracts, letters or review studies.
•Studies published in languages other than English.
•Articles with no access to the full text.
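The criteria above can be read as a single eligibility filter applied to each candidate record. The following is a minimal sketch under stated assumptions: the `Record` fields and the `is_eligible` helper are hypothetical names introduced here for illustration and do not describe the tooling the authors used.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical description of one candidate article."""
    language: str
    is_duplicate: bool
    is_relevant: bool
    full_length_sequence: bool
    article_type: str          # e.g. "original", "review", "letter", "abstract"
    full_text_available: bool


# Publication types ruled out by the exclusion criteria.
EXCLUDED_TYPES = {"abstract", "letter", "review"}


def is_eligible(record: Record) -> bool:
    """Return True only if the record passes every inclusion/exclusion rule."""
    return (
        record.language.lower() == "english"
        and not record.is_duplicate
        and record.is_relevant
        and record.full_length_sequence
        and record.article_type.lower() not in EXCLUDED_TYPES
        and record.full_text_available
    )
```

Expressing the rules as one conjunction makes it easy to see that a record failing any single criterion is excluded.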
Data extraction
After selection of appropriate articles, the following data were extracted from each study: first author's name, geographical region, publication year, language, number of total mutations, number of non-synonymous mutations, and mutations in the S protein, N protein, M protein, E protein, ORF 1a/1b, ORF 3a, ORF 7a, ORF 7b, ORF8a, ORF 10a, ORF6, ORF 1a and NSP. The data were entered into a Microsoft Excel spreadsheet [10].
Statistical analysis
The primary outcome was the SARS-CoV-2 genetic variation and mutation in COVID-19 sequences. Heterogeneity between the results of the studies was measured using the Cochran's Q test statistic and the I² index. A P-value of less than 0.1 was taken to indicate significant heterogeneity. The forest plots illustrated the point and pooled estimates with 95% confidence intervals (crossed lines). Each box in a forest plot indicated the study's weight [11]. Random-effects models were used when the results were heterogeneous and fixed-effects models when they were homogeneous, and I² values of more than 50% were considered a high degree of heterogeneity. All statistical analyses were performed using Comprehensive Meta-Analysis V.2 software [12].
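To make the quantities named above concrete, the following is a minimal sketch of pooling proportions with Cochran's Q, I² and a random-effects estimate. It assumes a logit transform and the DerSimonian-Laird estimator of between-study variance, which are common choices but are not confirmed by the text as the settings used in Comprehensive Meta-Analysis; the function name `pool_proportions` and the returned keys are illustrative.

```python
import math


def _inverse_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportions(events, totals):
    """Random-effects pooling of proportions on the logit scale (assumed method)."""
    effects, variances = [], []
    for e, n in zip(events, totals):
        if e == 0 or e == n:
            # Continuity correction keeps the logit finite at 0% or 100%.
            e, n = e + 0.5, n + 1.0
        effects.append(math.log(e / (n - e)))
        variances.append(1.0 / e + 1.0 / (n - e))

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # I² is the share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau² to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled_logit = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return {
        "pooled": _inverse_logit(pooled_logit),
        "ci_low": _inverse_logit(pooled_logit - 1.96 * se),
        "ci_high": _inverse_logit(pooled_logit + 1.96 * se),
        "q": q,
        "i2": i2,
        "tau2": tau2,
    }
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled estimate.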
In the present study, 1370 articles were identified at the start of the process. The number of studies was reduced to 1209 after the removal of duplicate articles. In the next step, 890 irrelevant documents were removed after reviewing the full texts [13], leaving 319 articles for further screening. After the exclusion of 291 articles, 28 articles were assessed for eligibility. Six articles with non-full-length sequences, 8 review articles and one article in another language were excluded. Finally, 13 relevant articles were included in the meta-analysis (Figure 1). In addition, the geographic distribution and frequently mutated residues among COVID-19 sequences are shown in Table 2 and Figure 2, respectively, and the mutation frequencies of the included studies are listed in Tables 3 and 4.
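The selection counts reported above can be checked with simple arithmetic. The sketch below is illustrative only; the helper `prisma_flow` is a hypothetical name, and the number of duplicates (161) is not stated in the text but derived here as 1370 minus 1209.

```python
def prisma_flow(identified, steps):
    """Apply each (label, removed) step in order and record what remains."""
    remaining = identified
    trail = [("identified", remaining)]
    for label, removed in steps:
        remaining -= removed
        if remaining < 0:
            raise ValueError(f"step '{label}' removes more records than remain")
        trail.append((label, remaining))
    return trail


# Counts taken from the text; the duplicate count is derived, not reported.
STEPS = [
    ("duplicates removed", 1370 - 1209),
    ("irrelevant documents removed", 890),
    ("excluded during screening", 291),
    ("non-full-length sequence", 6),
    ("review articles", 8),
    ("other languages", 1),
]

flow = prisma_flow(1370, STEPS)
for label, remaining in flow:
    print(f"{label}: {remaining}")
```

The intermediate values (1209, 319, 28) and the final count of 13 included studies all follow from the reported exclusions.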
Strain name | Location | Mutation position |
---|---|---|
hCoV-19/Singapore | ORF 1ab; S; ORF3a; N; ORF8 | C8517T, T17459C (V5820A) T2449C (F817L) C176A (A59D) C595T (P199S) |
CGMH-CGU-02 (Taiwan) | ORF1ab; S; ORF8 | C8517T, A16577G (K5526R) C145T (H49Y), C2651T (S884F) |
hCoV-19/Los Angeles | ORF1ab; S | F924F, P4715L D614G |
hCoV-19/ Asia, Oceania, Europe, North America | ORF1ab; S; ORF9a; ORF3a; ORF8a | (1397 nsp2, 2891 nsp3, 14408 RdRp, 17746 and 17857 nsp13, 18060 nsp14), (23403, spike protein) (28881, nucleocapsid phosphoprotein) (nt 26143) (nt 28144) |
NCBI | ORF1ab; S; N | P4715L, L3606F D614G R203K/G204R, P13L, 203K/204R |
hCoV-19/GISAID | S | D614G, L5F, L8V/W, H49Y, Y145H/del, Q239K, V367F, G476S, V483A, V615I/F, A831V, D839Y/N/E, P1263L |
hCoV-19/Singapore | ORF7b; ORF8 | 382-nt deletion |
hCoV-19/Australia | ORF7b; ORF8 | 138-nt deletion |
hCoV-19/Bangladesh | ORF8 | 345-nt deletion |
hCoV-19/Spain | ORF8 | 62-nt deletion |
hCoV-19/Italy | ORF 1ab; M; ORF 7a; S; ORF 3a; N | (S443F, H3076Y, L3606F, P4715L, E5689D, R5919K) (D3G) (G70C) (A570D, D614G, G1046V) (G251V) (R203K-G204R, V246I) |
GISAID database | ORF 1ab; ORF 1a; ORF8; S; ORF 3a; N | (nsp12, nsp13, RdRp) (nsp2, nsp6) |
Bangladesh | S; N | I300F (nsp2), P4715L (nsp12), D614G R203K, G204R (N protein) |
Indian states | 2'-O-ribose methyltransferase; RNA-dependent RNA polymerase; Predicted phosphoesterase, papain-like proteinase; Transmembrane protein; NSP; 3'-to-5' exonuclease; ORF3a; ORF8; S; NP | N298L V871I, A88V, P314L P1103L, S1285F, S1197R, A994D, T1198K D279N, L37F. A380V, G339S, Q496P, S202N T372I, L177F L46F, Q57H L84S L54F, D614G P13L, S194L, RG203KR |
South American | ORF 1b; ORF 1a; ORF 3a; M; ORF 8; N | D614G, E1207V G392D, T708I, I739V, P765S, A876T, A1043V, N2894D, F3071Y, G3334S, L3606F Q57H, G196V, G251V T175M L84S D103Y, R191C, S197L, R203K, G204R, G238C |
GISAID database | 3’UTR; 5’UTR; N; ORF 8; M; ORF 3; S; ORF1ab | G204R-S194L, R203K, S202N L84S-Q57H D614G, A879S A1812D (nsp3), L3606F (nsp6), P4715L (RdRp) |
Northern Vietnam | S; ORF 3a; M; ORF 7a; ORF 7b; N | L54F, S254F, C1250F, D614G Q57H, G251V D3G, V70F S81L, L96F, L102_I103del R203K; G204R, S180I, A211V, Q283H Gly82_Val86del, Met85del (nsp1), T85I, G212D, 559V, P585S (nsp2) A58T, T428I, R646W, L672F, G730D, P1103L, K1186R, M1901I (nsp3), D477N (nsp4), G15S (nsp5), L37F (nsp6), D161V, P323L, V338F (nsp12) R595S(nsp13) V320L(nsp15) P134S, T140I (nsp16) |
South-East Asia | S; N; NSP | D614G R203K, G204R, P13L, Q57H, NS8_L84S, L37F, P323L A97V, T1198K |
Morocco | S | D614G |
Saudi Arabia | S; ORF; M; N | P97 L, T424I, C1313S, W553R, S950T, R700L, S191 P, S459T, V26 L, Q1009L S733R/E736 K, F1609 L, P1883S, M4574I, V51I, T649I, P777S, A1045 V, V1202I, E1835A, M2119I, P2742S,G3117E/V3120D, F5011I, R5027S/D5028Y/R5029 K, G5061R, H243C, H73N, 167 T127I W293C, A300V, V178A, S11 F, L283 F, G28 V, D242E, V263A, P7L, G198S, R292 P, S391I, G, L3785F, A5362 L, V117D, D2204A, V5551A, E102C, T5560S, V6030 F, T636I, G981S, V1375I, A1462 T, F3098 L/T3101N, F5072 L, I5559 M/T5560 F/G5561W/L5562 V/Y5563 V, I6668 M, V6431F, G85D/P86 F/T87N, K1255R, H1714Y, N4385 K, I4611 V, A5764S, M6272I, L6373R, L6958 P, V62F, G85D/P86F, A218S, T586S, P1099R, E1428 G, A1301 V, P1971S, S2111R, K5019 N, L6412 F, P86L, I147L T127I W293C, A300V, V178A, S11 F, L283 F, G28 V, D242E, V263A, P7L, G198S, R292 P, S391I |
GISAID database | ORF 1ab; ORF 1a; S; ORF 3a; M; ORF7a; ORF8; N | P4715L, Y232C, F1657L, A1906V, V1973L, G2374R D58E, L952P, E955K, S1498F, N1559T, A3203V, G4227R, A4297G, F4304L Y145del, N354D, D364Y, R416I, S438F, Y508H, D614GG, D3G, T175MA31T, Q57HGH, V88L, H93Y, G196V, G251VV, Q675H, T791I, F797C, A930V, I1216T, P1263L, V74F, S81L V62L, L84SS, L121H, T148I, S193I, S197L, R203KGR, G204RGR, I292T |
GISAID database | S; ORF 1b; ORF 3a; ORF8; N; ORF 1a; M; ORF6; ORF7a; ORF7b; ORF10 | D614G P214L G251V, Q57H L84S R203K, G204R |
NCBI database | ORF 1ab; S; ORF 3a; ORF 8; N | D75E, T265I, P971L,L3606F, P4715L, V5550L, P5828L, Y5865C, F6158L D614G Q57H, G251V S24L, V62L, L84S R203K, G204R |
GISAID database | ORF 1ab; S; M; ORF 3; N; ORF 8 | M4555T, T4847I, T5020I, V5661A, P5703L, M5865V, G3278S, K3353R, I6525T, Ter6668W, A876T, T1246I, S5932F, F3071Y, V483A D3G, T175M S197L, S202N, R203K, G204R S193I, S194L, S197L, S202N, R203K, G204R V62L, L84S |
Table 2. Geographic distribution of mutant variants of SARS-CoV-2.
First author | Language | Area of study | Non synonymous sample size | S% | N% | M% | ORF 1a/1b % | ORF 3a% | ORF 7a% | ORF 7b% | ORF 8a% | ORF 10a% | ORF6% | E% |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Gupta | English | GISAID | 47 | 27.7 | 14.89 | 4.25 | 12.8 | 12.8 | 4.25 | NA | 4.25 | NA | NA | NA |
Alessia | English | Italy | 159 | 11.9 | 3.77 | 2.51 | 70.4 | 7.54 | 1.88 | 0.62 | 0.62 | 0.62 | NA | NA |
Kumar | English | Indian | 4648 | 19.8 | 19.16 | NA | NA | 6.92 | NA | NA | 1.48 | NA | NA | NA |
Kim | English | GISAID | 1352 | 13.5 | 8.8 | 1.55 | NA | 5.76 | 2.58 | 0.59 | 2.36 | 0.81 | 1.4 | 0.88 |
Hasan | English | Bangladesh | 1602 | 15.6 | 36.14 | 0.811 | 39.5 | 3.3 | 2.18 | 0.06 | 1.87 | 0.18 | 0.2 | 0.12 |
Jin | English | Zhejiang | 37 | 27 | NA | 13.51 | NA | NA | NA | NA | NA | NA | NA | NA |
Laha | English | NCBI | 351 | 12.5 | 7.4 | 1.42 | 67 | 5.69 | 1.7 | 0 | 2.27 | 0.56 | 0.9 | 0.56 |
Islam | English | South-East Asia | 78 | 16.7 | 11.53 | 3.84 | NA | NA | NA | NA | NA | NA | NA | 1.2 |
Table 3. Frequency of mutations among non-synonymous mutations in the studies included in the meta-analysis.
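Meta-analysis software usually takes event counts rather than percentages as input. As a rough illustration, the percentages in Table 3 can be converted back to approximate counts. The sketch below assumes that the "sample size" column is the denominator of each percentage, which the text does not state explicitly; the names `TABLE3_S` and `approx_count` are introduced here for illustration, and the values are the S% column of Table 3.

```python
# (first author, non-synonymous sample size, S%) as listed in Table 3.
TABLE3_S = [
    ("Gupta", 47, 27.7),
    ("Alessia", 159, 11.9),
    ("Kumar", 4648, 19.8),
    ("Kim", 1352, 13.5),
    ("Hasan", 1602, 15.6),
    ("Jin", 37, 27.0),
    ("Laha", 351, 12.5),
    ("Islam", 78, 16.7),
]


def approx_count(sample_size, percent):
    """Approximate number of events implied by a reported percentage."""
    return round(sample_size * percent / 100.0)


# Assumed interpretation: count = sample size x percentage / 100.
s_counts = {author: approx_count(n, pct) for author, n, pct in TABLE3_S}
for author, count in s_counts.items():
    print(author, count)
```

Because the published percentages are rounded, the recovered counts are approximations and may differ slightly from the values in the primary studies.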
First author | Language | Area of study | Total mutation sample size | S% | N% | M% | ORF 3a% | ORF 7a% | ORF 7b% | NSP % |
---|---|---|---|---|---|---|---|---|---|---|
Biswas | English | GISAID | 504 | 16.26 | 7.14 | 1.78 | 3.96 | N/A | N/A | N/A |
Wang | English | GISAID | 4796 | 1.2 | 0.07 | 2.18 | 4.81 | 1.83 | 0.2 | 44.14 |
Nguyen | English | GISAID | 167 | 26.34 | 23.95 | 1.19 | N/A | 2.39 | 1.19 | 40.11 |
Utsav | English | GISAID | 273 | 15.38 | N/A | N/A | N/A | N/A | N/A | N/A |
Nguyen | English | GISAID | 171 | 25.73 | 23.39 | 1.16 | 4.67 | 2.33 | 1.16 | 41.52 |
Table 4. Frequency of mutations among total mutations in the studies included in the meta-analysis.
Analysis of mutations among non-synonymous mutation
In the current study, the prevalence of S, N, M, E, ORF 1a/1b, ORF 3a, ORF 7a, ORF 7b, ORF 8a, ORF 10a and ORF 6 mutations among non-synonymous mutations varied from 0.06% (ORF7b) to 70.44% (ORF 1a/1b). Among total mutations, the lowest and highest frequencies of S, N, M, ORF 3a, ORF 7a, ORF 7b and NSP mutations belonged to N (0.07%) and NSP (44.14%), respectively. In the 8 studies reviewed here, S, N, M, E, ORF 1a/1b, ORF 3a, ORF 7a, ORF 7b, ORF 8a, ORF 10a and ORF 6 mutations were assessed among non-synonymous mutations [14].
Analysis of S mutation
Our analysis revealed that the D614G spike mutation has the highest frequency. This mutation improved the fitness of the spike protein for cell surface receptors and increased the virus's transduction compared to the wild type. Other S mutations, P1263L, V483A, and L54F, have a low frequency [15]. The forest plot shows that the pooled frequency of S mutation is 16.4% (95% CI: 13.6, 16.6) based on the random-effects model, with heterogeneity among the primary results of the studies (I²=85.98%, Q=49.947, P<0.001) (Figure 3).
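The reported I² can be cross-checked against the reported Q statistic using the standard relation I² = 100 × (Q − df) / Q, where df is the number of studies minus one. The short sketch below assumes 8 studies, matching the number of rows in Table 3; the function name `i_squared` is illustrative.

```python
def i_squared(q, n_studies):
    """I² (in percent) from Cochran's Q and the number of pooled studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0


# Q for the S mutation as reported in the text; 8 studies is an assumption.
print(round(i_squared(49.947, 8), 2))
```

With these inputs the result agrees with the reported 85.98% to within rounding, which suggests that the statistic was computed over all 8 studies.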
Analysis of N mutation
Other frequent mutations are R203K and G204R, located in the N region. The N gene encodes the nucleocapsid protein, which contributes to the formation of helical ribonucleoproteins in the virus. These mutations modify the mRNA binding mechanism and change the pathogenesis and development of COVID-19 infection in subjects. Other mutations in the N region include S197L, P13L, L37F, P323L, and P1103L, which are less frequent [16,17]. The pooled prevalence of N mutation is estimated as 11.7% (95% CI: 7, 19.1) based on the random-effects model, with heterogeneity among the initial results of the studies (I²=98.23%, Q=396.15, P<0.001) (Figure 4).
Analysis of M mutation
The M protein plays a part in viral envelope packaging by interacting with the S protein. Our analysis revealed two low-frequency mutations, T175M and D3G, in the M gene. The pooled prevalence of M mutation is 1.9% (95% CI: 0.9, 4.1) based on the random-effects model, with heterogeneity among the results of these studies (I²=84.70%, Q=45.76, P<0.001) (Figure 5).
Analysis of ORF1a/1b mutation
ORF1ab is a large gene that encodes a polyprotein (16 proteins) involved in virus genome synthesis and replication. The P4715L, L3606F, C8517T, A876T and F3071Y mutations are the more frequent ones in ORF1ab. The pooled prevalence of ORF 1a/1b mutation is 12.8% (95% CI: 5.7, 26.4) based on the random-effects model, with heterogeneity among the results of the studies (I²=97.09%, Q=240.66, P<0.001) [18] (Figure 6).
Analysis of ORF3a mutation
Q57H, G251V, S193I, and G196V are the more frequent mutations in ORF3a. ORF3a proteins are located in host cells, in the endoplasmic reticulum or Golgi intermediate space, where they act as ion channels and control the release of the virus. Moreover, ORF3a triggers pro-inflammatory pathways and contributes to severe forms of infection [19]. It is noteworthy that the ORF3a gene shows a high level of non-synonymous and neutral mutations with a potential effect on B-cell epitope generation. The forest plot shows that the pooled prevalence of ORF 3a mutation among non-synonymous mutations is 5.7% (95% CI: 4.3, 7.6). The analysis demonstrated heterogeneity among the reported studies (I²=78.67%, Q=32.81, P<0.001) (Figure 7).
Analysis of ORF7a mutation
The forest plot shows that the pooled rate of ORF 7a mutation is 2.1% (95% CI: 1.3, 3.3), with heterogeneity among these studies (I²=60.03%, Q=17.51, P<0.014) [20] (Figure 8).
Analysis of ORF7b mutation
The forest plot shows the prevalence of ORF 7b mutation among non-synonymous mutations with 95% confidence intervals. The pooled frequency of ORF 7b mutation is estimated to be 0.4% (95% CI: 0.1, 1.4). We observed heterogeneity among these studies (I²=72%, Q=25, P<0.001) (Figure 9).
Analysis of ORF8a mutation
Across the non-synonymous mutation groups, the pooled rate of ORF 8a mutation is 1.8% (95% CI: 1.5, 2.1). Based on the fixed-effects model, there is no significant heterogeneity across these studies (I²=29.82%, Q=9.97, P<0.190) (Figure 10).
Analysis of ORF10a mutation
Given the heterogeneity between the results of the studies (I²=50.84%, Q=14.24, P<0.047), the pooled prevalence of ORF 10a mutation based on the random-effects model is 0.5% (95% CI: 0.2, 1) (Figure 11).
Analysis of ORF6 mutation
Given the heterogeneity for ORF 6 mutation (I²=74.18%, Q=27.11, P<0.001), the random-effects model was used, and the prevalence of the mutation is estimated as 0.7% (95% CI: 0.2, 1.7) (Figure 12).
Analysis of E mutation
The heterogeneity indices show heterogeneity between the primary results for E mutation (I²=56.68%, Q=16.16, P<0.024). Therefore, the random-effects model was applied to combine the results. The pooled event rate for E mutation is estimated as 0.4% (95% CI: 0.2, 1.1) (Figure 13).
Analysis of mutations among total mutation
In the current meta-analysis, 5 primary studies were reviewed in which S, N, M, ORF 3a, ORF 7a, ORF 7b and NSP mutations were examined among total mutations.
Analysis of S mutation: Based on the significant heterogeneity observed among the results (Q=45.6, P<0.001, I²=91.12%), the pooled event rate of S mutation under the random-effects model was estimated as 18.4% (95% CI: 13.7, 24.4) (Figure 14).
Analysis of ORF3a mutation: The forest plot indicates that the pooled frequency of ORF3a mutation is 3.9% (95% CI: 2.5, 6) based on the random-effects model, with heterogeneity among the primary results of the studies (I²=60.33%, Q=10.08, P<0.039) (Figure 15).
Analysis of NSP mutation: The forest plot shows that the pooled prevalence of NSP mutation among total mutations is 31.6% (95% CI: 21, 44.6). The analysis shows heterogeneity among the reported studies (P<0.001, I²=90.47%, Q=42) (Figure 16).
Analysis of N mutation: Because of the severe heterogeneity (P<0.001, I²=96.35%, Q=109.83), a random-effects meta-analysis was performed. The pooled prevalence of N mutation under the random-effects model is 10.5% (95% CI: 5.1, 20.4) (Figure 17).
Analysis of M mutation: The heterogeneity indices for the primary results for M were not statistically significant (I²=16.55%, Q=4.79, P<0.309). Therefore, using the fixed-effects model, the event rate for M mutation was estimated as 2.1% (95% CI: 1.7, 2.5) (Figure 18).
Analysis of ORF7a mutation: Moreover, there was no significant heterogeneity between the results of the primary studies for ORF 7a (I²=46.89%, Q=7.53, P<0.11). The pooled event rate for ORF7a was estimated at 1.8% (95% CI: 1.5, 2.2) (Figure 19).
Analysis of ORF7b mutation: Significant heterogeneity was observed between the results of the studies for ORF7b (I²=58.07%, Q=9.54, P<0.049). Therefore, the random-effects model was applied, which estimated the pooled event rate for this mutation as 0.4% (95% CI: 0.1, 1.2) (Figure 20).
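The choice between fixed-effects and random-effects models in the results above appears to follow the P<0.1 threshold for heterogeneity stated in the statistical analysis section. A minimal sketch of that decision rule is given below; the function name `choose_model` is hypothetical, and the P-values are those reported in the text, read as exact values rather than upper bounds.

```python
def choose_model(p_heterogeneity, alpha=0.1):
    """Pick the pooling model from the P-value of Cochran's Q test."""
    # Significant heterogeneity (P below alpha) calls for random effects.
    return "random" if p_heterogeneity < alpha else "fixed"


# P-values as reported for the individual analyses (assumed exact).
REPORTED_P = {
    "ORF8a (non-synonymous)": 0.190,
    "ORF10a (non-synonymous)": 0.047,
    "M (total)": 0.309,
    "ORF7a (total)": 0.11,
    "ORF7b (total)": 0.049,
}

models = {name: choose_model(p) for name, p in REPORTED_P.items()}
for name, model in models.items():
    print(name, model)
```

Under this rule, the analyses that the text pools with a fixed-effects model (ORF8a, M and ORF7a among total mutations) are exactly those with P of at least 0.1.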
Research on variation in the SARS-CoV-2 genome sequence is necessary for examining the course and progression of COVID-19 and for monitoring, controlling and treating SARS-CoV-2 infection. In the present study, the genome sequences of SARS-CoV-2 isolates were examined. This study aimed to assess the impact of epitope deletion among non-synonymous mutations, which is related to immune escape and pathogenesis. Our study showed that, according to the pooled prevalence (95% confidence interval) of mutations, S variation had the highest frequency among non-synonymous mutations, 16.4% (13.6, 16.6), and NSP was the most common mutation among total mutations, 31.6% (21, 44.6).
The high mutation rate of RNA viruses drives genetic variation and virus evolution and is a strategy for escaping the immune system and developing drug resistance. Complete SARS-CoV-2 genomes from different geographical locations are essential for detecting the genetic variations in the virus that cause viral shedding. Several genome variations in SARS-CoV-2, such as those in the nucleocapsid (N) protein, ORF4a and the surface (S) protein, are associated with the host immune system. Research indicates that genetic variants of SARS-CoV-2 were transmitted during the early stage of the epidemic; however, the genomes are remarkably stable and are not able to evolve rapidly.
It has been demonstrated that the fatality rate of COVID-19 varies between populations and that the level of virulence varies among humans. A larger number of specific mutations with rapid transmission has been detected in Italy, Spain and the US, and this is related to critical conditions. Although the genome sequences of SARS-CoV-2 are similar, with only a few mutations, regions such as North America and Europe have been heavily affected by sequence variation, whereas Australia, Asia and Africa have been less affected. Research shows that the variation of RNA viruses is pivotal during an outbreak and depends on nucleotide substitutions. Mutation rates vary between viruses according to their mode of transmission and help the virus adapt to the host.
Finding non-synonymous mutations through databases is useful for identifying mutations and their modes of transmission. Several new variants (deletion 69-70, deletion 144, N501Y, A570D, D614G, P681H, T716I, S982A, D1118H) have been defined in the spike protein of SARS-CoV-2. The novel mutation N501Y, which is found in the UK virus variant, is located in the Receptor Binding Domain (RBD). The severity and infectiousness of the UK variant remain unknown. D614G is a common spike protein mutation in SARS-CoV-2 viruses around the world. It has also been proposed that the high frequency of the spike D614G mutation may be associated with higher viral loads, cellular infectivity, infection severity and lethal outcome in COVID-19. The relation between high viral loads in the upper respiratory tract and the G clade has been measured by RT-PCR. It has been suggested that the G variant of the SARS-CoV-2 spike is more sensitive to neutralizing antibodies than the D variant. The D to G mutation at position 614 (D614G) in the spike glycoprotein, which originated in Europe or China, is reported to be a significant variation that changes the secondary structure of the protein. The D614G mutation appeared in all affected regions, such as Bangladesh (with 95.6% D614G mutation), Italy, Spain, North America and European countries, and amino acid substitutions at positions 1109 (F→L) and 76 (S_T76I) of the spike protein were found in Bangladeshi and Indonesian strains, respectively. It has also been suggested that mutations in the RNA-dependent RNA polymerase (RdRp) and D614G increase SARS-CoV-2 transmission and promote its infectivity. A study of 12,300 SARS-CoV-2 genome sequences from different countries reported that the D614G and P4715L variations were associated with higher COVID-19 mortality.
ORF1ab P4715L (nsp12) plays a pivotal role in viral replication, and the ORF1ab-V378I mutation is reported to be associated with COVID-19 infection in Taiwan, Australia and Germany. In addition, three mutations, M5865V and S5932F in ORF1ab and R203K in N, have been described. Mutations in the nucleocapsid (N) protein (R203K and G204R) have been observed in Italy, Spain, India and France, and the N_S202N mutant was detected in Saudi Arabia.
Our study shows that the D614G substitution in the S protein is the dominant variant in Asia, Oceania, Europe, North America, Italy, Morocco and Saudi Arabia and has led to severe respiratory infections and death in these regions. Genomic mutation analysis of SARS-CoV-2 strains may provide knowledge about infrequent mutations and their relationship to viral transmission, pathogenicity, infectivity, and fatality rates.
The authors declare there is no conflict of interest.