Validation Method of the Mathematical Model for SARS-CoV-2 Pandemic from Data Mining and Statistical Analysis


  • Rafael Pereira Santana
  • Anderson Lupo Nunes
  • Pedro Maia Salomone



Statistical analysis, Data mining, Epidemiology, Mathematical models, SARS-CoV-2, New coronavirus, Pandemic, Differential equations


In 2020, the world is experiencing the most dangerous global pandemic reported since the Spanish flu of 1918 to 1920. According to World Health Organization (WHO) records, the pandemic caused by the SARS-CoV-2 virus began in December 2019 and is present on all continents and in almost all countries, with more than 79 million infections and 1.7 million deaths by December 2020. Several mathematical models have been applied to epidemiology over time. One of the most widely adopted is the Susceptible-Infected-Recovered (SIR) model, developed by Kermack and McKendrick in 1927. In our research, we compiled a comprehensive collection of data on the SARS-CoV-2 pandemic from reports by the WHO, Dadax Limited (a Chinese data company), and Johns Hopkins University in the United States of America (USA). Data on the number of confirmed cases, recoveries, and deaths were collected for many countries. In this article, we construct a mathematical model that describes the evolution of the pandemic by analogy with models already adopted in the field of nuclear physics. To validate the mathematical model, we chose data from Germany because of the reliability of the available information. A statistical analysis was then carried out to assess the performance of the method and the predictive character of the mathematical model. To date, 11,716 raw data points have been collected, from which we mined the data relevant to this research.
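
For readers unfamiliar with the SIR model mentioned above, the following is a minimal sketch of its standard form, dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I, integrated with a simple forward Euler scheme. It is not the nuclear-physics-inspired model constructed in this article, and the function name `simulate_sir`, the rates `beta` and `gamma`, and the population figures are illustrative assumptions rather than values fitted to the German data.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, steps_per_day=100):
    """Integrate the classic SIR equations with forward Euler.

    beta  : transmission rate per day (assumed, not fitted)
    gamma : recovery rate per day (assumed, not fitted)
    Returns a list of (S, I, R) tuples, one per day, including day 0.
    """
    n = s0 + i0 + r0                      # total population, held constant
    h = 1.0 / steps_per_day               # Euler step size in days
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n   # flow S -> I
            recoveries = gamma * i              # flow I -> R
            s -= h * new_infections
            i += h * (new_infections - recoveries)
            r += h * recoveries
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical scenario: one million people, ten initially infected.
    result = simulate_sir(beta=0.3, gamma=0.1,
                          s0=999_990, i0=10, r0=0, days=160)
    peak_day = max(range(len(result)), key=lambda d: result[d][1])
    print("Peak of infections on day", peak_day)
```

Forward Euler is used here only to keep the sketch free of dependencies; a higher-order integrator would normally be preferred when comparing a model against reported case counts.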