Visualizing COVID-19
- Topic: Data visualization
- Programming language: R
- Packages: dplyr, ggplot2, readr
- Algorithms Used: -
- Project URL: GitHub
In December 2019, the COVID-19 coronavirus was first identified in the Wuhan region of China. By March 11, 2020, the World Health Organization (WHO) had categorized the COVID-19 outbreak as a pandemic. A lot happened in the months in between, with major outbreaks in Iran, South Korea, and Italy. We know that COVID-19 spreads through respiratory droplets, such as those produced by coughing, sneezing, or speaking. But how quickly did the virus spread across the globe? And can we see any effect from country-wide policies, like shutdowns and quarantines?
Fortunately, organizations around the world have been collecting data so that governments can monitor and learn from this pandemic. Notably, the Johns Hopkins University Center for Systems Science and Engineering created a publicly available data repository that consolidates this data from sources like the WHO, the Centers for Disease Control and Prevention (CDC), and the ministries of health of multiple countries.
The dataset is taken from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) coronavirus repository and covers cases from January 22 to March 17, 2020. It includes the date, country, type of case, number of daily cases, and many other attributes.
We used the readr, dplyr, and ggplot2 packages to read the data, manipulate data frames, and make plots in R.
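As a rough illustration of how the three packages fit together, here is a minimal sketch that reads a small CSV with readr, aggregates it with dplyr, and draws a line plot with ggplot2. The column names (date, country, type, cases) are assumptions based on the dataset description, and the rows are made-up toy values, not figures from the JHU CSSE repository. The project's actual code may be organized differently.

```r
library(readr)
library(dplyr)
library(ggplot2)

# Made-up toy rows for illustration only; NOT real JHU CSSE values.
# Column names are assumptions based on the dataset description.
csv_lines <- c(
  "date,country,type,cases",
  "2020-01-22,A,confirmed,0",
  "2020-01-22,B,confirmed,1",
  "2020-01-23,A,confirmed,2",
  "2020-01-23,B,confirmed,3",
  "2020-01-24,A,confirmed,5",
  "2020-01-24,B,confirmed,4"
)

# Write the toy data to a temporary file so read_csv() has something to read
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# readr: read the CSV with explicit column types
daily_cases <- read_csv(
  tmp,
  col_types = cols(
    date = col_date(),
    country = col_character(),
    type = col_character(),
    cases = col_double()
  )
)

# dplyr: sum daily confirmed cases across countries, then accumulate over time
cumulative <- daily_cases %>%
  filter(type == "confirmed") %>%
  group_by(date) %>%
  summarize(daily_total = sum(cases)) %>%
  mutate(cum_cases = cumsum(daily_total))

# ggplot2: cumulative confirmed cases over time
p <- ggplot(cumulative, aes(x = date, y = cum_cases)) +
  geom_line() +
  labs(x = "Date", y = "Cumulative confirmed cases")
```

Printing `p` (or calling `print(p)`) renders the plot. With the real data, the temporary file would be replaced by the path to the downloaded CSV.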
Even though the outbreak was first identified in China, only one East Asian country (South Korea) appears among the countries with the most confirmed cases. Four of the listed countries (France, Germany, Italy, and Spain) are in Europe and share borders.
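A ranking like this can be obtained by summing confirmed cases per country and sorting. The sketch below shows one way to do that with dplyr. The data frame, its column names, and the country labels A, B, and C are hypothetical placeholders with made-up values, so the output says nothing about the real dataset.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data; country labels and counts are made up for illustration
daily_cases <- tibble(
  country = c("A", "A", "B", "B", "C", "C"),
  type    = "confirmed",
  cases   = c(10, 20, 5, 5, 1, 2)
)

# Total confirmed cases per country, largest first, keeping the top two
top_countries <- daily_cases %>%
  filter(type == "confirmed") %>%
  group_by(country) %>%
  summarize(total_cases = sum(cases)) %>%
  arrange(desc(total_cases)) %>%
  head(2)

# Horizontal bar chart of the top countries, ordered by total cases
p_top <- ggplot(top_countries,
                aes(x = reorder(country, total_cases), y = total_cases)) +
  geom_col() +
  coord_flip() +
  labs(x = "Country", y = "Total confirmed cases")
```

Changing the argument of `head()` controls how many countries are kept. Whether the original analysis included or excluded China at this step is not stated here, so no such filter is applied in the sketch.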