In this lab, I aimed to create a data visualization of COVID-19 testing numbers (including the number of tests conducted, confirmed cases, deaths, and hospitalizations).
Three data visualization works inspired this project. My favorite is Interactive Data Visualization for exploring Coronavirus Spreads by B. Chen (Pic. 1), an interactive compound chart. As you can see below, the top chart shows the daily new cases and the bottom chart shows the total number of new cases for the selected area. To me, Chen's chart is an excellent example of a time-based visualization of COVID-19 data.
Bing's Covid-19 Tracker (Pic. 2) is another work I consulted. Although I ended up presenting the timeline differently in my lab work, the Covid-19 Tracker helped me think about how to present a timeline spanning one to two years.
The final work I consulted is the Lab 2 example posted by Professor Chris Sula (Pic. 3). This was my first time working with Tableau, and the example helped me understand what a Tableau worksheet and a complete Tableau project look like.
- Dataset: COVID-19 Outcomes by Testing Cohorts: Cases, Hospitalizations, and Deaths from NYC Open Data
- Tools: Google Sheets, Tableau
I started by searching for datasets related to the topic and settled on COVID-19 Outcomes by Testing Cohorts: Cases, Hospitalizations, and Deaths from NYC Open Data. The dataset shows outcomes (confirmed cases, hospitalizations, and deaths) for cohorts defined by the date of specimen collection. Here are two side notes on this step:
- I chose this dataset for two reasons: 1. Its size is manageable. (I first tried to fetch a larger dataset, but it took five hours to download and was impossible to open in OpenRefine.) 2. It is more up to date than the other datasets I found on the topic.
- In hindsight, this choice of dataset was not optimal. Read more in the Reflection section.
I then cleaned the data in Google Sheets and downloaded it as a CSV file to import into Tableau.
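The cleaning itself was done by hand in a spreadsheet, and the exact steps are not recorded here. As a rough illustration only, the same kind of cleanup could be scripted in Python. The column names (`specimen_date`, `tests`, `cases`, `deaths`) and the date format are assumptions for this sketch, not the dataset's confirmed schema.

```python
import csv
import io
from datetime import datetime


def clean_rows(csv_text):
    """Parse raw CSV text and return only the rows that are complete and well formed.

    Column names and the MM/DD/YYYY date format are hypothetical.
    """
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            day = datetime.strptime(row["specimen_date"].strip(), "%m/%d/%Y").date()
            # Remove thousands separators before converting counts to integers.
            counts = {
                key: int(row[key].replace(",", "").strip())
                for key in ("tests", "cases", "deaths")
            }
        except (KeyError, ValueError, AttributeError):
            continue  # skip rows with missing or malformed values
        cleaned.append({"date": day, **counts})
    return cleaned
```

Rows that fail to parse are dropped rather than repaired, which is a deliberate simplification; a real cleanup pass would probably log them for review.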
To start with Tableau, I first built a timeline-based visualization of the number of COVID-19 tests from March 2020 to March 2021. In this sheet, I set the time axis to the starting date of each week for two reasons:
1. I want to show the audience as much information as I can without overloading them.
2. Although a month-and-year combination would also work well, I could not find a way to show both the month and the year together in a logical way.
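The weekly grouping described above was done inside Tableau. Outside Tableau, the same idea can be sketched in a few lines of Python. This sketch assumes weeks start on Monday (Tableau's own week start depends on its locale settings), and the function names are hypothetical. The last helper shows one possible way to put month and year into a single label.

```python
from collections import defaultdict
from datetime import date, timedelta

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def week_start(day):
    """Return the Monday of the week containing `day` (assumed week start)."""
    return day - timedelta(days=day.weekday())


def weekly_totals(daily):
    """Sum (date, tests) pairs into {week_start_date: total_tests}, sorted by week."""
    totals = defaultdict(int)
    for day, tests in daily:
        totals[week_start(day)] += tests
    return dict(sorted(totals.items()))


def month_year_label(day):
    """Format a date as a combined label such as 'Mar 2020'."""
    return f"{MONTHS[day.month - 1]} {day.year}"
```

Weekly totals keep the chart to roughly 50 to 60 points per year, which matches the goal of showing detail without overloading the audience.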
The overall purpose of this chart is to show how the number of COVID-19 tests changed over the period. For that reason, I highlighted some key points with annotations. I also used a mark to show when the number of tests declined fastest (surprisingly, the last week of February 2021!).
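I found the fastest decline by reading the chart. A minimal sketch of how that week could be located programmatically is below. It assumes "fastest" means the largest absolute week-over-week drop (a percentage drop would be another reasonable definition), and the function name is hypothetical.

```python
def steepest_weekly_decline(weekly):
    """Return (week_label, change) for the largest week-over-week drop.

    `weekly` is a list of (week_label, tests) pairs in chronological order.
    Returns None when no week shows a decrease.
    """
    worst = None
    for (_, previous), (week, current) in zip(weekly, weekly[1:]):
        change = current - previous
        if change < 0 and (worst is None or change < worst[1]):
            worst = (week, change)
    return worst


# Made-up numbers for illustration only, not values from the dataset.
sample = [("W1", 100), ("W2", 140), ("W3", 90), ("W4", 80)]
print(steepest_weekly_decline(sample))
```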
Then I created a new worksheet with another graphic showing fluctuations in the number of tests over the same period. It provides extra detail that builds on my first worksheet.
I used my last worksheet to present the numbers of confirmed cases and deaths. I put them together because I want the audience to get a bigger picture of the relationship between confirmed cases and deaths while reading the infographic. To make the graphic more readable, I marked the highest and lowest numbers in this graph.
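The highest and lowest points were marked by hand in Tableau. For reference, a small sketch of how the same two points could be picked out of a series in Python (the function name is hypothetical, and ties resolve to the earliest entry):

```python
def extremes(series):
    """Return ((label, value) of the minimum, (label, value) of the maximum).

    `series` is a non-empty list of (label, value) pairs.
    """
    lowest = min(series, key=lambda item: item[1])
    highest = max(series, key=lambda item: item[1])
    return lowest, highest
```

Running it once on confirmed cases and once on deaths would give the four points to annotate.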
In the final step, I combined all three worksheets to create the dashboard you see below. I want my audience to see the number of COVID-19 tests at first sight and then explore the rest of the dashboard for extra information.
Choice of Data Set
As mentioned before, the dataset I chose for this lab is not optimal. The main reason I chose it was its size. However, most of the data is numerical, and the dataset contains no categories. For this reason, it is not a perfect fit for Tableau.
I do wonder: 1. What is the right size range for a CSV file? 2. Is there any way to manipulate my dataset to create categories?
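On the second question, one possible approach is to bin a numeric column into labeled groups. The sketch below is only an illustration of that idea: the function name, the labels, and the cutoff values are all assumptions, and sensible cutoffs would have to come from looking at the actual data.

```python
def volume_category(tests, low_cutoff, high_cutoff):
    """Map a numeric test count to a 'Low', 'Medium', or 'High' label.

    Values below `low_cutoff` are 'Low'; values from `low_cutoff` up to
    (but not including) `high_cutoff` are 'Medium'; the rest are 'High'.
    """
    if tests < low_cutoff:
        return "Low"
    if tests < high_cutoff:
        return "Medium"
    return "High"


# Hypothetical cutoffs and made-up counts, for illustration only.
labels = [volume_category(n, 100, 200) for n in (50, 100, 250)]
print(labels)
```

The resulting label column could be added to the CSV before importing it, and similar logic can likely be written as a calculated field inside Tableau.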
The purpose of my work in this lab is to tell a time-based story to the audience. Fortunately, according to my peer review, my partner found that my dashboard tells an interesting story. However, when I reviewed it later, I found several points to improve:
- I should have highlighted important numbers not only with paragraphs but also with font size.
- I wonder if I could do a better job of titling a number-based infographic.
- I wonder if there are other ways to organize different types of charts together to tell a better story.