Data Collection
Life in New York City is hectic, and that pace takes a toll on health. Day and night, ambulances run through the streets of New York. This made me wonder about the number of deaths in the city. I found an interesting dataset on the subject on NYC Open Data. It breaks down deaths by cause of death, ethnicity and gender. Before looking at the dataset, a few questions came to mind:
- What is the major cause of death in NYC?
- Which gender sees more deaths?
- What do the collected data suggest about future death rates?
To answer these questions and visualize the data, I used Microsoft Excel, OpenRefine and Tableau.
Clean-up Data
The data I got from NYC Open Data was fairly clean, but it still needed some work. I used OpenRefine to clean and transform it. First, I converted the year values into dates, since Tableau needs them in that format. A few causes were repeated, so I combined them. The gender labels were not consistent: the values were split across F, Female, M and Male, so I converted them all to Male and Female. The ethnicity labels were inconsistent as well. OpenRefine helped me a lot in making the data more consistent.
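For readers who prefer a script, the same clean-up steps can be sketched in Python. This is only an illustration, not the OpenRefine workflow used here. The column names `Sex` and `Year` and the helper names are assumptions, and the choice of January 1 as the date for each year is arbitrary.

```python
from datetime import date

# Assumed mapping: the write-up only says F/Female/M/Male were merged
# into two labels. Keys are lower-cased so matching ignores case.
SEX_LABELS = {"f": "Female", "female": "Female", "m": "Male", "male": "Male"}


def normalize_sex(value: str) -> str:
    """Map 'F', 'Female', 'M' or 'Male' to 'Female' or 'Male'.

    Unknown values are returned stripped but otherwise unchanged.
    """
    stripped = value.strip()
    return SEX_LABELS.get(stripped.lower(), stripped)


def year_to_date(year: str) -> str:
    """Turn a bare year such as '2007' into an ISO date (January 1)."""
    return date(int(year), 1, 1).isoformat()


def clean_row(row: dict) -> dict:
    """Return a copy of the row with consistent gender and date values.

    'Sex' and 'Year' are hypothetical column names.
    """
    cleaned = dict(row)
    cleaned["Sex"] = normalize_sex(row["Sex"])
    cleaned["Year"] = year_to_date(row["Year"])
    return cleaned
```

Calling `clean_row` on each record of the exported CSV would give rows with a single spelling per gender and a date that Tableau can parse.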
Inspiration
After cleaning the data, I looked at various death-related visualizations for inspiration, searching on Google and Pinterest. These examples helped me a lot in choosing chart types for the different kinds of data, as well as colour schemes. As in the following picture, a line graph would help me understand deaths by ethnicity and gender, with a different colour for each line to distinguish the ethnicities and genders.
After looking at the inspiration given below, I realized a bubble chart would be the best option for showing the major causes of death. The different bubble sizes show which causes account for more deaths and which for fewer.
Data Visualization Results
For the visualizations, I used Tableau Public, an online platform for turning data into graphs. My first visualization answers my first question: what is the major cause of death in NYC? I used a bubble chart built from the number of deaths and the causes of death in the data. It showed that heart-related diseases are the leading cause of death. The smaller the bubble, the fewer deaths that cause accounts for.
In the second visualization, I used a line graph to answer my question about deaths by gender. I built it from the number of deaths, gender and year fields in the data. It showed me that more women than men died.
Similarly, I used a line graph for deaths by ethnicity. The lines, shaded in a gradient of blues, showed that the White Non-Hispanic community had the largest number of deaths from 2007 to 2015.
As an addition, I made a heat map of the major causes of death by year. I used a gradient from blue to red, where blue is the minimum number of deaths and red is the maximum. It showed that across all these years, diseases of the heart and cancer accounted for the most deaths.
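The number behind each bubble is simply the total deaths per cause. Tableau computes this when the fields are dragged onto the chart, but a rough Python sketch of the same aggregation looks like this. The column names `Leading Cause` and `Deaths` are assumptions, and the sample rows in the usage note are made up purely for illustration.

```python
from collections import defaultdict


def deaths_by_cause(rows):
    """Sum deaths per cause and return (cause, total) pairs, largest first.

    'Leading Cause' and 'Deaths' are assumed column names. Every
    'Deaths' value is assumed to be a whole number stored as text.
    """
    totals = defaultdict(int)
    for row in rows:
        totals[row["Leading Cause"]] += int(row["Deaths"])
    # Sort so the biggest bubble (largest total) comes first.
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
```

With made-up rows such as `{"Leading Cause": "A", "Deaths": "5"}`, `{"Leading Cause": "B", "Deaths": "2"}` and `{"Leading Cause": "A", "Deaths": "1"}`, the function returns `[("A", 6), ("B", 2)]`, so cause A would get the larger bubble.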
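The idea behind the heat map can be sketched in Python too: total the deaths for every cause and year, then rescale each total to a value between 0 (the blue end) and 1 (the red end). This is a simplified illustration under assumed column names (`Leading Cause`, `Year`, `Deaths`), not a description of how Tableau assigns colours internally.

```python
def heatmap_grid(rows):
    """Build a cause-by-year grid of intensities between 0.0 and 1.0.

    Returns (years, grid), where grid maps each cause to a list of
    intensities in the same order as years. Column names are assumed.
    Cause/year pairs with no rows count as 0 deaths.
    """
    years = sorted({row["Year"] for row in rows})
    causes = sorted({row["Leading Cause"] for row in rows})

    totals = {(cause, year): 0 for cause in causes for year in years}
    for row in rows:
        totals[(row["Leading Cause"], row["Year"])] += int(row["Deaths"])

    if not totals:
        return years, {}

    lowest = min(totals.values())
    highest = max(totals.values())
    span = (highest - lowest) or 1  # avoid dividing by zero when all cells match

    grid = {
        cause: [(totals[(cause, year)] - lowest) / span for year in years]
        for cause in causes
    }
    return years, grid
```

A cell with intensity 0.0 would be drawn in the coolest blue and a cell with 1.0 in the strongest red, with everything else falling in between.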
Reflection
Through this process I learned more about reading data. Tableau Public was entirely new to me. Previously I thought the graphs in papers and magazines were made with Excel spreadsheets, but Tableau makes the whole process very easy. The first big question for me was whether the dataset I had chosen was relevant. After that, understanding the data was quite a task: I did not understand what a few of the columns meant, but thankfully the most relevant ones were easy to understand. That made building and interpreting the visualizations in Tableau easy, and the options Tableau offers for working with data keep the process simple. I think I could have made a wider range of visualizations with more datasets on this topic. Overall, the experience was great and I learned a lot.
The visualizations are available on Tableau at the following link: