This project highlights the rates of the Seven Major Felony Offenses in New York City between 2000 and 2019. I focus on changes in the violent offense rate during this period. Using charts and graphs, I present the decrease or increase in the offense rate across eight selected categories of criminal offenses (assault, burglary, felony, rape, robbery, murder & non-negligent manslaughter, offenses against the person (not specific), and harassment). The records come from the NYPD dataset on the department’s official website and were retrieved from 76 different precincts located across the NYC boroughs (Kings, Queens, Bronx, Manhattan, and Staten Island).
Now, as a result of this project, we will be able to drill down and ask questions like:
Why is a specific Offense happening at this zip code?
Is there a way to alleviate these Offenses?
Is the precinct in that zip code competent enough? Are they providing the correct solution to the problem?
Although these questions will not be answered directly in the project, I think it is vital for us to look at these questions after reviewing the data presented.
1. Dataset from NYPD’s official website.
2. OpenRefine to clean/refine the dataset.
3. Excel for the last-minute purging of unwanted data.
4. Microsoft Word for writing the full presentation and making edits before adding it to the student workspace, “studentwork.prattsi.org.”
5. Book: Wilke, Fundamentals of Data Visualization, Chapter 21 “Multi-Panel Figures,” Chapter 22 “Titles, Captions, and Tables,” and Chapter 23 “Balance the Data and the Context.”
6. Tableau to present my dataset visually in a clear and meaningful way.
To complete this project, I used the tools mentioned above to accomplish my goal, which is to highlight the Seven Major Felony Offenses over the years across all five boroughs of New York City, as seen below. The first and most important tool is the official NYPD website, where I gathered the dataset needed for my visual presentation. The dataset was obtained from the offense statistics section of the NYPD website, or via this direct link: https://www1.nyc.gov/site/nypd/stats/Offense-statistics/historical.page.
The website gives the option to download open-source information in several file formats, such as CSV and Excel. After retrieving the dataset and the image map for the NYPD precincts (GeoJSON) from https://www1.nyc.gov/site/planning/data-maps/open-data/districts-download-metadata.page, I used OpenRefine, formerly called Google Refine, to clean up the dataset and bring more clarity to the information presented.
OpenRefine is a great tool for cleaning up “messy data.” It can transform a dataset from one format into another and can even extend the dataset with web services and external databases. However, my purpose for using this powerful tool was to organize my data into a more comprehensible format. I did this by arranging the rows and columns and deleting the non-vital information captured in the original dataset retrieved from the NYPD’s official website. The information needed for this project was the offense/felony, the precinct number, the year, and the number of times a specific offense occurred. It’s great to know that OpenRefine keeps your data private on your computer until you want to share or collaborate. OpenRefine links: https://openrefine.org/ and https://youtu.be/B70J_H_zAWM
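The cleanup itself was done interactively in OpenRefine, not in code. Purely as an illustration of the same idea (trim stray whitespace, drop empty rows, keep only the four fields named above), here is a minimal Python sketch. The column names, the `Notes` column, and the sample rows are assumptions made up for this example; the real NYPD export is laid out differently.

```python
import csv
import io

# Hypothetical raw export: the column names and the sample rows below are
# invented for illustration and do not match the real NYPD file layout.
RAW_CSV = """Precinct , Offense ,Year,Count,Notes
1, BURGLARY ,2000,475,see footnote
1,BURGLARY,2019,66,
,,,,
22,  MURDER & NON-NEGL. MANSLAUGHTER,2019,0,n/a
"""

# The four fields the project keeps: precinct, offense, year, and count.
KEEP = ["Precinct", "Offense", "Year", "Count"]


def clean_rows(raw_text):
    """Trim whitespace, drop blank rows, and keep only the KEEP columns."""
    reader = csv.reader(io.StringIO(raw_text))
    header = [name.strip() for name in next(reader)]
    index = {name: header.index(name) for name in KEEP}
    cleaned = []
    for row in reader:
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip rows that are entirely empty
        cleaned.append({
            "Precinct": int(cells[index["Precinct"]]),
            "Offense": cells[index["Offense"]].title(),
            "Year": int(cells[index["Year"]]),
            "Count": int(cells[index["Count"]]),
        })
    return cleaned


rows = clean_rows(RAW_CSV)
for row in rows:
    print(row)
```

Each cleaned row is a small dictionary with exactly the four kept fields, which is roughly the shape a visualization tool needs for one row per precinct, offense, and year.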
After using OpenRefine, I used the popular tool Excel to catch and adjust minor oversights, such as deleting unwanted columns and rows, and to do any other minor purging that was needed.
Tableau is a great way to analyze data, offering quick real-time analysis, data blending, and data collaboration, and it is also a great way to translate queries into visualizations. I used Tableau for my project because of its ability to generate dashboards and worksheets and to include my precinct maps. I wanted to provide actionable analytical insights for my dataset, which Tableau allows me to do by simply dragging and dropping the information I need to visualize into Tableau’s structural palettes, for example into the rows and columns of its template.
As a direct result of my methodology, visualizing my dataset was made easy. I started by getting the information needed, a report on the “Seven Major Felony Offenses in New York City,” directly from the source, the NYPD’s website, which houses open-source datasets for public use. I focused on the information needed to present a great visualization of the dataset. The first item pulled from the website was the names of the precincts, which are identified by numbers. Having the name/number of the precinct helps tremendously because it aids the audience’s visual clarity when linked with the other tools mentioned. The names of the offenses were of great importance, as my interpreters (my audience) need to know the offense names so that they can fully follow the presentation. I also pulled the year and the number of occurrences for each offense, which add more meaning to the interpretation of the data. After downloading the CSV file from the NYPD’s website, I loaded the file into OpenRefine, which allowed me to clean up my dataset by removing unwanted text, spaces, rows, and columns. I then used Excel for minor purging, ensuring that the file was fully ready for Tableau. Once all this was finished, I imported both the Excel file and the precinct map image file into Tableau. Tableau was the main tool for this project; it provides the visuals that explain the dataset used. Using Tableau as your main database set
Result set – Click any graph or chart for Interactive interface
I used Tableau as my primary tool for presenting the dataset, and I was able to present and explain my results visually, as I will walk through below. I will explain only two of the Seven Major Felony Offenses reported (Burglary and Murder & Non-Negligent Manslaughter). The other five offenses can be retrieved the same way.
Let’s start with the mapping. This map is based on the longitude and latitude generated from the image map file retrieved from the NYPD website. The color shows details about the precincts, accompanied by labels with the precinct numbers. Having the labeled map helps tremendously with the geographical visualization of the dataset being presented.
Offense by Precinct and Year.
In Fig2, Fig2i, and Fig2ia, you will see the total count for each offense broken down by precinct and across all years. The color shows details about the offense, and the data is filtered by year, from 2000 to 2019. The view is also filtered on precinct. I also included the total number of occurrences for each offense over the nineteen years. The filters by year and offense are available if the user wants to see the count for a specific year and offense. Fig2 shows that Harassment ranked the highest over the past 19 years with 1,422,917 total cases, while Murder & Non-Negligent Manslaughter ranked the lowest over the same period with 9,432 cases. From this cumulative visual report, Fig2i shows that over the nineteen years the 43rd Precinct, located in the Bronx (900 Fteley Ave, The Bronx, NY 10473), had the highest number of Harassment cases (38,3313), in comparison to the 688 cases recorded at the 22nd Precinct, located in Central Park (100 86th St Transverse, New York, NY 10128), as shown in Fig2ia.
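The totals in these figures were produced by Tableau. For readers who want to see the underlying arithmetic, the sketch below shows one way to sum counts per precinct for a single offense and then pick out the highest and lowest precincts. The record layout and every number in `records` are invented for illustration; they are not the NYPD figures quoted above.

```python
from collections import defaultdict

# Hypothetical cleaned records as (precinct, offense, year, count) tuples.
# The counts are made up and are NOT the real NYPD numbers.
records = [
    (43, "Harassment", 2018, 120),
    (43, "Harassment", 2019, 130),
    (22, "Harassment", 2018, 3),
    (22, "Harassment", 2019, 2),
    (75, "Murder & Non-Negligent Manslaughter", 2019, 11),
    (22, "Murder & Non-Negligent Manslaughter", 2019, 0),
]


def totals_by_precinct(rows, offense):
    """Sum the counts for one offense across all years, per precinct."""
    totals = defaultdict(int)
    for precinct, name, _year, count in rows:
        if name == offense:
            totals[precinct] += count
    return dict(totals)


def highest_and_lowest(totals):
    """Return the (precinct, total) pairs with the largest and smallest totals."""
    highest = max(totals.items(), key=lambda item: item[1])
    lowest = min(totals.items(), key=lambda item: item[1])
    return highest, lowest


harassment = totals_by_precinct(records, "Harassment")
top, bottom = highest_and_lowest(harassment)
print("Highest:", top)
print("Lowest:", bottom)
```

Restricting `rows` to a single year before calling `totals_by_precinct` would mimic the year filter described above.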
Let us look at Murder & Non-Negligent Manslaughter:
Fig2ii shows that the 75th Precinct, located in East New York, Brooklyn (1000 Sutter Avenue), has the highest number of occurrences (482), while the 22nd Precinct again ranks the lowest with only 4 cases. What could be the cause of this? Could it be the ethnicity and diversity of the population?
Offense Trend by Precinct and Year
Let us look at the offense rate trend by individual precinct over the years. The line graph helps us drill down into the dataset to see which precinct has the most offenses and whether the count is increasing or decreasing. The line graph can also show multiple offenses and precincts simultaneously if a comparison is needed. Fig3 shows the burglary rate for the 1st Precinct, located at 16 Ericsson Pl, Manhattan, NY; the graph shows a tremendous decrease since 2000. Although there are fluctuations in the burglary trend, it has decreased steadily over the years, from 475 cases in 2000 to 66 cases in 2019.
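To put a number on that decrease, the relative change between the two endpoint counts quoted above (475 and 66) can be worked out directly. This is a small sketch of the standard percent-change formula; the helper name `percent_change` is my own and is not part of any tool used in the project.

```python
def percent_change(start, end):
    """Relative change from start to end, expressed as a percentage."""
    if start == 0:
        raise ValueError("percent change is undefined when the starting value is 0")
    return (end - start) / start * 100


# Burglary counts for the 1st Precinct as reported in the text: 2000 and 2019.
change = percent_change(475, 66)
print(f"{change:.1f}%")  # prints "-86.1%"
```

In other words, the 2019 count is roughly 86% lower than the 2000 count. This compares only the two endpoints and says nothing about the fluctuations in between.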
Click Dashboard for Interactive interface
Although I was limited by the word count (max 1,000), I was able to work with the dataset to get my points across efficiently and effectively. Choosing the topic “Seven Major Felony Offenses in New York City” was very insightful and educational. Using the different tools to accomplish the outcome was a great learning experience for me. Two of the four tools, OpenRefine and Tableau, were my first full encounters. I appreciate OpenRefine because it allowed me to clean up my dataset. The original data pulled from the NYPD’s website would be classified as messy data (open-source data). OpenRefine allowed me to strip the data down, keeping only the information vital for my research and visual presentation. Tableau proved to me that it is one of the best tools for visual presentation. I was able to use charts and graphs to explain my data visually. Analyzing a dataset is only one aspect of presenting a dataset or a statistical report; a clear visual presentation helps the reader quickly understand what the data means. It is often said that the human brain responds better to visual presentation than to plain text, and visuals allow the reader to take in the material with more ease. As for the limitation with Tableau and my project, I wish I had had more time to navigate Tableau and see how much better I could have explained my dataset visually. The other two tools used were Excel and Microsoft Word, which need no introduction.
In the future, I will be better able to analyze my visual data presentation because I will have gained more experience with Tableau and OpenRefine. I would also have included more detailed findings and recommendations had there been no word count limitation. In conclusion, I have learned a lot about the Seven Major Felony Offenses and their trends, and I analyzed for myself why specific areas have a higher rate than others (see Fig4). Conclusions were drawn based on socioeconomic conditions, such as income and ethnicity. The felony rates in Brooklyn, Queens, the Bronx, and Staten Island are higher than in most parts of Manhattan. Brooklyn, Queens, the Bronx, and Staten Island are more diverse in their ethnic populations, while most parts of Manhattan are not. Overall, I thoroughly enjoyed working on this presentation.
- NYPD’s official website
- Wilke, Fundamentals of Data Visualization, Chapters 21, 22, 23