Eviction rates in New York City


Visualization

Introduction

New York City is one of the most popular cities in the world, with residents from all walks of life building their lives in it. It is a favored location for many to work, settle and study, given the plethora of opportunities that the city offers. Evictions are a very real part of the housing scene in NYC and occur regularly for a variety of reasons, the most common being the inability of tenants to pay their rent on time. This project aims to visualize the rates and distribution of evictions across NYC and examine eviction trends both over time and across areas. Additionally, the relationship between eviction rates and selected factors that may affect them, such as income, has been visualized and examined.

Method

Dataset:

I used two datasets to create the visualizations in this project. The first was the Evictions dataset, which provides data on evictions in NYC, including eviction dates and locations. The second was the Neighborhood Financial Health Digital Mapping and Data Tool dataset, which contains data on median income, racial distribution and poverty rates. Both datasets were obtained from the NYC Open Data website, an online platform for free public data published by New York City agencies and other partners.

Software:

I used R and Microsoft Excel to clean the data and consolidate data points at a later stage in the project. All visualizations in this project were created using Tableau Public. 

Process:

I downloaded both datasets as CSV files and loaded them into R to inspect them and clean the data where necessary. Apart from a few null values, the datasets were very clean and did not need any further modification. After this initial review, I loaded the datasets into Tableau and connected them to each other using the columns specifying borough name in each dataset. Some issues occurred during the initial stages of visualizing the data because the datasets were either not connected accurately or not connected at all. However, I was able to rectify these issues fairly easily with troubleshooting. Once the data was ready, I grouped related visualizations together and created multiple dashboards displaying the various aspects of eviction rates that I wanted to visualize. I also created a new dataset with core data points retrieved from the two source datasets, in order to simplify the visualization process when combining data points from both. I then conducted user testing sessions on each of the dashboards to evaluate how well they represented the information I intended to convey. These sessions led to design changes in some of the dashboards, ending with a final version of each.
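The cleaning and joining steps described above could be sketched in Python roughly as follows. This is only an illustration, not the R and Tableau workflow actually used for the project. The column names (BOROUGH, Executed Date, Residential/Commercial, Borough, MedianIncome) and all sample values are assumptions made up for the example. Normalizing the case of the borough name before matching is likewise an assumed safeguard, not a fix reported in this project.

```python
import csv
import io

# Tiny made-up samples standing in for the two downloaded CSV files.
# Column names and values are illustrative assumptions, not real data.
EVICTIONS_CSV = """BOROUGH,Executed Date,Residential/Commercial
BRONX,01/15/2019,Residential
BROOKLYN,02/03/2019,Residential
,03/10/2019,Commercial
MANHATTAN,04/22/2019,Commercial
"""

HEALTH_CSV = """Borough,MedianIncome
Bronx,100
Brooklyn,200
Manhattan,300
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def drop_rows_missing(rows, column):
    """Keep only rows where `column` is present and non-blank."""
    return [r for r in rows if (r.get(column) or "").strip()]


def join_on_borough(evictions, health, left_key="BOROUGH", right_key="Borough"):
    """Inner-join eviction rows to borough-level rows on a case-normalized name."""
    lookup = {r[right_key].strip().upper(): r for r in health}
    joined = []
    for row in evictions:
        match = lookup.get(row[left_key].strip().upper())
        if match is not None:
            merged = dict(row)
            # Copy borough-level fields, skipping the duplicate key column.
            merged.update({k: v for k, v in match.items() if k != right_key})
            joined.append(merged)
    return joined


clean_evictions = drop_rows_missing(read_rows(EVICTIONS_CSV), "BOROUGH")
consolidated = join_on_borough(clean_evictions, read_rows(HEALTH_CSV))
```

Comparing the number of rows before and after the join is a quick way to notice when the two datasets are not connecting as expected.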

User Testing:

I conducted two user testing sessions in order to evaluate the effectiveness and readability of the visualizations created. Both sessions took approximately 15 minutes to complete, and were conducted in person. As part of these sessions, I simply displayed each dashboard to the participants and asked them to explain what they felt that the dashboard was conveying. I offered no explanation as to what my topic was or what each dashboard was attempting to convey before I received feedback, in order to get an unbiased and raw opinion. These sessions proved to be very informative and provided me with many suggestions for improvement. I carefully considered each suggestion and implemented the ones that I felt worked best. The final visualizations are an amalgamation of my own design choices and the feedback from my user testing sessions.

Design Rationale

For better understanding, I have divided the explanation of my design rationale according to each individual dashboard.

Dashboard 1: Eviction rates in NYC

View dashboard here

This dashboard contains a map representing the distribution of evictions across NYC, along with a comparison of the eviction rates between different boroughs. My first choice to visualize the distribution of evictions was to use a map with markers colored according to the borough, and I stuck with this choice for the final visualization. I chose a dark background for the map in order to allow the markers to really stand out, and I achieved this effect easily due to the bright colors of the markers. The default markers that were used in the first visualization were solid circles, which made it difficult to see the overlap between markers and therefore reduced the ability to discern density of evictions in certain locations (Figure 1).

Figure 1

In order to remedy this, I changed the marker settings to ‘shapes’ instead of ‘circles’, so that the markers would not be solid shapes and the overlap would be a little clearer (Figure 2).

Figure 2

This change made the density of evictions more discernible as intended. The map shows that eviction occurrences are concentrated in certain parts of lower and midtown Manhattan and one area in Brooklyn, and are very sparse in Staten Island. Other than these areas, eviction occurrences seem to be well spread out.

The second visualization in this dashboard displays the number of evictions in the city by borough, to see if there is a higher occurrence of evictions in any particular borough in comparison with the others. I tried a heatmap to represent this data, but felt that it did not accurately represent minor differences in the number of evictions (Figure 3).

Figure 3

I attempted changing the colors according to the borough to see if that reflected the data better. This seemed to help a little, but I still did not find it clear enough (Figure 4).

Figure 4

My final option for this visualization was to create a bar chart representing the data, with the bars colored according to the borough (Figure 5).

Figure 5

This was the simplest option but also the clearest and most readable, which is why it became the final visualization of this aspect of the eviction data. The chart shows that the Bronx has the highest number of evictions so far, closely followed by Brooklyn. Queens is next, with a significantly lower number of evictions, followed by Manhattan. Staten Island has the lowest number of evictions, almost a tenth of the number in the Bronx.
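The totals behind a bar chart of this kind could be computed with a short Python sketch such as the one below. The BOROUGH column name and the sample rows are assumptions for illustration only; the real counts come from the Evictions dataset and are not reproduced here.

```python
from collections import Counter

# Made-up rows; only the borough column matters for this count.
SAMPLE_ROWS = [
    {"BOROUGH": "BRONX"},
    {"BOROUGH": "BRONX"},
    {"BOROUGH": "BRONX"},
    {"BOROUGH": "BROOKLYN"},
    {"BOROUGH": "BROOKLYN"},
    {"BOROUGH": "STATEN ISLAND"},
]


def evictions_by_borough(rows, borough_column="BOROUGH"):
    """Return (borough, count) pairs sorted from most to fewest evictions."""
    counts = Counter(
        r[borough_column].strip().upper()
        for r in rows
        if (r.get(borough_column) or "").strip()
    )
    return counts.most_common()


def lowest_to_highest_ratio(totals):
    """Ratio of the smallest borough total to the largest one."""
    highest = totals[0][1]
    lowest = totals[-1][1]
    return lowest / highest


totals = evictions_by_borough(SAMPLE_ROWS)
ratio = lowest_to_highest_ratio(totals)
```

A ratio like this puts a number on comparisons such as "almost a tenth", which are otherwise read off the bar heights by eye.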

Feedback from user testing sessions: Both participants understood what the dashboard was conveying easily, and had no issues interpreting the visualizations.

Dashboard 2: Residential vs Commercial Eviction Rates in NYC

View dashboard here

This dashboard displays the distribution of commercial and residential evictions across NYC, along with a comparison between the number of commercial and residential evictions according to the borough. As with the first dashboard, I chose to use a map to represent the distribution of evictions across New York City with markers colored according to the type of property (commercial or residential) (Figure 6). Due to my experience with the first map, I selected ‘shape’ markers for this visualization for the very first version and did not make any further changes. This decision was especially impactful in this case since there was an obvious imbalance between the number of commercial evictions and the number of residential evictions, with the number of commercial evictions being far fewer. If the markers had been solid shapes, the markers representing commercial evictions may not have been visible at all in some areas. 

Figure 6

This map shows that residential evictions far outnumber commercial evictions. Commercial evictions do not seem to be concentrated in any particular area, and are widely spread across the city.

I tried out various bar charts to represent the number of commercial evictions vs the number of residential evictions according to the borough. I first tried stacked bar charts which were clear in terms of depicting the proportion of the number of commercial evictions vs the number of residential evictions, but were not the easiest option when it came to interpreting the actual numbers (Figure 7).

Figure 7

I then tried side-by-side bar charts, but felt that the visualization could be a bit more comprehensive rather than requiring continuous comparison between two charts (Figure 8).

Figure 8

My final visualization was a more comprehensive version of the side-by-side bar chart, where the commercial and residential bars were grouped according to the borough, rather than the bars for each borough being grouped according to property type (Figure 9). The presence of fewer bars in each group increased the ease of interpretation, leaving less information for readers to recall.

Figure 9

This graph shows that the numbers of commercial and residential evictions are most comparable in Manhattan, while all the other boroughs have an almost negligible number of commercial evictions when compared to residential evictions. This could be attributed to Manhattan having the highest concentration of commercial property among the boroughs, which provides more opportunities for commercial evictions to occur.
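The grouping used in the final chart, with a commercial count and a residential count for each borough, could be sketched in Python as below. The column names BOROUGH and Residential/Commercial and the sample rows are assumptions for illustration. The commercial share is an extra summary added here to make the boroughs directly comparable, and is not a figure taken from the dashboard.

```python
from collections import defaultdict

# Made-up rows for illustration only.
SAMPLE_ROWS = [
    {"BOROUGH": "MANHATTAN", "Residential/Commercial": "Residential"},
    {"BOROUGH": "MANHATTAN", "Residential/Commercial": "Commercial"},
    {"BOROUGH": "BRONX", "Residential/Commercial": "Residential"},
    {"BOROUGH": "BRONX", "Residential/Commercial": "Residential"},
    {"BOROUGH": "BRONX", "Residential/Commercial": "Residential"},
    {"BOROUGH": "BRONX", "Residential/Commercial": "Commercial"},
]


def counts_by_borough_and_type(
    rows, borough_column="BOROUGH", type_column="Residential/Commercial"
):
    """Count residential and commercial evictions separately for each borough."""
    table = defaultdict(lambda: {"Residential": 0, "Commercial": 0})
    for r in rows:
        kind = r[type_column].strip().capitalize()
        if kind in ("Residential", "Commercial"):
            table[r[borough_column].strip().upper()][kind] += 1
    return dict(table)


def commercial_share(counts):
    """Fraction of a borough's evictions that are commercial."""
    total = counts["Residential"] + counts["Commercial"]
    return counts["Commercial"] / total if total else 0.0


grouped = counts_by_borough_and_type(SAMPLE_ROWS)
shares = {borough: commercial_share(c) for borough, c in grouped.items()}
```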

Feedback from user testing sessions: Similar to the testing sessions for the previous dashboard, the participants did not find any issues with understandability or clarity while viewing this dashboard and were able to interpret it accurately with ease.

Dashboard 3: Eviction Trends and Patterns Over Time

View dashboard here

This dashboard contains visualizations depicting changes in eviction rates over time, and changes in monthly occurrences of evictions. To depict trends in eviction rates over time, I chose to group evictions by year to establish a significant pattern. I first tried a heatmap to depict this data, but as with most of my previous attempts to use a heatmap, I did not feel that this was the most effective or impactful visualization to represent time-related data in this case (Figure 10).

Figure 10

I then tried a line graph, setting the thickness of the line to adjust according to the count of evictions (Figure 11). This clearly visualized the rise and fall of eviction numbers over the years, with the thickness of the line adding to the impact of the visualization.

Figure 11

This graph shows that there was a sharp decrease in the number of evictions in 2020, which could possibly be due to the pandemic. There has been an increase in the number since then, after which there is a drop again in 2023, though this is most probably due to the fact that data for the whole of 2023 has not yet been collected. The pattern observed till 2022 does indicate that the number could potentially increase further in 2023.
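The yearly grouping, together with a check for an incomplete final year, could be sketched in Python as follows. The MM/DD/YYYY date format and the sample dates are assumptions for illustration. The partial-year test is a crude heuristic: it only asks whether the latest recorded eviction falls before December.

```python
from collections import Counter
from datetime import datetime

# Made-up execution dates in an assumed MM/DD/YYYY format.
SAMPLE_DATES = ["03/01/2021", "07/15/2021", "12/20/2021", "01/05/2022", "05/09/2022"]


def parse_executed_date(text, fmt="%m/%d/%Y"):
    """Convert a date string into a datetime.date."""
    return datetime.strptime(text.strip(), fmt).date()


def yearly_counts(dates):
    """Return {year: number of evictions}, ordered by year."""
    return dict(sorted(Counter(d.year for d in dates).items()))


def last_year_is_partial(dates):
    """Heuristic: the final year is partial if its latest record is before December."""
    return max(dates).month < 12


dates = [parse_executed_date(s) for s in SAMPLE_DATES]
per_year = yearly_counts(dates)
partial = last_year_is_partial(dates)
```

When the final year is flagged as partial, its total is best read as a lower bound rather than compared directly with complete years.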

The second visualization in this dashboard was created to see if there are any months in particular that see significant changes in the number of evictions. I decided to use a bar chart for this one, and retained that decision. In my first visualization, the bars were colored according to the month (Figure 12). This was later changed according to feedback received from one of the participants of my user testing sessions.

Figure 12

Changes made according to feedback from user testing sessions: One participant was confused by the coloring of the bar chart, saying that it made them assume that there was some other data point that was influencing the chart. I therefore removed the coloring to emphasize the absence of additional data points apart from the months and the number of evictions represented in the visualization (Figure 13).

Figure 13

This graph shows that January sees the highest number of evictions while December sees the lowest, almost half of January’s number. All other months see mostly similar numbers of evictions.
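The monthly totals could be computed with a sketch like the one below. The sample dates are made up for illustration. The optional only_years filter is an assumption added here as one possible check: because the most recent year is incomplete, its later months contribute fewer records, so restricting the count to complete years shows whether the monthly pattern still holds.

```python
import calendar
from collections import Counter
from datetime import date

# Made-up eviction dates for illustration only.
SAMPLE_DATES = [
    date(2021, 1, 5),
    date(2021, 1, 20),
    date(2021, 12, 3),
    date(2022, 1, 11),
    date(2022, 6, 30),
]


def monthly_counts(dates, only_years=None):
    """Count evictions per calendar month, optionally limited to given years."""
    counts = Counter(
        d.month for d in dates if only_years is None or d.year in only_years
    )
    # Include every month so that months with no evictions show up as zero.
    return {calendar.month_name[m]: counts.get(m, 0) for m in range(1, 13)}


all_years = monthly_counts(SAMPLE_DATES)
complete_only = monthly_counts(SAMPLE_DATES, only_years={2021})
```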

Feedback from user testing sessions: Other than the concern raised about the visualization depicting eviction rates according to month, no other issues were faced by either of the participants, and they were able to interpret the visualizations with ease.

Dashboard 4: Income and Poverty Rates vs Eviction Rates by Borough

View dashboard here

This dashboard contains visualizations attempting to draw a parallel between economic factors and eviction rates in each of the boroughs. The first visualization depicts the median income of each borough in relation to the number of evictions in that borough. I used color gradation to represent eviction rates, with darker shades representing higher rates of eviction and lighter shades representing lower rates (Figure 14).

Figure 14

The chart shows that Staten Island has both the fewest evictions and the lowest median income. This could be due to factors such as lower population density in comparison to the other boroughs. The Bronx has a low median income but the highest number of evictions, which could be taken as a significant indicator of a relationship between the two factors. However, Brooklyn has a high median income and a high number of evictions as well, which shows that further investigation into additional factors is required before a concrete conclusion can be reached.

I created a similar chart to compare poverty rates with the number of evictions according to the borough (Figure 15).

Figure 15

The chart shows that Brooklyn has the highest poverty rate, closely followed by the Bronx, and both boroughs also have high numbers of evictions. Queens also has a moderately high poverty rate, and has the third highest number of evictions. Staten Island has the lowest poverty rate and the lowest number of evictions. There may be a relationship between these two factors, though further research is required to establish a concrete relationship.
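One simple way to put a number on a possible relationship like this is a Pearson correlation coefficient across the boroughs, sketched below in Python. The input values are placeholders, not figures from either dataset. With only five boroughs, any coefficient would be a rough descriptive summary and could not establish a relationship on its own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Placeholder values only: one poverty rate and one eviction count per borough.
poverty_rates = [10.0, 20.0, 30.0]
eviction_counts = [100, 250, 300]
r = pearson_r(poverty_rates, eviction_counts)
```

A value near 1 would indicate that boroughs with higher poverty rates tend to have more evictions, while a value near 0 would indicate no linear pattern.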

Feedback from user testing sessions: Neither participant had any difficulty interpreting the visualizations on this dashboard. One of the participants felt that stepped colors in the key gave a better understanding of how to use it, so I changed all the keys from gradient colors to stepped colors.

Dashboard 5: Race vs Eviction Rates by Borough

View dashboard here

The final dashboard visualizes populations of different races in relation to eviction rates according to borough. I utilized a similar approach to the previous dashboard, where I used color gradation to depict an additional data point. I first gave each bar chart a different color to distinguish each race (Figure 16).

Figure 16

The final visualization for this dashboard follows the same format, but incorporates some feedback from the user testing sessions conducted.

Changes made according to feedback from user testing sessions: Participants expected all the bar charts to be in the same color since they were depicting similar data. Both participants were also confused about what the color gradation represented, since eviction rates are the major focus of this project but the color gradation in this visualization initially represented the percentage of population belonging to each respective race. They added that this was confusing in this dashboard and not the previous one because the previous dashboard has fewer charts with no scope for comparison between them. For this dashboard, they felt that a consistent color scale across all charts would help them understand it better. I therefore incorporated both changes by changing all charts to the same color, and changing the data on the y-axis to the percentage of population, so that the color gradation represents the number of evictions (Figure 17).

Figure 17

These charts show that the Bronx has the highest percentage of Hispanic population and the highest number of evictions, Brooklyn has the highest percentages of Black and White populations and also has a high number of evictions, and Queens has the highest Asian population with a moderate number of evictions. Manhattan does not have the highest percentage of any race and has a moderate number of evictions, which suggests that it has a very diverse population but no particular link between race and number of evictions. Staten Island has very low percentages of population belonging to each of the races, and also has the lowest number of evictions. The only definite conclusion that can be drawn from these charts is that Asian populations are not affected by high eviction rates as much as the other races, but other links between these two factors need to be researched further before any other general conclusions can be made.

Reflection

Overall, I really enjoyed working on this project. I was able to incorporate some of the feedback I have received over the course of this semester into this project, regarding better ways of handling the data I am visualizing and going with simple designs when they work better than more complicated ones. I especially liked researching how my visualizations were perceived by users, since I got a lot of insight into how different people have such unique perspectives about the same thing. The data that I worked with for this project really pulled me in, and I am very interested in exploring this data further along with other related data to find out why I got the results that I did. In future, I would definitely try to incorporate more data points into these visualizations to articulate the reasons for different outcomes. Additionally, I still feel that there might be more options that might work better in place of the visualizations in my final two dashboards, and I would like to explore that as well. All in all, I am happy with the visualizations that I came up with for this project, and I look forward to taking these skills further in the future.

References:

https://data.cityofnewyork.us/Business/Neighborhood-Financial-Health-Digital-Mapping-and-/r3dx-pew9

https://data.cityofnewyork.us/City-Government/Evictions/6z8x-wfk4

https://opendata.cityofnewyork.us/data/

https://gothamist.com/news/nyc-eviction-rate-continues-to-rise-since-ban-was-lifted-as-homelessness-surges