Miami-Dade 311 Complaints


Lab Reports

Introduction

When I began looking for a dataset, I wanted something that was of personal interest to me and on a topic that I somewhat understood. I therefore started by browsing the open data sets on Miami-Dade County’s government website, https://opendata.miamidade.gov/browse. I was pleased to find that the county offers a catalog of open data sets covering a range of subjects. For my dashboard I chose a data set containing 311 service request activity for Miami-Dade County. Initially I didn’t have questions for my data, but once I started plugging data into Tableau Public, I determined that my dashboard would address these three things:

1. Which Miami-Dade County/District generates the most 311 calls;

2. The percentage of county complaints per complaint topic;

3. How many 311 complaint tickets are created and closed in the years represented.

Overview

The 311 Contact Center is a public service line created by Miami-Dade County through which you can get one-on-one customer service in various languages. Individuals can report neighborhood problems and code violations and submit relevant materials such as photos and paperwork. The 311 service can also be reached via Twitter, a mobile application, and email, and there are a couple of walk-in services.

The Miami-Dade County website has what I found to be a pretty nice interactive open-data platform. On the page there is a taskbar that allows you to manage, see more views, filter, visualize, export, discuss, and embed the data. There is also an “About This Data” view in which you can see how many times the data has been visited, downloaded, and other information about the data set.

Creating a visualization on the site allows you to see the data in a calendar, map, or chart, and you can choose which columns are displayed and how. The charts come with a variety of options, much like Tableau Public. From the site’s platform you’re also able to adjust other settings such as colors and point size. I didn’t do much with the site itself, but I think it’s helpful that these tools are readily available for anyone to use. To download your vis, though, you need to create an account, as on most other platforms.

 

Method / Results 

The 311 data set originally had 23 columns. While cleaning the data in Microsoft Excel, I decided that I didn’t need columns such as Ticket ID, Street Address, State, Zip Code, Ticket Status, the X and Y coordinates, and other mapping data. I also didn’t use Issue Description because the field was blank in every record. The Case Owner and Case Owner Description fields were duplicates, so I was able to delete one of them as well. What I was left with was Issue Type, which states what each case pertains to, and Case Owner, which identifies the department responsible for handling the case. I chose to work with the City and District columns because I hoped to use them as a quantitative measure. I also chose to work with Ticket Created and Ticket Closed because I hoped to build a visualization illustrating the average time it takes to complete particular Issue Types.
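For readers who would rather script the cleaning step than do it by hand in Excel, here is a minimal sketch using only Python's standard library. The column headers are assumptions based on the names used in this report (the real export may spell them differently), and the sample row is made up for illustration.

```python
import csv
import io

# Assumed header names, taken from the wording of this report.
KEEP = ["Issue Type", "Case Owner", "City", "District",
        "Ticket Created", "Ticket Closed"]

def keep_columns(csv_text, keep=KEEP):
    """Return one dict per row, holding only the columns listed in `keep`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{name: row.get(name, "") for name in keep} for row in reader]

# Made-up sample row, not real 311 data.
SAMPLE = (
    "Ticket ID,Issue Type,Case Owner,City,District,"
    "Ticket Created,Ticket Closed,Zip Code\n"
    "T-1,Stray Dog,Animal Services,Miami,9,01/05/2014,01/09/2014,33101\n"
)

cleaned = keep_columns(SAMPLE)
print(cleaned[0])
```

Listing the columns to keep, rather than the ones to drop, makes it easier to see at a glance which fields feed the dashboard.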

Once I had some understanding of how to work Tableau, it was still a little stressful trying to figure out exactly how to make my data show what I wanted it to show. After listening to some of the Help instructables, I realized that starting with a question is the best way to figure out what I wanted my visualizations to say. Once I started curating questions, my visualizations started making sense. I was able to create themed groups within my data, especially within the Issue Type field, so that the plotted information would be more condensed and easier to read. I was then also able to identify some of the problems with how the data was collected, as well as ways that the other columns could have helped me understand my data better.
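The idea of themed groups can be sketched outside Tableau as a simple lookup table. The issue names and themes below are hypothetical examples, not the actual values in the 311 data, and anything not listed falls into an "Other" group.

```python
from collections import Counter

# Hypothetical issue-to-theme mapping; the real values would come from the data.
GROUPS = {
    "Stray Dog": "Animal Services",
    "Injured Cat": "Animal Services",
    "Pothole Repair": "Roads",
    "Missed Trash Pickup": "Solid Waste",
}

def theme_for(issue):
    """Return the themed group for an issue, or 'Other' if it is not mapped."""
    return GROUPS.get(issue, "Other")

# Made-up list of issues for illustration.
issues = ["Stray Dog", "Pothole Repair", "Injured Cat",
          "Missed Trash Pickup", "Graffiti"]
theme_counts = Counter(theme_for(issue) for issue in issues)
print(theme_counts)
```

Counting by theme instead of by individual issue is what condenses a long list of bars into a handful of readable categories.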

I ran into an overall problem with how the data was collected by the county, which affected what information could be visualized and how. For the first sheet, titled “311 Complaints by District/Neighborhood,” I attempted to answer the question of which district complains the most about a particular issue. For instance, the first bar graph shows the issue titled Animal Services, and from it you can see that District 9 complains more than any other district about animal services: roughly 12.8% of the complaints from District 9 concern animal services. I moved the dropdown menu closer to this vis so that users know they can change the issue to see a different chart. I think this chart would have read better without percentages on the y-axis; as my first inspiration visualization (fig.4) shows, a standard number count is easier to interpret. I would like to know how many complaints are being made in District 9 about animal services. Another factor is that I’m no expert on which districts are which in Miami, so my next map was an attempt to clear that up. The next map is also where I found the inconsistencies in the data collected by Miami-Dade County, and I realized that working with another data point might have avoided them.

fig.4 Inspiration map 1, using standard numbering on the y-axis to show fruit preference (I believe)
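The difference between showing percentages and showing plain counts on the first sheet can be illustrated with a small calculation. This is only a sketch with made-up tickets; the district numbers and issue names are placeholders, and the percentage here is read as the share of a district's own complaints that fall under one issue.

```python
from collections import Counter

def issue_share_by_district(tickets, issue):
    """Return {district: (count, percent)} for one issue.

    `tickets` is a list of (district, issue_type) pairs.
    """
    totals = Counter(district for district, _ in tickets)
    matches = Counter(district for district, kind in tickets if kind == issue)
    return {d: (matches[d], 100.0 * matches[d] / totals[d]) for d in totals}

# Made-up tickets for illustration only.
tickets = [
    ("9", "Animal Services"), ("9", "Roads"),
    ("9", "Animal Services"), ("9", "Solid Waste"),
    ("3", "Roads"), ("3", "Animal Services"),
]
shares = issue_share_by_district(tickets, "Animal Services")
print(shares)
```

In this toy data both districts have the same percentage of animal services complaints, yet District 9 filed twice as many, which is the detail a percentage axis hides.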

On sheet two, titled “Which County Complains the Most,” I wanted to show a map that would illustrate the districts and give more insight into which neighborhoods belong to which district. That is also why I put this vis on the top left: I thought it was important for the reader to understand what the districts are before looking at their top issues. I had assumed that the data would be clustered and would show one or two neighborhoods per district. For instance, I expected North Miami and North Miami Beach to be in the same district. From the color clusters, though, you can see that some neighborhoods fall into more than one district. Figuring out which neighborhoods reside in which district is a project of its own, as there doesn’t seem to be an obvious map for understanding the districts, and researching and regrouping would be very time consuming. This was not the only problem I discovered using this visualization. As the visualization shows, Miami-Dade County has the most complaints overall. Miami-Dade County, though, is the overarching county for all of the neighborhoods. This means that when the data was collected, the proper neighborhood wasn’t attached to the ticket, which is why the district numbers are repeated in several places. What works about this style of visualization, and why I chose it, is that the clusters of color and the hover feature make it easy for the user to see the concentrations of color that represent the number of complaints for each neighborhood. A prime example of proper clustering is my second inspiration visualization, shown in fig.5, in which the data for each year is clustered into one color. If I were to redo this map, I’d consider using the zip code column that I removed during data cleaning, as that would be more precise than the districts, which are harder to decipher. I think using the zip codes would result in clearer clusters, but that wouldn’t resolve the problem of identifying the different districts.
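One way to check for the overlap described above, before building the map, is to list which districts appear under each City value. The sketch below uses invented rows; the pairing of cities and district numbers is hypothetical and only meant to show the shape of the check.

```python
from collections import defaultdict

def districts_per_city(tickets):
    """Return {city: sorted list of districts seen for that city}."""
    seen = defaultdict(set)
    for city, district in tickets:
        seen[city].add(district)
    return {city: sorted(found) for city, found in seen.items()}

# Invented (city, district) pairs, not real 311 records.
tickets = [
    ("North Miami", "2"), ("North Miami", "4"),
    ("North Miami Beach", "4"),
    ("Miami-Dade County", "9"), ("Miami-Dade County", "2"),
    ("Miami-Dade County", "9"),
]
by_city = districts_per_city(tickets)
# Cities that show up under more than one district.
shared = sorted(city for city, found in by_city.items() if len(found) > 1)
print(shared)
```

A catch-all value such as "Miami-Dade County" would stand out in this list because it appears under many districts at once.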

The next visualization I wanted to show was one that would present the difference between how many tickets were created and how many were closed in a particular year. What I would have liked to do was create weighted circles in two different colors, one representing the number of tickets created and one representing the number of tickets closed, and then overlay the closed tickets on the created ones (since the number closed is smaller than the number opened). This would show a direct comparison between tickets created and closed, and also show that Miami-Dade County is doing a decent job of closing out the tickets created each year. What I was able to do, though, was create two plot charts, one above the other. I chose to show the number of records on the y-axis so that viewers could see how many tickets were opened or closed per year. The tickets-closed chart sits directly under the tickets-created visualization, so the viewer can look down and see the number of tickets closed that same year. I think this would be clearer on one chart rather than two, much like my third inspiration (fig. 6), in which you’re able to see various information at different points, represented through the colors and circle sizes.
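The numbers behind a combined created-versus-closed chart could be prepared along these lines. This is a sketch under assumptions: the date format is a guess (the real export may use a different one), the rows are made up, and a blank closed date is treated as a ticket that is still open.

```python
from collections import Counter
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # assumed format; adjust to match the real export

def year_counts(tickets):
    """Return (created_per_year, closed_per_year) as two Counters.

    `tickets` is a list of (ticket_created, ticket_closed) date strings.
    """
    created = Counter()
    closed = Counter()
    for created_on, closed_on in tickets:
        created[datetime.strptime(created_on, DATE_FORMAT).year] += 1
        if closed_on:  # skip tickets that were never closed
            closed[datetime.strptime(closed_on, DATE_FORMAT).year] += 1
    return created, closed

# Made-up tickets for illustration only.
tickets = [
    ("01/05/2014", "01/09/2014"),
    ("12/30/2014", "01/02/2015"),
    ("03/01/2015", ""),
]
created, closed = year_counts(tickets)
for year in sorted(created):
    print(year, created[year], closed[year])
```

With both counts keyed by year, the two series can share one axis instead of living on two stacked charts.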

Future Direction 

As most of my visualizations illustrate the number of complaints, for the sake of consistency I chose to use the same color to represent those amounts. If I continue to work with this data set, I think there are more visualizations that could bring this data to life, such as a line graph of the average time between opening and closing a ticket. I can also see this data mapped onto the county, so that users could hover over the districts, click in, and see more details about where particular complaints are coming from. These visualizations could help the county see whether particular areas are affected more by one issue than another, and help it find ways to solve those problems. Here is the link to view my visualization: Miami-Dade 311 Complaints
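As a starting point for the average-time idea mentioned above, the calculation could look something like the following. It is a sketch only: the date format is assumed, the tickets are invented, and tickets without a closed date are left out of the average.

```python
from collections import defaultdict
from datetime import datetime

DATE_FORMAT = "%m/%d/%Y"  # assumed format; adjust to match the real export

def average_days_to_close(tickets):
    """Return {issue_type: mean days from created to closed}, closed tickets only."""
    durations = defaultdict(list)
    for issue, created_on, closed_on in tickets:
        if not closed_on:  # still open, so there is no duration yet
            continue
        opened = datetime.strptime(created_on, DATE_FORMAT)
        closed = datetime.strptime(closed_on, DATE_FORMAT)
        durations[issue].append((closed - opened).days)
    return {issue: sum(days) / len(days) for issue, days in durations.items()}

# Invented tickets: (issue type, ticket created, ticket closed).
tickets = [
    ("Animal Services", "01/05/2014", "01/09/2014"),
    ("Animal Services", "02/01/2014", "02/03/2014"),
    ("Roads", "03/10/2014", "03/20/2014"),
    ("Roads", "04/01/2014", ""),
]
averages = average_days_to_close(tickets)
print(averages)
```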