Visualizing Citi Bike data on maps


Lab Reports, Maps

As the nation’s largest bike-share program, Citi Bike helps people in New York City get around in a fast, fun, and affordable way. In my previous report, Citi Bike in April, I received feedback that the heatmap should be removed from the dashboard and displayed on its own as an epilogue. So in this week’s Carto lab, I visualized that information on a map, looking for a more compelling way to show how Citi Bikes were rented during April.

Inspiration

The interactive visualization A Month of Citi Bike focuses on exactly the same topic with slightly different data. It can play automatically at a fast or slow speed, and users can also pause it at a specific moment for a closer look. The background color changes from white to dark grey as the time shifts from day to night, giving users an intuitive sense of the time of day. The weather and precipitation readings give users additional context about bike usage. However, because the stations are densely packed in some areas, it is a bit difficult to hover over a specific station and get its information.

Process and Result

The dataset I used, with 1,766,096 rows, was downloaded from Citi Bike’s own database. Having used the same dataset before, I am quite familiar with it. Each record contains the exact start and end times, the start and end stations, and the longitude and latitude of each station for every trip in April. That allowed me to visualize the data spatially using Carto, an open-source cloud computing platform that lets users without GIS or development experience analyze spatial data.
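For anyone who wants to inspect records like these before uploading them, here is a minimal Python sketch. The column names, the timestamp format, and the two sample rows are assumptions made up for illustration, so they may not match the actual CSV exactly.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows; the header names are assumed, not taken from the real file.
SAMPLE = """starttime,start station latitude,start station longitude,usertype
2018-04-01 08:15:00,40.7300,-73.9900,Subscriber
2018-04-01 23:40:00,40.6900,-73.9800,Customer
"""


def read_trips(handle):
    """Parse trip rows into dicts with a typed start time and coordinates."""
    trips = []
    for row in csv.DictReader(handle):
        trips.append({
            # Assumed timestamp format; adjust if the file includes fractions of a second.
            "start": datetime.strptime(row["starttime"], "%Y-%m-%d %H:%M:%S"),
            "lat": float(row["start station latitude"]),
            "lon": float(row["start station longitude"]),
            "usertype": row["usertype"],
        })
    return trips


trips = read_trips(io.StringIO(SAMPLE))
```

To read a real file, the same function could be called with `open("trips.csv", newline="")`, where the file name is again a placeholder.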

On and off for day and night

After uploading the CSV file as a dataset, I used the built-in analysis tool GeoCode to define the parameters. It brought me to a map with all the stations shown as red dots scattered over Manhattan, Brooklyn, and Queens. As I mentioned earlier, what I wanted to visualize was how the bikes were used over time. To make a time series visualization, I chose Animated as the aggregation and set the start time as its reference. The red dots started to turn on and off, showing each record over time. However, in areas where stations are located comparatively close to each other, the circles overlapped. So I switched from points to a heatmap and reduced its radius to 10. I also extended the animation to 60 seconds and increased its steps to the maximum so that the details could be seen. The last step of the project was changing the basemap to Positron Lite, so that the visualization stands out against the low-saturation map.
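To make the idea behind the animation more concrete, here is a rough Python sketch of how trip start times could be grouped into animation steps. This is not how Carto works internally; the function names, the year, the step count, and the sample times are all assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime


def step_index(t, start, end, steps):
    """Map a timestamp in [start, end) to an animation step in 0..steps-1."""
    span = (end - start).total_seconds()
    offset = (t - start).total_seconds()
    return min(steps - 1, int(offset * steps // span))


def trips_per_step(start_times, start, end, steps):
    """Count how many trips begin within each animation step."""
    return Counter(step_index(t, start, end, steps) for t in start_times)


# Hypothetical month boundaries; the year is a placeholder.
month_start = datetime(2018, 4, 1)
month_end = datetime(2018, 5, 1)

# Made-up start times, only to exercise the functions.
sample_starts = [
    datetime(2018, 4, 1, 8, 15),
    datetime(2018, 4, 1, 17, 40),
    datetime(2018, 4, 2, 12, 0),
]

# With 30 steps over 30 days, each step covers exactly one day.
counts = trips_per_step(sample_starts, month_start, month_end, steps=30)

# A 60-second animation over 30 days shows each day for about 2 seconds.
seconds_per_day = 60 / 30
```

Using more steps per day splits each day into finer slices, which is why raising the step count makes the day and night rhythm easier to see.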

The map flickered about every 2 seconds. Since the 60-second animation covers the 30 days of April, each day takes about 2 seconds, so each flicker marks the drop in bike usage at night. The pause seemed to last longer on weekends, implying that users tend to start using the service later than on weekdays. During the day, the dots in downtown and midtown Manhattan stayed illuminated while those in other areas flashed, meaning users were more active in Lower Manhattan than in Upper Manhattan, Brooklyn, and Queens. You might also notice a quick dimming during the day, which corresponds to the off-peak hours on weekdays when fewer people were riding.

After finishing the first illustration, I started to explore what else I could do with Carto. Unfortunately, most of the built-in analyses returned errors because my dataset was so large. However, by using Subsample Percent of Rows, I was able to reduce the data to 25% of its rows and run some analyses. I chose Hexbins as the aggregation and set the operation to count. The final illustration shows that most of the bikes were rented in Lower Manhattan, which supports the conclusion of my previous experiment. In addition, I ran a few analyses on user type, gender, age, and trip duration, from which I concluded that most users are male and most are subscribers, while user age and trip duration vary widely across the data. The last step was overlaying all the maps I had made in Photoshop and creating a short animation.
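The subsample and hexbin count steps can be approximated outside Carto as well. The sketch below is my own illustration, not Carto's implementation: it draws a random 25% sample and then counts points per hexagon using axial coordinates with cube rounding. It treats longitude and latitude as flat x and y values, which is a simplification, and the function names, cell size, and sample points are assumptions.

```python
import math
import random
from collections import Counter


def subsample(rows, fraction, seed=0):
    """Return a random subset holding about `fraction` of the rows."""
    k = round(len(rows) * fraction)
    return random.Random(seed).sample(rows, k)


def hex_cell(x, y, size):
    """Return axial (q, r) of the pointy-top hexagon of radius `size` containing (x, y)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Cube rounding: round all three cube coordinates, then fix the one
    # with the largest rounding error so they still sum to zero.
    cx, cz = q, r
    cy = -cx - cz
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz


def hexbin_counts(points, size):
    """Count how many (lon, lat) points fall into each hexagon."""
    return Counter(hex_cell(lon, lat, size) for lon, lat in points)


# Made-up trip start points (lon, lat), all at one hypothetical station.
points = [(-73.99, 40.73)] * 8
sampled = subsample(points, 0.25)         # keeps 2 of the 8 points
bins = hexbin_counts(sampled, size=0.01)  # hexagon radius in degrees (assumed)
```

Fixing the seed keeps the sample reproducible, so repeated runs produce the same counts.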

Reflection

In general, I found Carto very easy to understand and use, partly because I had some prior experience with GIS. The data import and map creation process is very intuitive, and users can try out different built-in analysis tools to see the results. For future development, I think it would be interesting to map some of the trip routes, for example the 10 longest and 10 shortest trips users took, and combine them with metro stations and tourist attractions. Another option would be to dive into a specific user’s profile to see how they use the service.