Small City, Long Lines – Mapping of Shenzhen and Hong Kong


Lab Reports, Maps, Visualization

Introduction

Cities are busy and complicated in many ways. Population density, transportation, traffic, and crime are all kinds of information that become much more useful when we can visualize them on a map. In this study, we will look at what we can do with some datasets that focus on two cities in Asia: Shenzhen, China and Hong Kong.

Inspiration

In 2018 I participated in a small research project with a professor, focusing on smart city improvement in Shenzhen. Back then I had no knowledge of software like Tableau or Mapbox, so I created a map showing one day of traffic coverage in Shenzhen using Java Processing. However, drawing such a map required quite a bit of coding and data processing to calculate the traffic volume on each road segment, and that data processing was very time-consuming.

Figure 1: Traffic Density Map using Java Processing

So in this study, I would like to make the same or similar maps using Tableau. With the help of shapefiles and Tableau's built-in functions, this process should be a lot simpler.

Datasets/Roadblocks

First, we need a dataset that contains vehicle or traffic data located in Shenzhen. I am using the Electric Vehicle Data, which contains a one-day GPS sample of electric taxis, provided by Desheng Zhang, a professor at Rutgers University.

To create something similar to Figure 1, we also need the road and street information for Shenzhen. However, I could no longer find the original dataset that I had used, so I had to obtain one from somewhere else. The problem is that this kind of dataset was really hard to find, and the one that I found was not accurate at all.

Figure 2: Shenzhen road network map
Figure 3: Shenzhen road network map zoomed

As you can see in Figure 3, the map does not line up with the built-in map in Tableau. In other words, the coordinates of the whole dataset are off. In Figure 4, I tried to fix the dataset using QGIS, a software package that lets you view and modify shapefiles. I was able to move the lines roughly back to their actual places, but a lot of the lines still do not line up with the actual locations.

Figure 4: Attempt to fix the Shenzhen road network shapefile

As a result, I had to abandon this dataset as unusable. Since the road network consists entirely of line segments, it is difficult to match the traffic GPS points to road lines that are inaccurately placed.

The next datasets that I looked at focus on Hong Kong, and they are much more accurate. They were all obtained from https://opendata.esrichina.hk/. Figure 0 is the road network map of Hong Kong, and later in this study we will look at some of the maps created using datasets from this website.

Software

For this study we will mainly be using Tableau to create our map visualizations. QGIS was also used in the attempt to fix the Shenzhen data, but we won’t focus on that since we are not using that dataset anyway.

Process

We will first create all the basic maps, and then add the datasets that we want to see and join them to the spatial data using Tableau's built-in join function. Unfortunately, I was not able to find a Hong Kong dataset similar to the traffic data from Shenzhen. Hong Kong has traffic volume detectors at only several locations rather than throughout the whole city, so using the road network to compute traffic density is not very useful.

Results

Figure 5: 18 district boundaries of Hong Kong

Figure 5 shows all 18 districts in Hong Kong. This dataset includes the water bodies as part of each district's area, which makes it a little difficult to read. I turned the opacity down to 60% and used “Streets” as the base map to make the map a little more readable. I also added labels to identify each of the districts.

Figure 6: Tableau joining two datasets.

Next, we can join some other datasets to the geometry of the 18 districts and count how many crimes happened in each district. We are using a “Left Join” here because we still want to see all the districts even if no crimes occurred in an area, and we don’t want to see any crimes that didn’t happen in the 18 districts, which in a way also cleans the data.
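
The same left-join-and-count idea can be sketched outside Tableau. The Python below is only an illustration, assuming each district is a simple polygon ring and each crime record is a longitude/latitude point; the function names and the toy data are hypothetical and are not taken from the Hong Kong datasets.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) falls inside the polygon ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray cast from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def count_by_district(districts, points):
    """Left join: every district is kept (possibly with a count of 0),
    and points that fall outside all districts are dropped."""
    counts = {name: 0 for name in districts}
    for lon, lat in points:
        for name, ring in districts.items():
            if point_in_polygon(lon, lat, ring):
                counts[name] += 1
                break  # assume districts do not overlap
    return counts


# Hypothetical toy data: three square "districts" and four incident points.
districts = {
    "District A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "District B": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "District C": [(2, 0), (3, 0), (3, 1), (2, 1)],
}
crimes = [(0.5, 0.5), (0.2, 0.8), (1.5, 0.5), (5.0, 5.0)]

print(count_by_district(districts, crimes))
```

In this toy run, District C still appears with a count of zero, and the point at (5.0, 5.0) is dropped because it lies outside every district, which mirrors the two reasons given for choosing a left join.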

Figure 7: Resulting crime data map for the 18 districts of Hong Kong

In Figure 7 we can see that Yau Tsim Mong and Kowloon City had the most crimes from October 2016 to April 2017. I chose a red color scale to make the map more eye-catching. The scale is also divided into 7 steps so that the different levels are easier to tell apart.
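
To make the stepped scale concrete, here is a small Python sketch of how counts could be assigned to 7 color steps. It assumes equal-width steps between the smallest and largest count, which may not match exactly how Tableau computes its stepped colors; the function name and the sample counts are hypothetical.

```python
def stepped_bins(values, steps=7):
    """Assign each value to one of `steps` equal-width bins (0 = lightest)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0 for _ in values]
    width = (hi - lo) / steps
    # The maximum value would land on bin index `steps`, so clamp it
    # into the last bin.
    return [min(int((v - lo) / width), steps - 1) for v in values]


# Hypothetical per-district counts, not the real crime figures.
counts = [0, 10, 35, 70]
print(stepped_bins(counts))
```

Each bin index would then map to one shade of red, from lightest to darkest.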

Figure 8: Parking meters in Hong Kong 18 districts.

Next, we can check the number of parking meters in each of the districts in Hong Kong. We can see here that Yau Tsim Mong is again the district with the most parking meters.

Reflection & Future Direction

So just like Figures 7 and 8, we can use different sample data such as bus stop locations, EV charging stations, etc., and join them to the district dataset to measure the volume in each district. Ideally we could also do the same with the road networks, but I have learned that we need more accurate datasets to do so. Perhaps there is a way to compute the join with the inaccurate datasets by adding a small offset range to the latitude and longitude.
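
One possible shape for such a tolerance-based join is sketched below in Python. It is an untested idea rather than something used in this study: each GPS point is assigned to the nearest road segment, but only if it lies within a chosen tolerance. The coordinates are treated as planar for simplicity (real latitude and longitude would need to be projected first), and the function names, segment IDs, and sample points are all hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to segment a-b, all (x, y) pairs in the same units."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: both endpoints are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def count_points_per_segment(points, segments, tolerance):
    """Count points per segment, matching each point to its nearest
    segment only when that segment is within `tolerance`."""
    counts = {seg_id: 0 for seg_id in segments}
    for p in points:
        best_id, best_dist = None, tolerance
        for seg_id, (a, b) in segments.items():
            d = point_segment_distance(p, a, b)
            if d <= best_dist:
                best_id, best_dist = seg_id, d
        if best_id is not None:
            counts[best_id] += 1
    return counts


# Hypothetical toy data: two parallel roads and four GPS points.
segments = {
    "road_1": ((0.0, 0.0), (10.0, 0.0)),
    "road_2": ((0.0, 5.0), (10.0, 5.0)),
}
gps_points = [(2.0, 0.4), (7.0, 4.8), (5.0, 2.5), (3.0, -0.3)]

print(count_points_per_segment(gps_points, segments, tolerance=0.5))
```

A small tolerance keeps points from being assigned to the wrong road, while a larger one absorbs more of the positional error, so the value would have to be tuned against how far off the road lines actually are.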

In conclusion, using Tableau to create these map visualizations is much easier than using Java to process, calculate, and draw the map. In future work, I definitely want to try making some more complex maps using these datasets, especially the road network datasets.

Citation

Wang, Guang, Xiuyuan Chen, Fan Zhang, Yang Wang, and Desheng Zhang. "Experience: Understanding long-term evolving patterns of shared electric vehicle networks." In The 25th Annual International Conference on Mobile Computing and Networking (MobiCom), 2019.