Lab Reports, Maps, Visualization


Manhattan is New York City’s most famous and popular borough. In fact, many people treat Manhattan as a synonym for New York City, so it is likely the first place any out-of-towner would want to visit. Although New York consists of five boroughs, Manhattan has gained the most popularity. Up-and-coming boroughs such as Brooklyn, Queens and the Bronx are still on their way to attracting the kind of crowd Manhattan does. Through this project, I would like to explore the places of interest in New York City, as seen by the locals, and how they are distributed across the boroughs.


When we were given the task of creating a map, I explored data sets from the various sources provided. I found that working with data about New York City would be interesting. It caught my attention, and I explored more visualizations on the same subject. I then came across a few visualizations that intrigued me. Shown below is a historic map of manufacturing industries across the city. It was published in 1922 using census data from 1919. The colourful depiction of the city’s areas gave me a starting point for my own visualization.

Historic Map of New York City’s Manufacturing Industries

Although this historic map had my attention, a data set about the city’s manufacturing industries was not one I was completely interested in. I explored further and came across Locals & Tourists, an interactive map created by Eric Fischer. He collected tweets across the five boroughs and beyond to determine which areas were most concentrated with tourists and which were dominated by locals. Tourists are depicted in red and locals in blue.

Locals and Tourists by Eric Fischer

The above visualization made me wonder whether any data sets provided information about places of interest in New York City as rated by locals. Together, the two maps above gave me a direction in which to proceed.


1. Finding A Data Set

To work in Tableau, create a visualization and understand the points of interest in New York City, an appropriate data set was needed. Among the various data sources provided, I found data about points of interest on NYC Open Data. As I browsed the data sets available there, the points of interest in New York City caught my attention, so I chose to work with this data set. Since New York is one of the most popular cities in the world, I was intrigued to see what the different city agencies consider to be common places of interest.

Points Of Interest: NYC Open Data

The data was available in multiple formats; however, since a map had to be created, the data set was downloaded as a spatial file. Because these data points only depict the points of interest in New York City, a second data set was needed to distinguish the common places by area. Hence, a data set describing borough boundaries was needed as well, and it was also found on NYC Open Data.

Borough Boundaries: NYC Open Data

This data set was likewise available in multiple formats and, since a map had to be created, it was downloaded as a spatial file.

2. Working With Tableau

2.1 Importing Data

The downloaded data sets are spatial files, a format that Tableau supports and can open directly. In order to map the two data sets as layers, they had to be combined.

The two data sets were combined by joining them on a common column. The join relationship is Borough = Boro Name.
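For readers who want to see what this join does outside Tableau, here is a minimal Python sketch. It is not how Tableau implements the operation. The sample rows, the `inner_join` helper and the placeholder geometry strings are all hypothetical, and the sketch assumes that both columns hold the borough name as plain text.

```python
# Hypothetical sample rows; the real NYC Open Data files contain many more fields.
points_of_interest = [
    {"name": "Example Park", "Borough": "Manhattan"},
    {"name": "Example Library", "Borough": "Brooklyn"},
    {"name": "Example Pier", "Borough": "Unknown"},  # no matching boundary row
]
borough_boundaries = [
    {"Boro Name": "Manhattan", "geometry": "<polygon placeholder>"},
    {"Boro Name": "Brooklyn", "geometry": "<polygon placeholder>"},
]


def inner_join(left, right, left_key, right_key):
    """Pair each left row with every right row whose key value matches."""
    # Index the right-hand rows by their key for quick lookup.
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for row in left:
        for match in index.get(row[left_key], []):
            joined.append({**row, **match})  # merge the two rows into one
    return joined


# Borough = Boro Name, as in the relationship described above.
joined_rows = inner_join(points_of_interest, borough_boundaries, "Borough", "Boro Name")

for row in joined_rows:
    print(row["name"], "->", row["Boro Name"])
```

Rows whose borough has no match in the boundaries data, such as the hypothetical "Example Pier", drop out of an inner join. That is worth checking when the two sources spell borough names differently.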

2.2 Creating Visualization

Once the data sets were imported and combined, the visualizations were created by layering the data points. First, the borough boundaries data was listed, and its geometry, along with latitude and longitude, was loaded onto the sheet. This showed the different boroughs of NYC. I then added Boro Name to colour as well as to labels in order to differentiate the boroughs visually. This resulted in the following visualization:

Borough Boundaries

Next, the places of interest data was listed, and its geometry, along with latitude and longitude, was loaded onto the sheet. This showed the common places of interest in NYC and resulted in the following visualization:

Points Of Interest

The next step was to combine the two data sets in order to understand the points of interest with respect to the boroughs of New York. This was done by adding the geometry of each data set as a layer.

Results & Discussion

After combining the two data sets, the following visualizations were formed:

The idea behind the two visualizations was to understand how the places of interest are distributed over the boroughs of New York. As can be seen from the figure above, Manhattan has the highest number of places, followed by Brooklyn and the Bronx. After creating this visualization, I wanted to know how many places of interest Manhattan has, since it appears to have the most. This was done using the lasso tool: I selected the borough, Manhattan, and then used the lasso to select all the data points that lay within it.
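A lasso selection amounts to asking which points fall inside a drawn outline. The Python sketch below shows one common way to answer that, the ray-casting test, and is not a description of how Tableau does it internally. The selection outline is a rough hypothetical box, not the real Manhattan boundary, and the three sample points are made up for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside


# Hypothetical lasso outline as (longitude, latitude) pairs; NOT the real borough boundary.
lasso = [(-74.02, 40.70), (-73.93, 40.70), (-73.93, 40.88), (-74.02, 40.88)]

# Hypothetical points of interest: (name, longitude, latitude).
points = [
    ("Point A", -73.9855, 40.7580),
    ("Point B", -73.9500, 40.6500),
    ("Point C", -73.8000, 40.7500),
]

selected = [name for name, lon, lat in points if point_in_polygon(lon, lat, lasso)]
print(len(selected), "point(s) selected:", selected)
```

With a real boundary polygon in place of the box, the length of `selected` would give the number of places of interest inside the borough.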

After selecting all the points, I created a set labelled ‘Manhattan’ for ease of understanding. It was then added as a filter so that only the desired data points are displayed. Boro Name was added to colour and label to distinguish the boroughs and to provide users with the names of common places of interest.


The biggest challenge in this task was finding an appropriate data set. Several times I had to discard data sets because they would not produce the desired output. What made it challenging was looking for two data sets with a common relationship. Often, even when the data sets had such a relationship, I did not know what I was looking at. However, I was able to narrow down and break down the problem and create visualizations accordingly. As future work, I would like to filter the points of interest and see whether there are any overlaps, and possibly group them together in order to create fewer data points.
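One possible way to approach the grouping idea is to snap points to a coarse grid and treat every point in the same cell as one group. The Python sketch below illustrates this under assumptions of my own: the `group_by_grid` helper, the cell size of 0.01 degrees and the sample coordinates are hypothetical, and this is only one of several ways to cluster points.

```python
import math
from collections import defaultdict


def group_by_grid(points, cell_size):
    """Group (name, lon, lat) points by the grid cell they fall into."""
    cells = defaultdict(list)
    for name, lon, lat in points:
        # floor() gives a stable cell index for negative longitudes too.
        key = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        cells[key].append(name)
    return dict(cells)


# Hypothetical points of interest: (name, longitude, latitude).
sample_points = [
    ("A", -73.9851, 40.7581),
    ("B", -73.9853, 40.7583),  # very close to A, so it lands in the same cell
    ("C", -73.9442, 40.6782),
]

groups = group_by_grid(sample_points, cell_size=0.01)

for cell, names in groups.items():
    print(cell, names)
```

A larger cell size merges more points into each group and leaves fewer marks on the map, at the cost of positional detail.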