Spending a semester exploring information visualization tools like Tableau and Gephi, taking inspiration from best practices for visualization design, and brainstorming with peers provided plenty of ideas for this final portfolio. Using data from my day-to-day work in libraries, I got acquainted with Tableau’s chart and graph capabilities. I got to know Tableau further, and Tableau got to know a little more about Long Island, with our section on mapping. Working with network data was what I found most interesting in our semester of work. Returning to each of these projects – or each tool – has been a happy reunion.
Guiding my revision ideas with comments from my peers and my professor, I wanted to create a map or series of maps that could be presented to potential visitors and tourists to Long Island Wine Country. Limiting my audience allowed me to focus further on the details that would be most useful when trying to plan a vineyard/winery tour. A tricky aspect of revising these maps was determining which data was useful rather than just noise. Although I sought out other spatial data related to transit usage and traffic patterns, this data either wasn’t readily available or wasn’t useful when building a map for tourists. As a result, I chose to do as much as I could with the data I’d collected up to this point (vineyard locations, bus stop locations, and agricultural attractions). I delved deeper into Long Island vineyards by creating a network dataset based on the spatial analysis done when mapping these vineyards using Tableau. Although I was initially unsure whether this was feasible, I explored it further to satisfy my own curiosity about the results.
The maps I made representing Long Island vineyards and tasting rooms in proximity to bus stops were my starting point for creating a network dataset. Nodes represent each vineyard location and edges represent locations within one mile of each other. I was able to determine relative distance between locations because I had generated half-mile buffers around each location; if buffers around two locations overlapped, a connection (edge) between two locations (nodes) was recorded. Studying the map and translating spatial proximities to connections was a tedious, hands-on process, but manageable because I only had 54 locations. I tested my network dataset while I was still identifying connections (pictured below) to make sure that I wasn’t completely wasting my time.
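The buffer-overlap rule described above can also be expressed in code. What follows is a minimal sketch, not the by-hand process used for this project: it assumes each location has a latitude and longitude, uses the haversine formula for straight-line distance, and treats two half-mile buffers as overlapping when their centers are at most one mile apart. The location names and coordinates are hypothetical placeholders, not values from the vineyard dataset.

```python
import math
from itertools import combinations

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def proximity_edges(locations, buffer_miles=0.5):
    """Return (name, name) pairs whose circular buffers overlap or touch."""
    threshold = 2 * buffer_miles  # two half-mile buffers meet at one mile
    edges = []
    for (n1, p1), (n2, p2) in combinations(sorted(locations.items()), 2):
        if haversine_miles(p1, p2) <= threshold:
            edges.append((n1, n2))
    return edges


# Hypothetical placeholder coordinates (assumption, not real vineyard data).
locations = {
    "Vineyard A": (41.0000, -72.5000),
    "Vineyard B": (41.0000, -72.4850),  # a little under a mile east of A
    "Vineyard C": (41.0300, -72.5000),  # about two miles north of A
}

edges = proximity_edges(locations)
for source, target in edges:
    print(source, "--", target)
```

With these placeholder points, only Vineyard A and Vineyard B are connected. Straight-line distance is a simplification; an actual walk along local roads would be longer, so an edge here is best read as "theoretically within walking distance."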
Testing my dataset gave way to the long process of playing with it in Gephi to create a network that made sense. With only 54 nodes and 70 edges, this was a much less complicated network than those I had worked with in the past. My goal was to create a network that emphasized the clusters and could call attention to locations that are, theoretically, within walking distance (i.e., one mile) of each other.
By running every statistical report I could using Gephi, I generated additional data about these vineyards that I could use not only with Gephi but also with Tableau. I exported this data from Gephi as a .csv and uploaded it to Tableau Public. My goal for this stage of the project was to evaluate which underlying nuances are better assessed via Tableau than via Gephi. For example, coloring nodes according to HUB or betweenness centrality values, as represented below, or according to other data fields with widely varied value ranges, was tough to get right using Gephi. Compare this network representing HUB values; brighter, more vibrant colors reflect nodes with the highest values, and node size reflects degree.
And this treemap:
Consider this network and this treemap, both representing degree and betweenness centrality within this network. Again, brighter, more vibrant colors reflect nodes with the highest values, and node size reflects degree. Keep an eye on Coffee Pot Cellars. Here’s the network:
And now the treemap:
Although both graphics are generated from essentially the same data, the Tableau visualizations provide more decisive comparisons of underlying statistics than the Gephi network representing that same data. Maybe this is my own personal taste for treemaps. There’s just something about how clean the whole presentation is, if the format works for the data, that is. Still, working with this dataset in both Gephi and Tableau pushed me to think about color as a unifying element between visualizations produced with these tools.
I drew inspiration for coloring from oenology itself, referencing images like this “color of wine” guide from Wine Folly.
I took some time to consider forms of colorblindness that might affect a user’s ability to understand the coloring. For example, I used Coblis to simulate deuteranomaly on the same “color of wine” guide from Wine Folly.
The rosé and red wine palettes provided the most discernible range of shades, even if the hue or saturation of the color was not the same. Informed by this, I chose a color palette inspired by shades of red wine, as seen below, for my networks and charts related to Long Island Wine Country.
I refined my maps by using the understanding gleaned from working with vineyard data in Gephi and again in Tableau. Instead of trying to represent spatial analysis (i.e., buffer areas representing a half-mile radius from each vineyard), I focused on sections of Long Island Wine Country in which locations are within “walking distance” (i.e., a mile) of each other. I adjusted the underlying base map during the revision process in order to balance certain geographical contexts. This was a pleasant workaround for my initial idea of creating a custom Mapbox base map. While that idea was interesting, I wanted to explore other, potentially more impactful options for highlighting nuance within my vineyards dataset. Adjusting the base map allowed me to focus on other aspects of design, like adding custom icons to represent vineyards, bus stops, and other locations.
Using a “washed out” satellite map provided the balance I wanted between geographic context and readability of data. The satellite map highlights county and state roads, which happen to be the roadways that most vineyards are located along. Local public transit travels along Route 25; after all, there is only one bus route on the North Fork of Long Island. The clear visibility of Route 25 on the base map makes the bus route centerline data redundant and distracting. Removing the bus route centerline leaves less to distract from vineyards, bus stops, and other attractions. This made the use of custom symbols more effective in the most recent iterations of these maps.
Use of the satellite base map has the added advantage of highlighting the agricultural character of the North Fork of Long Island. Combining the satellite base map with realistic symbols felt thematically appropriate. I created the symbols for vineyards from a plate depicting Cabernet Franc from the L’Ampélographie Viala et Vermorel. The bus symbol is an actual Suffolk County Transit Authority bus (although that level of detail is not especially apparent at the current sizing). Farm stand, flower market, and oyster market location symbols were sourced from CleanPNG and PNGRepo. With custom symbols representing vineyards, agricultural attractions, and bus service, it’s easier to understand what is located where without the need to reference a legend or guide.
The experimentation I did with this data in Gephi and Tableau is what led me to my final set of networks and maps. Instead of trying to represent the entire region in one map, I created more detailed maps of the clusters identified while working with Gephi. The first is an overview of Long Island Wine Country (i.e., the North and South Forks of Long Island). The next three sets of visualizations are maps and networks representing three clusters that became prominent while exploring vineyard network statistics in Gephi and Tableau. All maps are formatted according to the descriptions offered above; all networks represent subsections of the larger network showcased earlier. Nodes are sized based on degree and colored based on betweenness centrality relative to the whole network, with brighter colors representing the highest values.
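For readers curious about the two statistics behind the node sizing and coloring, here is a minimal sketch of how degree and betweenness centrality can be computed for a small undirected network. It is an illustration under stated assumptions, not a reproduction of Gephi's reports: it uses Brandes' algorithm on an unweighted graph, returns unnormalized scores (Gephi's values may be scaled differently depending on its settings), and runs on a hypothetical five-node edge list rather than the vineyard data.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes) for an unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical edge list (assumption, not the vineyard network).
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")]
adj = build_adjacency(edges)
degree = {v: len(neighbors) for v, neighbors in adj.items()}
bc = betweenness(adj)

for v in sorted(adj):
    print(v, degree[v], bc[v])
```

In this toy network, B has the highest degree and sits on the most shortest paths, so it would be drawn largest and brightest. Computing the scores on the full network first and only then filtering down to a cluster is what keeps the colors comparable "relative to the whole network."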
The first cluster is located in Jamesport:
The next cluster is located in Cutchogue.
The last cluster is located just east of Cutchogue; what’s most interesting in comparison to the previous clusters is the location of bus stops in relation to vineyards in Jamesport and Cutchogue proper. Although there are more vineyards concentrated in Cutchogue East than in Cutchogue proper and Jamesport, public transit is not as centrally or closely located in Cutchogue East.
What is intriguing about this set of visualizations is their commonalities despite the differences in format, data fields visualized, etc. Although these visualizations represent vineyards in different ways, taken together they complement each other. The networks offer insight into proximity and spatial relations that mapping doesn’t quite capture. The maps make these places real, confirming the proximity outlined in the networks in a much less complicated way – instead of parsing overlapping buffer zones, users of these maps can focus on each individual location.
This work is an excellent foundation for work I hope to pursue in the future. A theme I would like to develop on my own is layering in crime data related to traffic offenses, particularly DUI incidents. The piecemeal nature of data collection and reporting on Eastern Long Island complicates the matter of reporting a data-driven narrative, but I feel compelled to call attention to this point of inspiration for my work. I focused on the data right in front of me and worked on ways I could expand my understanding of how to create and refine datasets for use with tools like Gephi and Tableau. I viewed this as an exercise in preparation for all the data I hope to work with in the future. Nevertheless, I hope individuals interested in visiting Long Island Wine Country and in optimizing their visit might enjoy the work I’ve done so far!