Networking the Belgian Beer Landscape



Belgium is famous for many things: waffles, fries, and chocolate among them.

But beer is probably at the top of the list. The Delirium Cafe in Brussels has a beer menu that reads more like a book: it lists over 2,000 different brands and currently holds the Guinness World Record.

To get a better sense of the Belgian beer landscape, I started from Wikipedia's list of Belgian beers: http://en.wikipedia.org/wiki/List_of_Belgian_beer

After copying this data into Microsoft Excel, I restructured it so that I could import it into Gephi.

There were 1,594 beers in total. The list had four columns (Beer Brand, Beer Type, Alcohol Content, and Brewery), and I turned the values in each column into nodes.

Then I created a spreadsheet for the relationships. There were three: Beer Brand “isa” Beer Type, Beer Brand “HasAlcoholPercentage” Alcohol Content, and Brewery “Brews” Beer Brand.

I imported the node table, then renamed the first two columns of the edge table to Source and Target so that I could import it as well.
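The spreadsheet step above can also be scripted. The following is only a minimal sketch, not the workflow I actually used: the sample rows, the `Category` column, and the function names are made up for illustration, while `Id`, `Label`, `Source`, and `Target` are the column names Gephi's spreadsheet import looks for.

```python
import csv
import io

# Hypothetical rows in the shape of the Wikipedia list:
# (beer brand, beer type, alcohol content, brewery).
ROWS = [
    ("Example Blond", "blond", "6.5%", "Example Brewery"),
    ("Example Tripel", "tripel", "8.5%", "Example Brewery"),
    ("Example Kriek", "fruitbier", "4.0%", "Other Brewery"),
]


def build_tables(rows):
    """Return (node_rows, edge_rows) as lists of dicts, one dict per CSV row."""
    categories = {}  # node id -> category, so each node is listed only once
    edge_rows = []
    for brand, beer_type, abv, brewery in rows:
        categories[brand] = "Beer Brand"
        categories[beer_type] = "Beer Type"
        categories[abv] = "Alcohol Content"
        categories[brewery] = "Brewery"
        # The three relationships described in the post.
        edge_rows.append({"Source": brand, "Target": beer_type, "Label": "isa"})
        edge_rows.append({"Source": brand, "Target": abv, "Label": "HasAlcoholPercentage"})
        edge_rows.append({"Source": brewery, "Target": brand, "Label": "Brews"})
    node_rows = [{"Id": n, "Label": n, "Category": c} for n, c in categories.items()]
    return node_rows, edge_rows


def to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


nodes, edges = build_tables(ROWS)
nodes_csv = to_csv(nodes, ["Id", "Label", "Category"])
edges_csv = to_csv(edges, ["Source", "Target", "Label"])
print(edges_csv)
```

One caveat with this sketch: it uses the raw text as the node id, so a brand and a brewery with the same name would collapse into a single node. Prefixing ids with their category would avoid that.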

Once the data was in Gephi, I got my initial graph. I ran the statistics for Degree, Network Diameter, and Graph Density. I found that my data was extremely dense, which was going to make it a challenge to visualize. In addition, because this was multimodal data (several kinds of nodes in one network), I was limited in how much I could show.
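For readers who want to see what those three statistics measure, here is a small sketch on a made-up, undirected edge list. It is not Gephi's implementation, and the node names are hypothetical. Degree is the number of neighbors a node has, density is the share of possible edges that actually exist, and diameter is the longest shortest path between two connected nodes.

```python
from collections import deque

# Hypothetical undirected edges (edge direction is ignored here).
EDGES = [
    ("Brewery A", "Beer 1"),
    ("Brewery A", "Beer 2"),
    ("Beer 1", "blond"),
    ("Beer 2", "blond"),
    ("Beer 2", "8%"),
]


def adjacency(edges):
    """Map each node to the set of its neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(adj):
    """Undirected density: existing edges divided by possible edges."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def eccentricity(adj, start):
    """Longest shortest-path distance from start to any node it can reach (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return max(dist.values())


def diameter(adj):
    """Largest eccentricity over all nodes; unreachable pairs are ignored."""
    return max(eccentricity(adj, node) for node in adj)


adj = adjacency(EDGES)
degrees = {node: len(nbrs) for node, nbrs in adj.items()}
print(degrees, density(adj), diameter(adj))
```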

After understanding my data a little better, I ran the ForceAtlas 2 layout to get a better sense of the network's visual structure. The graph remained very dense, so I decided to experiment with which nodes to show.

[Image: Version3_2]

I realized that the ABV nodes had the smallest share of connections, so I filtered them out. I then resized the nodes on my graph and got a good sense of the most popular beer types in Belgium.
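The same idea can be sketched outside Gephi. This example assumes the nodes are sized by degree (number of connections), which is my reading of "popular" here; the data and the function names are made up for illustration.

```python
from collections import Counter

# Hypothetical node categories and edges.
NODE_CATEGORY = {
    "Beer 1": "Beer Brand",
    "Beer 2": "Beer Brand",
    "Beer 3": "Beer Brand",
    "blond": "Beer Type",
    "pilsner": "Beer Type",
    "6%": "Alcohol Content",
    "8%": "Alcohol Content",
    "Brewery A": "Brewery",
}
EDGES = [
    ("Beer 1", "blond"),
    ("Beer 2", "blond"),
    ("Beer 3", "pilsner"),
    ("Beer 1", "6%"),
    ("Beer 2", "8%"),
    ("Beer 3", "6%"),
    ("Brewery A", "Beer 1"),
    ("Brewery A", "Beer 2"),
    ("Brewery A", "Beer 3"),
]


def drop_category(node_category, edges, category):
    """Keep only the edges whose endpoints are both outside the given category."""
    keep = {n for n, c in node_category.items() if c != category}
    return [(u, v) for u, v in edges if u in keep and v in keep]


def degree_counts(edges):
    """Count how many edges touch each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


filtered = drop_category(NODE_CATEGORY, EDGES, "Alcohol Content")
deg = degree_counts(filtered)

# Rank beer types by degree; a larger degree would mean a larger node.
ranked_types = sorted(
    ((n, d) for n, d in deg.items() if NODE_CATEGORY[n] == "Beer Type"),
    key=lambda item: item[1],
    reverse=True,
)
print(ranked_types)
```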

I moved on to the Preview tab and adjusted the visualization to make it more readable. I labeled the largest nodes so that the names of the most popular beer types were easy to see.

However, there was one last step I wanted to run in the hope of separating the sections. Even though my data was multimodal, I ran Modularity.

After increasing the modularity resolution to 3.0, I discovered four sections, separated by color: red, purple, teal, and yellow. I would have liked to edit the colors, but Gephi doesn’t offer great functionality for this.

Despite this, the colors do offer the viewer a clear sense of the most popular types of Belgian beer and which of those types are most related.

The result did make some sense in the end, in that the beer types in the same section are similar.
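To give a rough idea of what modularity measures, here is a sketch that scores a given grouping of nodes. It only computes the standard modularity score for a partition that is already known. It does not search for communities the way Gephi does, and it leaves out the resolution setting. The toy graph and node names are made up.

```python
def modularity(edges, community):
    """Standard modularity of an undirected, unweighted graph for a given partition.

    For each community: (share of edges inside it) minus
    (share expected by chance, based on the degrees of its nodes).
    """
    m = len(edges)
    internal = {}    # community -> number of edges with both ends inside it
    degree_sum = {}  # community -> sum of the degrees of its nodes
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Hypothetical toy graph: two triangles joined by a single edge.
TOY_EDGES = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(TOY_EDGES, two_groups)
q_single = modularity(TOY_EDGES, one_group)
print(q_split, q_single)
```

Splitting the toy graph into its two triangles scores higher than lumping everything into one group, which is the sense in which a modularity-based coloring separates a graph into sections.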

Here is a view with larger labels:

[Image: Version5]


The yellow section was made up of the following main beer types: blond, triple, bruin, amber, and witbier

The red section: pilsner

The purple section: fruitbier

The teal section: hoge gisting

Based on the size of the nodes, Hoge Gisting proved to be the most popular beer type.

In future iterations, I would like to create a geospatial visualization with this data. It would be interesting to see what the most popular beers of different regions in Belgium are, and maybe which types are region-specific.