Introduction
Last summer I became very interested in the culture surrounding rock climbing and mountaineering. I started to educate myself about this culture by reading literature and watching documentaries. I quickly became fascinated with the folks who took climbing to the extreme, especially considering that the risk of failure includes death, and I wanted to use a dataset that incorporated them. I stumbled across a website dedicated to the 7 Summits, the highest mountain on each of the 7 continents. The website also included datasets of statistics about the people who have successfully climbed the 7 Summits. For this lab, I decided to use the dataset that provides statistics about all the climbers who have successfully summited both the Carstensz Pyramid and the Kosciuszko versions* of the 7 Summits, the two main versions within the climbing community.
Some of the topics I wanted to explore with the dataset included:
- Gender representation**
- Nationalities of climbers
- What months are most popular for each mountain
- Whether certain mountains are favored to be summited first or last
Inspirations
Before starting on my own visualizations I conducted some research to find visualizations related to mountaineering/climbing. Below are three visualizations I found that served as inspiration for my project. They either reinforced an idea of one of the topics I wanted to explore or helped to contextualize what a gigantic achievement the climbers in my dataset undertook.
Above (Figure 1) is a visualization of the 7 Summits. It depicts the mountains according to elevation (in feet and meters) and juxtaposes them against the Great Pyramid of Giza to help contextualize scale. I enjoy how the designer chose to use images of the actual mountains, conveying climbing terrain.
The above visualization (Figure 2) is an infographic I found about gender in climbing media. Gender and gender disparity are becoming more prevalent topics of discussion in the climbing community, and this was one of the topics I wanted to explore by visualizing my chosen dataset.
The infographic above (Figure 3) depicts information about the best times to visit the Mount Everest base camp according to weather. This helps to convey how weather conditions play a crucial role in successful summits and how mountains have climbing seasons.
Materials
For this lab, I used a dataset found on the 7summits.com website. The dataset included all of the climbers who have successfully completed two versions of the 7 Summits, along with information about them and their climbs (birthday, the first mountain summited, the last mountain summited, time, dates, etc.). I used OpenRefine to clean up some of the columns of data and Tableau Public to create the visualizations.
Method
After finding the dataset, I copied the information into an Excel spreadsheet and saved a copy. I then imported that file into OpenRefine to clean up some of the data, specifically the time columns. After changing all time columns to hold just total days, instead of years and days in the same column, I saved the file. I then imported the cleaned-up dataset into Tableau Public Desktop and created a new project. Tableau split my data columns into either dimensions or measures. I then used different features and functions Tableau offers to create my visualizations.
The visualizations I worked on can be found by clicking the link below. Once you are taken to my Tableau Public page, you can look at the different visualizations I created by scrolling down to the Metadata section on the bottom right and clicking the different sheets and dashboards.
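The time conversion described above can also be sketched in Python for readers who want to reproduce it outside OpenRefine. This is only an illustration, not the OpenRefine transformation used for this lab. It assumes the original column held strings such as "2 years 45 days" (the exact format in the dataset may differ), and the name to_total_days is hypothetical. Like the cleaned dataset, it treats every year as 365 days.

```python
import re

DAYS_PER_YEAR = 365  # assumption: leap days are ignored

# Assumed input format, e.g. "2 years 45 days"; the real column may differ.
_DURATION = re.compile(
    r"^\s*(?:(\d+)\s*years?)?\s*,?\s*(?:(\d+)\s*days?)?\s*$", re.IGNORECASE
)


def to_total_days(text):
    """Convert a 'N years M days' string to a whole number of days."""
    match = _DURATION.match(text)
    # Reject strings that match nothing or that contain neither part.
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized duration: {text!r}")
    years = int(match.group(1) or 0)
    days = int(match.group(2) or 0)
    return years * DAYS_PER_YEAR + days


if __name__ == "__main__":
    for sample in ["2 years 45 days", "1 year", "300 days"]:
        print(sample, "->", to_total_days(sample))
```

Keeping the conversion in one small function makes it easy to change the assumed format or the days-per-year constant later.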
https://public.tableau.com/views/Lab1_153/Sheet10?:embed=y&:display_count=yes
Discussion of Visualizations
I was surprised by how many of the climbers in the dataset were from the United States. It wasn’t until I created the map showing the relationship between climbers and countries that I realized this (Figure 4). I also made a bar chart to convey this information as an at-a-glance view (Figure 5).
I was not surprised by the disparity between male and female climbers. I made a simple bar chart (Figure 6) to show the difference. I also wanted to see if there was a positive trend for female climbers, as depicted below in Figure 7. The visualization does show a slight positive trend, but I feel that not enough time has passed, and there is not enough data, to draw a concrete conclusion. The visualization does show that more climbers are successfully climbing the 7 Summits as the years go by.
Figure 8 below, a heat map, is interesting because it does seem to reflect the climbing seasons of individual mountains, with the exception of Mount Kilimanjaro. It is also interesting to note that not one climber started in the month of April.
I also wanted to see if climbers started and ended with the same mountains. Figure 9 is the dashboard I made containing two different heat maps portraying this. The top map is the Carstensz Pyramid version of the 7 Summits and the bottom map is the Kosciuszko version.
Future Direction
Moving forward, I would like to clean up my visualizations. For instance, I would give all axes more intuitive titles and label the sheets to make the visualizations more user-friendly. I would fix color usage for better consistency. I would also create better-curated dashboards and spend more time playing around with that feature. I would also like to spend more time with sheets 1, 17, and 11, which deal with time and age information. I was still in the beginning/experimental stage of creation and do not think they do justice to the questions I was asking when creating them.
I would like to account for leap years in the time columns of my dataset, something I did not do in OpenRefine because it was difficult to do accurately without going row by row.
I think many of my visualizations serve as a jumping-off point for asking more questions. For instance, sheets 14 and 15 illustrate the starting and ending mountains of climbers, but I would like to investigate this further. One way this could be done is to use another dataset that includes each mountain's difficulty rating to see if a pattern emerges. I would also like to know whether a climber was sponsored, money being a huge obstacle when it comes to completing the challenge, and see if that influences the amount of time it took climbers to finish. I would also like to expand the dataset to include the order and date of every mountain the climber summited to get a complete story. Lastly, I think it would be interesting to include summit attempts that failed.
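One possible way to handle leap years without going row by row is to compute each duration from the summit dates themselves, since calendar arithmetic already counts leap days. The Python sketch below is only an illustration under stated assumptions: it assumes the first and last summit dates are available as YYYY-MM-DD strings (the dataset may store them differently), and the name days_between is hypothetical.

```python
from datetime import date


def days_between(first_summit, last_summit):
    """Exact number of days between two ISO dates (YYYY-MM-DD)."""
    start = date.fromisoformat(first_summit)
    end = date.fromisoformat(last_summit)
    if end < start:
        raise ValueError("last summit precedes first summit")
    # Subtracting dates gives a timedelta that already includes leap days.
    return (end - start).days


if __name__ == "__main__":
    # 2004 is a leap year, so this span is one day longer than 365.
    print(days_between("2004-01-01", "2005-01-01"))
    print(days_between("2005-01-01", "2006-01-01"))
```

Applied to a whole column, this would replace the fixed 365-day year with the real calendar length of each climber's challenge.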
*Please note that different versions of the 7 Summits exist depending on the definition of a continent and accuracy of elevation readings at the time. Climbingthesevensummits.com has a page defining the 7 Summits that does a good job explaining the differences between versions, for those who are interested.
**Please note the dataset included only female or male under the gender column. I cannot be sure how gender is being defined in this dataset or if gender identity has been taken into account.