UFOs and Bigfoot


Visualization

1. Finding Data

While looking for data, I came across a list of bizarre datasets. Two of them were sightings of UFOs (I used the “scrubbed” version) and of Bigfoot (I used the “complete” version). Both datasets originally come from websites that let visitors report their sightings: the Bigfoot Field Researchers Organization (BFRO) and the National UFO Reporting Center (NUFORC).

While examining either one of these datasets alone would have been interesting enough, I felt that both represented popular myths and conspiracies in American culture. Both are shrouded in mystery and speculation, yet many sightings of both have been reported over the years. I wanted to examine the reports of these two phenomena side by side to see if there were any common patterns between them.

2. Inspiration

When I came across the data, I knew people had used maps to visualize UFO and Bigfoot sightings before. However, I had usually seen them as dot density maps like these:

UFO Sightings (Source: Vox)
Bigfoot Sightings (Source: Joshua Stevens)

While I enjoy these maps aesthetically and some trends can be gathered from them, several of the obvious clusters can be explained by the population density of the area. I wanted to reduce population density’s role in the visual representation of the clusters.

Joshua Stevens does try to solve the population density problem by providing a mini map that compares the number of reported sightings against population density by county. However, I found the color matrix difficult to follow. For my map, I wanted to simply show whether the sightings are proportionally high for the area.

I especially enjoy the graph of population versus reports that Joshua Stevens added to the Bigfoot Sightings map, which points out a couple of pop culture events on the same timeline that may have influenced the influx of sightings. However, I find the dual axes a little confusing, especially because the population axis does not start at zero. I also wish the pop culture events continued past 1967, as the graph almost implies that the US’s interest in Bigfoot grew without any media influence after that.

3. Cleaning Data

Before cleaning the data myself, I uploaded the data files to Tableau just to experiment with different formats and to see what I needed to fix. As expected, the resulting maps were more or less population density maps.

Cleaning the data for this project was a bigger ordeal than I initially thought. While all the data had been cleaned previously and was in pretty good shape apart from some missing values, I had to reformat and bin the data to suit my purposes. I used Python to scrape population data from the Wikipedia page for population of the US by decade (with help from this Medium article on converting HTML tables to pandas DataFrames). To see the original dataset, cleaned dataset, and code, check out my GitHub repository.
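
The scraping step boils down to turning an HTML table into rows of data. The sketch below is not the code from the repository. It is a minimal illustration that uses only Python's standard library (`html.parser`) on a small inline table rather than the live Wikipedia page. The sample HTML, the state names, the population figures, and the names `TableParser` and `parse_population_table` are all made up for illustration.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every <tr> in an HTML document."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text chunks of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Placeholder table: invented states and invented population figures.
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th>1990</th><th>2000</th></tr>
  <tr><td>State A</td><td>1,000,000</td><td>1,200,000</td></tr>
  <tr><td>State B</td><td>2,000,000</td><td>2,500,000</td></tr>
</table>
"""


def parse_population_table(html):
    """Return {(state, decade): population} from a wide state-by-decade table."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    table = {}
    for row in body:
        state = row[0]
        for decade, value in zip(header[1:], row[1:]):
            # Strip thousands separators before converting to int
            table[(state, int(decade))] = int(value.replace(",", ""))
    return table


population = parse_population_table(SAMPLE_HTML)
```

Keying the result by `(state, decade)` makes it easy to look up the population that matches each sighting later on. A real page would likely need extra handling for footnote markers and empty cells.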

A rundown of the edits I made to the dataset:

  • Removed rows with no date or located outside the US: This removed about a fifth of the records (~5,200 rows originally) from the Bigfoot dataset for having no date, and about a quarter of the rows (~80k rows) of the UFO dataset for being outside the US.
  • Removed irrelevant columns: I removed any columns not related to the date and location of the sighting. While I could have pared down the columns to just year and state, I decided to keep the extra information in case I or someone else wanted to use the data.
  • Added column names: The UFO dataset had no column names so I had to add them myself. I was able to confirm what the column names were supposed to be by checking the original source of the dataset, the National UFO Reporting Center site.
  • Split up the date: I broke the date into year, month, and day columns to make it easier to bin into decades.
  • Decade column: I wanted to analyze the data by decade so I made a column categorizing sightings by decade.
  • Normalized population: I talked to Professor Sula about finding a way to normalize the data for population across the decades, and they suggested weighting the data by assigning each record a value of 1 divided by the state population for that decade. I added this value as an extra column.
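
The steps above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code from the repository. The column names, the sample rows, the date format, and the population figures are all assumptions made up for the example. It also assumes that "state population for the decade" means the population recorded at the start of that decade.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical column names for a headerless file (not the real NUFORC schema).
COLUMNS = ["datetime", "city", "state", "country"]

# Made-up rows: one has no date and one is outside the US.
RAW_CSV = """\
10/10/1999 20:30,seattle,wa,us
,portland,or,us
6/1/2004 21:00,london,,gb
7/4/2005 22:15,seattle,wa,us
"""

# Made-up populations keyed by (state, decade).
STATE_POPULATION = {("wa", 1990): 5_000_000, ("wa", 2000): 6_000_000}


def clean_sightings(raw_csv, population):
    """Drop unusable rows, split the date, add decade and weight columns."""
    reader = csv.DictReader(io.StringIO(raw_csv), fieldnames=COLUMNS)
    cleaned = []
    for row in reader:
        # Remove rows with no date or outside the US
        if not row["datetime"] or row["country"] != "us":
            continue
        when = datetime.strptime(row["datetime"], "%m/%d/%Y %H:%M")
        decade = when.year // 10 * 10  # e.g. 1999 -> 1990
        cleaned.append({
            "state": row["state"],
            "year": when.year,
            "month": when.month,
            "day": when.day,
            "decade": decade,
            # Each record counts as 1 / state population for its decade
            "weight": 1 / population[(row["state"], decade)],
        })
    return cleaned


def rate_by_state_decade(rows):
    """Sum the weights, giving sightings per resident per state and decade."""
    rates = defaultdict(float)
    for row in rows:
        rates[(row["state"], row["decade"])] += row["weight"]
    return dict(rates)


cleaned = clean_sightings(RAW_CSV, STATE_POPULATION)
rates = rate_by_state_decade(cleaned)
```

Because every record carries its own weight, summing the weight column for a state and decade gives a per-capita sighting rate directly, which is convenient in a tool that aggregates by sum.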

4. Final Graphs

After cleaning all the data, I put the data back in Tableau Public to visualize it.

Map of UFO Sightings (population normalization)
Map of UFO Sightings (no normalization)
Map of Bigfoot Sightings (population normalization)
Map of Bigfoot Sightings (no normalization)

To see a decade breakdown of sightings, please click on the buttons above.

Key Decisions

  1. Using the same diverging color palette for both maps: While I had originally gone into this project assuming I would give the UFO and Bigfoot maps two different colors or palettes, I decided to keep them the same. My foremost concern for these maps is showing similar outliers between the two sets of data. I used a diverging color palette with the center set at the mean so that states with proportionally higher reporting rates would stand out more. I kept the same palette for both maps so viewers could easily compare them and see the similarities and differences.
  2. Limiting decades seen in decade view: I filtered out all decades before the 1950s because there was very little activity in them. I also filtered out 2020, as data for it was not present in both datasets.
  3. Offering line graphs of sightings over time: While I believe the decade view of the map hints at it, I wanted to make it clear to viewers that the number of reports has increased over time. I added this graph, directly inspired by Joshua Stevens’s Bigfoot graph.

Findings

At first glance, on the maps with no population normalization, population hubs like Florida, Texas, and California seem to account for a significant portion of sightings. With population normalization, however, the maps tell a different story. The Pacific Northwest seems to have a particular fascination with UFOs and Bigfoot, with Washington leading in both UFO and Bigfoot sightings.

5. Reflection

Overall, I struggled in this lab to get the visualization to a place that I thought represented the data well. While I am happy I was able to normalize for population and see the true pattern, I was disappointed with a few aspects:

  • State level aggregation: The state level aggregation gives a less precise view of the data than other maps. I would have loved to aggregate by county, but there were some issues:
    1. County was not included in both datasets.
    2. Population by county by decade was not readily available, especially given how far back some of the data goes.
  • Two different Tableau files: While I originally wanted these maps side by side, I had trouble getting Tableau to work with both datasets in the same file. I recognize that having the visualizations in two separate files makes it harder for viewers to compare the maps on their own, but I hope to fix this in the future.
  • Sightings by decade visualization: Though a big focus of mine was supposed to be the decade view, I found the resulting visualization underwhelming. Only the 2000s seem to have any interesting results, as they held the spike of reports for the entire dataset. The coloring of the sightings by decade map was set to a diverging color palette like the rest, with the mean as the center. However, I think the best approach would have been an individual color palette for each decade, centered on the mean for that decade. Unfortunately, I could not find a way to change the color palette based on the current page.
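
The per-decade centering idea from the last point can be sketched outside Tableau. The example below is only an illustration under stated assumptions: the rates are made up, the function names are hypothetical, and whether a precomputed column like this would fit the Tableau workflow used here is untested. It rescales each state's rate to a position between -1 and 1, where 0 is the mean of that state's own decade, so a single diverging palette applied to the rescaled values would be centered per decade.

```python
from collections import defaultdict
from statistics import mean


def diverging_positions(values):
    """Map each value to [-1, 1], with 0 at the mean of `values`."""
    center = mean(values.values())
    # Largest distance from the mean, so both arms of the palette share one scale
    span = max(abs(v - center) for v in values.values())
    if span == 0:
        return {key: 0.0 for key in values}
    return {key: (v - center) / span for key, v in values.items()}


def positions_by_decade(rates):
    """rates: {(state, decade): rate}. Center each decade on its own mean."""
    by_decade = defaultdict(dict)
    for (state, decade), rate in rates.items():
        by_decade[decade][state] = rate
    return {
        decade: diverging_positions(states)
        for decade, states in by_decade.items()
    }


# Made-up rates: the 2000s are far larger overall than the 1990s.
RATES = {
    ("wa", 1990): 4.0, ("or", 1990): 2.0, ("tx", 1990): 0.0,
    ("wa", 2000): 40.0, ("or", 2000): 25.0, ("tx", 2000): 10.0,
}

positions = positions_by_decade(RATES)
```

With a single mean across all decades, the quiet decades would all fall on one side of the center. Centering each decade separately lets the states that are high for their own decade stand out, at the cost of colors no longer being comparable across decades.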

Future Direction

  1. Work at a lower level and focus on more recent years: I would want to work at the county level to get a more granular view while still normalizing for population. Due to limitations in county population data, this would also mean focusing on more recent decades. That works out, however, since there is more report data to work with for those decades.
  2. Video version: I wanted to make a video version of my decade view, but I think it would work best with the county level visualization.
  3. Infographic: There are a lot of graphs available to look at and compare in my final visualization. I would love to compile them all into one infographic that gives more context for the graphs, such as pop culture events like Joshua Stevens did, and possibly add additional context about UFOs and Bigfoot.