Crypto Prices and Twitter


Visualization

Background

Introduction

Cryptocurrencies are a form of digital currency that is maintained by a decentralized system using cryptography, rather than by a centralized authority. The most popular cryptocurrency is Bitcoin, which was also the first cryptocurrency to be created. Since Bitcoin’s first release in 2009, over a thousand other cryptocurrencies have been created. 

While the longevity of cryptocurrencies is still debated, their relevance is undeniable. At the time of writing, the combined market cap of the two most popular assets, Bitcoin and Ethereum, is nearing 1.5 trillion US dollars.

For this project, I wanted to understand the relationship between the price of a cryptocurrency and its social media prominence. I selected Twitter as the social media data source because it is a de facto news source and social channel for crypto enthusiasts, and users tend to tweet their thoughts in real time. Reddit is also a popular social channel for crypto enthusiasts, but it tends to have longer, more descriptive posts.

The three cryptocurrencies that I will explore in this project are Bitcoin (BTC), Ethereum (ETH), and Dogecoin (DOGE). Bitcoin was selected because it was the first cryptocurrency created and remains the most popular and most valuable. Ethereum is the second most popular cryptocurrency, and provides the technical foundation for many other crypto assets (like NFTs). Dogecoin was the first cryptocurrency created as a joke, but continues to maintain a popular following and online community.

The mechanics of cryptocurrencies and blockchain are beyond the scope of this project, but further information on crypto basics can be found here on Coinbase. For the purpose of this project, we can think of cryptocurrencies as a type of digital asset that can be traded like traditional stocks.

Inspiration

I drew inspiration for this project from two projects completed for this class earlier in the semester. In my previous projects, I studied the history of stock charts and Twitter reactions to the Hong Kong National Security Law approval. For my final project, I wanted to combine these learnings with my personal interest in crypto.

A visual inspiration for this project was Google Trends. Google Trends allows you to see the interest in a particular subject over time based on how many people have searched Google for that subject. For example, below is the graph for Bitcoin and Ethereum search trends since 2009.

Another inspiration for this project and parallel work is Coindesk’s Social Analysis. Coindesk uses data powered by Into The Block to show the positive, negative, or neutral sentiment of tweets published in relation to the price of a crypto asset. I actually discovered this analysis after writing my proposal for this project and was shocked by the similarity, but decided to continue with the topic and explore a more granular scope than Coindesk offers.

Hypothesis

I predict that there will be a positive correlation between a crypto asset’s price and its Twitter prominence. As the value of the asset increases, so will interest in the asset, so I expect the tweet count to rise shortly after the price does.

User Research

I conducted initial user research in order to gather alternative hypotheses and gain insight into the cues needed to interpret a chart like this. I conducted 2 in-person interviews and integrated the findings as considerations into the final chart.

User Personas

These two profiles were created to represent user personas at opposing ends of the crypto experience spectrum. 

  • Crypto newcomer — Someone who has never actively engaged in the crypto space. The newcomer most likely does not have an account on any crypto platform and has never purchased crypto.
  • Crypto expert — Someone who actively engages in the crypto space, either by trading, mining, or purchasing crypto assets. The expert might also engage in social forums or be involved in building new crypto projects.

Participant Recruitment

Since I did not have a budget to compensate research participants, I chose to recruit 2 participants from my personal network. I acknowledge some recruitment bias here, as both participants are software engineers I know from a previous job.

Research Questions

Concept Testing
  1. In your own words, can you tell me what a cryptocurrency is?
  2. What is your experience with crypto?
  3. Do you currently hold any crypto assets? Why or why not?
  4. What are your concerns with crypto?
  5. What social media platforms do you use?
  6. How often do you post on social media?
  7. When you post on social media, what topics do you discuss?

Hypothesis Testing
  1. Plot the price of Bitcoin this year.
  2. Based on your visual from the first question, plot how you think the price of Bitcoin and social media mentions overlap.
  3. When shown a visualization of the above, what will you be looking for? What will help you understand what you’re looking at?

Research Session Summaries

P1: Crypto newcomer

A software engineer who has never given much thought to the crypto space. She described herself as a risk-averse trader when asked about her investments in traditional stocks, and identified a high barrier to entry that prevents her from getting more into crypto. P1 does not hold any crypto assets, and doesn’t anticipate purchasing any in the future. On social media, P1 frequently uses Instagram, TikTok, Facebook, and Snapchat and has accounts on Twitter, Reddit, and Tumblr as well. She uses social media daily, but only posts occasionally. P1 posts content about her own life, but doesn’t engage with or post on other topics.

“You spend time solving little math problems and when you solve one it outputs a little money. Crypto is worth as much as other people are willing to pay you for it”

“It’s a lot of effort for very little pay off. It confuses me how to even get into it.”

During hypothesis testing, P1 predicted a positive correlation between the price of Bitcoin and the number of social media mentions of the asset. P1 expressed that she doesn’t follow the crypto market, but assumed that the price went up and down around the same time as AMC and GameStop.

Participants were asked to plot the price of Bitcoin (green) and the number of social media mentions (red).

“I assume [the price of Bitcoin went up] sometime around when everyone was buying Gamestop and AMC … I equate those together”

“I assume the more people that have Bitcoin, the higher the price and therefore the more people are talking about it.”

P2: Crypto expert

A software engineer who has previously mined crypto. He considers himself to have a high risk tolerance with investments, but isn’t willing to invest more than he is willing to lose. When asked to define cryptocurrency, P2 offered an in-depth explanation involving key features of crypto and its technical foundations. P2 expressed some concern with crypto, including its harsh environmental impact, high transaction fees, and examples of exchange platforms going bankrupt. P2 has accounts on Facebook and Instagram, but very rarely posts on either.

“Transactions are logged in a ledger in the blockchain. Most cryptos have nothing backing their tangible worth, but purely have value because they’re able to be traded … because people place value on the individual token.”

“Mining is horrible for the environment. It’s extremely volatile. Transaction fees are very high to get in and out of the network. There’s been many cases of exchanges going under.”

During hypothesis testing, P2 predicted peaks in social media mentions at both peaks and valleys of Bitcoin price fluctuations.

Participants were asked to plot the price of Bitcoin (green) and the number of social media mentions (red).

“I think tweets go up any time there’s volatility in the price. I think overall, interest in Bitcoin has increased over time.”

Process

Tools

  • Twitter Developer – Twitter’s developer platform provides a publicly available API to search and query tweets. For this project, I used the Tweet Count API endpoint to collect Twitter mentions of each cryptocurrency.
  • Crypto Compare – Crypto Compare provides historical data on cryptocurrencies. I used the Crypto Compare API to pull hourly price data for Bitcoin, Ethereum, and Dogecoin.
  • Postman – Postman is a developer tool that manages API calls. I used Postman to reach the Twitter API and to save the returned responses as JSON objects.
  • Colaboratory – Colaboratory is a hosted Python notebook created by Google that allows you to write, run, and share Python code without having to set up a Python environment on your computer. I used Colaboratory to write and publish my code and charts.

Methods

I began by assembling my data sets. I applied for a Twitter developer account and discovered that the default Twitter developer access only allowed me to pull up to 7 days of Twitter data. In order to get access to the full historical data since Bitcoin’s inception in 2009, I applied for an academic research developer account. However, the criteria for academic research access required me to be enrolled in a PhD program, so I was unable to get this access. Instead, I rescoped my project from historical to granular, and used the Twitter API to pull hourly tweet counts instead of the monthly tweet counts I originally intended. The API returned JSON responses, which I was able to use directly. For each cryptocurrency, I queried for both the name of the asset (e.g., Bitcoin) and its asset code (e.g., BTC).
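As a rough illustration of how a saved tweet-count response can be turned into hourly rows, here is a minimal sketch. It assumes the JSON follows the Twitter API v2 tweet counts shape (a `data` list of buckets with `start`, `end`, and `tweet_count`). The sample payload, its numbers, and the helper name `parse_tweet_counts` are hypothetical and are not the project's actual data or code.

```python
import json
from datetime import datetime, timezone

# Illustrative payload only: shaped like a tweet-count response, values are made up.
SAMPLE_RESPONSE = """
{
  "data": [
    {"start": "2021-01-01T00:00:00.000Z", "end": "2021-01-01T01:00:00.000Z", "tweet_count": 120},
    {"start": "2021-01-01T01:00:00.000Z", "end": "2021-01-01T02:00:00.000Z", "tweet_count": 95}
  ],
  "meta": {"total_tweet_count": 215}
}
"""


def parse_tweet_counts(raw_json):
    """Return a list of (hour_start_utc, tweet_count) tuples sorted by hour."""
    payload = json.loads(raw_json)
    rows = []
    for bucket in payload.get("data", []):
        # Assumption: timestamps are ISO 8601 strings in UTC ending in "Z".
        start = datetime.strptime(bucket["start"], "%Y-%m-%dT%H:%M:%S.%fZ")
        rows.append((start.replace(tzinfo=timezone.utc), int(bucket["tweet_count"])))
    return sorted(rows)


hourly_counts = parse_tweet_counts(SAMPLE_RESPONSE)
for hour, count in hourly_counts:
    print(hour.isoformat(), count)
```

The same helper could be run once on the response for the asset name query and once on the response for the asset code query, giving two hourly series per cryptocurrency.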

For the crypto historical data, I originally intended to use CoinMarketCap’s Historical Data Snapshots, but found that the data was not granular enough for my new scope, as it was collected daily rather than hourly. Instead, I used the Crypto Compare API to pull hourly price data. This API also returned a JSON response.
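The price responses can be flattened in a similar way. The sketch below assumes a response shaped like Crypto Compare's hourly history data, with candles nested under `Data.Data` and a Unix `time` field in seconds. That shape, the choice of `close` as the hourly price, the sample values, and the helper name `parse_hourly_prices` are all assumptions made for illustration.

```python
import json
from datetime import datetime, timezone

# Illustrative payload only: values are made up, not real market prices.
SAMPLE_PRICES = """
{
  "Response": "Success",
  "Data": {
    "Data": [
      {"time": 1609459200, "open": 100.0, "high": 110.0, "low": 95.0, "close": 105.0},
      {"time": 1609462800, "open": 105.0, "high": 108.0, "low": 101.0, "close": 102.5}
    ]
  }
}
"""


def parse_hourly_prices(raw_json):
    """Return a list of (hour_utc, close_price) tuples sorted by hour."""
    payload = json.loads(raw_json)
    if payload.get("Response") != "Success":
        raise ValueError("price request did not succeed")
    rows = []
    for candle in payload["Data"]["Data"]:
        # Assumption: "time" is a Unix timestamp in seconds (UTC).
        hour = datetime.fromtimestamp(candle["time"], tz=timezone.utc)
        rows.append((hour, float(candle["close"])))
    return sorted(rows)


hourly_prices = parse_hourly_prices(SAMPLE_PRICES)
for hour, close in hourly_prices:
    print(hour.isoformat(), close)
```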

To create my visualization, I created a notebook on Colaboratory and imported my data. I selected Python for this project since I’m familiar with the syntax and find it to be a straightforward programming language. I mainly used two libraries to process and visualize the data: Pandas and Plotly.

An issue I ran into while importing my data was that the date-time formats of the Twitter and Crypto Compare data differed, so I used the Pandas date-time method to standardize the format. Another issue was that tweet counts and prices are on very different scales, so I plotted the tweet count against the left y-axis and the asset price against the right y-axis.
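A minimal sketch of the date-time standardization step, assuming the Twitter data carries ISO 8601 strings and the price data carries Unix timestamps in seconds. The column names and values here are hypothetical; `pd.to_datetime` is the Pandas function doing the conversion.

```python
import pandas as pd

# Hypothetical frames standing in for the two imported data sets.
tweets = pd.DataFrame({
    "start": ["2021-01-01T00:00:00.000Z", "2021-01-01T01:00:00.000Z"],
    "tweet_count": [120, 95],
})
prices = pd.DataFrame({
    "time": [1609459200, 1609462800],  # Unix seconds
    "close": [105.0, 102.5],
})

# Convert both timestamp styles into timezone-aware UTC datetimes.
tweets["hour"] = pd.to_datetime(tweets["start"], utc=True)
prices["hour"] = pd.to_datetime(prices["time"], unit="s", utc=True)

# Force an identical dtype on both key columns so the merge keys compare equal.
tweets["hour"] = tweets["hour"].astype("datetime64[ns, UTC]")
prices["hour"] = prices["hour"].astype("datetime64[ns, UTC]")

# One row per hour with both the tweet count and the price.
merged = tweets[["hour", "tweet_count"]].merge(
    prices[["hour", "close"]], on="hour", how="inner"
)
print(merged)
```

Once both series share the same hourly key, they can be drawn on a common time axis.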

Based on feedback from user research, I added interactivity to the charts, so users can hover over a data point to see the exact tweet count or asset price. I also added a line that aggregates the tweet counts for asset names and asset codes, as I felt this would be more representative of overall Twitter relevance.

Product

You can view my code and charts on my Colaboratory Notebook here.

Interpretation

I was surprised to see the prominence of asset code (e.g., ETH) mentions compared to asset names (e.g., Ethereum). I predicted that the asset name would be more prevalent than the asset code. This prediction held for Bitcoin, but the opposite was true for Ethereum and Dogecoin.

Reflection

If I had the resources to further iterate on this project, I would have loved to explore my initial scope of full historical data. I intentionally pulled the data during a week in which I knew the overall crypto market experienced a large crash, but I would have liked to view a longer time span to see both record-breaking gains and market crashes.

By looking at a larger scope, I would have been able to show the data daily rather than hourly in order to get around the clear cycle in Twitter activity that mirrors the days and nights in the US.

An additional improvement I am interested in seeing is the overall number of unique people who hold Bitcoin and how that compares to the overall increase in Twitter mentions over time.
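For the daily view mentioned above, one possible approach is to resample the hourly data in Pandas. This is a sketch only: the column names and numbers are hypothetical, and summing tweets while taking the last hourly price of each day is one of several reasonable aggregation choices.

```python
import pandas as pd

# Two made-up days of hourly data with a simple quiet/busy cycle in tweet counts.
hours = pd.date_range("2021-01-01", periods=48, freq="60min", tz="UTC")
hourly = pd.DataFrame(
    {
        "tweet_count": [5] * 12 + [15] * 12 + [10] * 12 + [30] * 12,
        "price_usd": [100.0] * 24 + [110.0] * 24,
    },
    index=hours,
)

# Collapse to one row per calendar day (UTC): total tweets and the day's last price.
daily = hourly.resample("D").agg({"tweet_count": "sum", "price_usd": "last"})
print(daily)
```

Aggregating by day removes the within-day rhythm of tweet activity, at the cost of hiding hour-level reactions to price moves.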

References