What’s the connection?


Visualization

Introduction

This research explores a dataset from the Biological Networks category on the Gephi wiki. “Diseasome” shows a network of diseases interconnected by gene associations. The dataset is based on the Human Disease Network study, whose objective is to “…link systematically all genetic disorders (human phenome) with the complete list of disease genes (disease genome) resulting in a global view of the ‘diseasome,’ the combined set of all known disorders/disease gene association.” My question is about the wiring of these diseases: which disease has the most connections, and which other diseases is it most closely connected to?

Background

There are said to be anywhere from 7,000 to 70,000 diseases in the world, with just 500 treatments. Cindy McConnell, a spokeswoman at NIH’s National Center for Advancing Translational Sciences, takes a more modest approach: “We (NCATS) generally say: Several thousand diseases affect humans of which only about 500 have any U.S. Food and Drug Administration-approved treatment.” We are still in a pandemic, so debates about the number of diseases that exist in the world are astonishing but meaningless without context. This particular fact check ran in the Washington Post as the current administration was working hard to make the case that the FDA should speed up its rate of approval. There has been a push to approve certain drugs to treat Covid-19 without proper testing, but this could be a huge mistake. Take, for instance, the drug thalidomide, which was given to pregnant women in the 60’s with disastrous results.

This network map shows where diseases and disorders are linked, and where one gene mutation can lead to one disorder or to several. Observing the network was inescapably overwhelming. Literally, it seems like there is nowhere to hide from this often fatal or debilitating web. In an effort to be more optimistic, my first thought was to look into a network focused on mycology and observe mycelium networks. Fungi are often used in modern medicine to cure certain diseases and to promote other health benefits, a practice that has been around since antiquity.

Lastly, being optimistic doesn’t mean one has to avoid reality. Diseases and disorders exist, and there is no way around that fact, so I prefer knowing to not knowing; ignorance is a privilege I can’t afford. The African American community has been suffering due to systemic racism that is not only ruining our health but also overrunning and destroying our healthcare systems.

Methodology

Using the Gephi wiki dataset library, I decided to look for the diseases with the most network connections and also the ones that disproportionately impact the Black community. Because the dataset was a GEXF file, Gephi’s own graph format, it opened very differently from my past Gephi experience, where I was met with a dense black box. Both starting points were interesting: with one you seem to have more control initially, and with the other you are almost working backwards. There were 1419 nodes and 3926 edges, with an average degree of 2.76. I started with the “Force Atlas” layout but moved to “Label Adjust” in order to see the labels more clearly.
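The 2.76 figure reported above is consistent with dividing the edge count by the node count (3926 / 1419 ≈ 2.767). That is what an average degree works out to if each edge is counted once per node, as it would be for a directed graph. The short sketch below checks that arithmetic. The function name and the directed flag are my own illustration, not part of Gephi, and whether Gephi treated this file as directed is an assumption.

```python
def average_degree(num_nodes, num_edges, directed=True):
    """Average degree from node and edge counts.

    Assumption: in a directed graph each edge is counted once per node
    (edges / nodes); in an undirected graph each edge touches two nodes
    (2 * edges / nodes).
    """
    if num_nodes == 0:
        return 0.0
    factor = 1 if directed else 2
    return factor * num_edges / num_nodes


# Counts reported for the Diseasome network in the text.
nodes, edges = 1419, 3926

print(f"directed-style average degree:   {average_degree(nodes, edges):.3f}")
print(f"undirected-style average degree: {average_degree(nodes, edges, directed=False):.3f}")
```

Only the first value lines up with the number Gephi reported, which is why the directed reading seems the likelier one.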

figure 1

By using the “Degree Range” filter within the “Topology” folder, I was able to filter the network down to the diseases and disorders that had at least 26 connections.
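For readers without Gephi, the same kind of degree filter can be sketched in a few lines of plain Python. This is a minimal illustration and not Gephi’s implementation: the function name and the toy edge list are hypothetical, and the sketch assumes a node’s degree is counted in the full graph before anything is filtered out.

```python
from collections import Counter


def degree_range_filter(edges, min_degree):
    """Keep nodes whose degree in the full graph is at least min_degree."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    kept_nodes = {node for node, d in degree.items() if d >= min_degree}
    # Keep only the edges whose two endpoints both survived the filter.
    kept_edges = [(s, t) for s, t in edges if s in kept_nodes and t in kept_nodes]
    return kept_nodes, kept_edges, degree


# Hypothetical toy edge list, not taken from the Diseasome file.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]

kept_nodes, kept_edges, degree = degree_range_filter(toy_edges, min_degree=2)
print(sorted(kept_nodes))
print(kept_edges)
```

On the real network the threshold would be 26 rather than 2, and the edge list would come from the GEXF file.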

I don’t think it’s surprising that Colon Cancer has the most connections, but it’s interesting to see not only the breadth but the depth. The reach of this network extends to places I hadn’t realized, and because colon and breast cancer are two of the deadliest diseases for African Americans, it’s important to look into all of the smaller communities of diseases they are connected to. Diabetes, Cardiomyopathy and Thyroid Carcinoma are all within the same cluster as Colon and Breast Cancer, and the diseases in that cluster fall under the cardiovascular, cancer and skeletal classes. The image below shows cancer in orange, skeletal in fuchsia and cardiovascular in green; the isolated blue represents developmental.

Results

After working with numerous filters, I was able to create visuals for the different concepts and ideas I wanted to convey. Showing the complexity was important to this visualization because the topic is vast, but simplifying the visualization down to a single disease was equally significant. Doing both shows the possibilities and may help with getting to the root of the problem without it feeling so mechanical. Isolating one disease allowed me to humanize a network that at first felt very unreal and simulated.

A friend who is an epidemiologist told me that 99% of our serotonin exists in the gut. While doing these visualizations, I observed that gastric cancer has one of the greatest reaches within the network. What is the norm in a sick society? Gut health is mental health, and if large populations of people, particularly Black and Brown people, are being deprived of their basic human rights, there are at least two very large issues to address, which is why we say housing is healthcare.

Further Directions

I would like to study this alongside race/ethnicity, gender, class and location. This is a study on human diseases, yet no human is represented here. It’s crucial that we start being explicit about who these diseases affect, and how, so we can start holding our governments accountable. Once we build the framework, we can start using it when demanding proper, equitable treatment. It would be a large undertaking to put all of this data together, but the work is already in progress, and it would be nice to see a more collaborative approach. It would also be interesting to bring in a data set that shows all the benefits and connections fungi have for particular diseases and disorders.

References

https://www.washingtonpost.com/news/fact-checker/wp/2016/11/17/are-there-really-10000-diseases-and-500-cures/

https://www.pnas.org/content/104/21/8685

https://github.com/gephi/gephi/wiki/Datasets