Introduction
The pharmaceutical industry in the U.S. has experienced immense growth, especially given the fight against COVID-19 throughout 2020. But could carefully developed and tested drugs, even when administered or prescribed by physicians, ever result in harm?
Some drugs can produce detrimental effects if mixed with certain foods, alcohol, or even herbal supplements. One possible result of such an interaction is toxicity, defined as a detrimental effect on one or more organs or systems. These effects can range from reversible to life-threatening. Here I’ve decided to explore the most highly interactive drugs in terms of drug-to-drug toxicity by using network visualizations in Gephi.
Comparative Analysis
In beginning this project, I came across the following drug interaction network, also made in Gephi. The modularity classes are color-coded and show clustering by drug indication (use). While I wanted to explore how different drug types can interact with each other, I thought it would be interesting to focus on specific interactions instead. I really liked the use of bright edges to direct visual attention to how strongly seemingly unrelated drugs can affect each other.
Materials
DrugBank is a public resource that lets users search pharmaceutical information, including drug specs, descriptions, and structures. To access data specific to drug-to-drug interactions (DDI), I used an extracted dataset from Mendeley Data. This provided three CSV files: the DDI data, the DDI types, and the DDI merged types.
The information was then imported into OpenRefine for cleaning. I also added the DDI types to each row to describe the type of interaction the source and target drug had.
Finally, the CSV was imported into Gephi, an open-source network analysis software, to begin the visualization.
Methods
While I originally imported all 200k+ drug interaction edges into Gephi, I found the visualization too slow and complex to work with. There were also over 20 different drug interaction categories. I pivoted instead to a specific interaction category that involved fewer edges. In OpenRefine, I narrowed the data down to only the interactions involving toxicity. I also added a column specifying that the edges were Undirected, because the interaction between two drugs does not depend on their order.
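For readers who would rather script this step than use OpenRefine, the filtering could be sketched roughly as follows. This is only an illustration, not the workflow used here: the sample rows are made up, the `interaction` column name is an assumption about how the DDI type might be stored, and matching on the word "toxic" is a guess at how the category could be selected. `Source`, `Target`, and `Type` are the column names Gephi looks for in an edge list.

```python
import csv
import io

# Made-up sample rows; the real extract's column names and values may differ.
RAW_CSV = """Source,Target,interaction
Drug A,Drug B,cardiotoxic activities
Drug C,Drug D,neurotoxic activities
Drug A,Drug E,serum concentration
"""

def toxicity_edges(raw_csv, keyword="toxic"):
    """Keep rows whose interaction mentions the keyword and mark them Undirected."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        if keyword in row["interaction"].lower():
            row["Type"] = "Undirected"  # column Gephi reads for edge direction
            rows.append(row)
    return rows

edges = toxicity_edges(RAW_CSV)
for edge in edges:
    print(edge["Source"], edge["Target"], edge["interaction"], edge["Type"])
```

With the sample above, the third row is dropped because its interaction type does not mention toxicity.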
Once the data was exported as a CSV, the work in Gephi began. I imported the edge list into Gephi’s Data Laboratory tab, where Gephi automatically generated a node list of drug names from the unique strings in the edge list’s sources and targets. I then copied the node Id data to the Label column so that the nodes would display drug names as labels in the final visualization.
When I switched to the Overview tab, Gephi’s default settings displayed all the data as a plain black visualization, as shown below:
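The node list Gephi builds here is conceptually simple, and a small sketch may make it concrete. The code below is my own approximation of the idea, not Gephi's implementation: it collects every unique name from the sources and targets, then copies each `Id` into `Label`. The drug names are placeholders.

```python
def build_nodes(edges):
    """Return one node row per unique drug name, with Label copied from Id."""
    names = sorted({name for source, target in edges for name in (source, target)})
    return [{"Id": name, "Label": name} for name in names]

# Placeholder edge list: (source, target) pairs.
edges = [("Drug A", "Drug B"), ("Drug B", "Drug C"), ("Drug A", "Drug C")]
nodes = build_nodes(edges)
for node in nodes:
    print(node["Id"], node["Label"])
```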
In the Statistics tab, I ran several calculations: average degree, network diameter, modularity, average path length, and average clustering coefficient. The average degree was 8.5, meaning each drug had known toxic interactions with 8 to 9 other drugs on average, even though mixing just two of these drugs could be devastating.
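To show what the average degree measures, here is a minimal sketch on a toy undirected edge list (the drug names are placeholders, not data from this project). Each edge adds one to the degree of both of its endpoints, so the average works out to twice the number of edges divided by the number of nodes.

```python
from collections import Counter

def average_degree(edges):
    """Average number of connections per node in an undirected edge list."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return sum(degree.values()) / len(degree)

# Toy network: 4 edges among 4 nodes.
edges = [
    ("Drug A", "Drug B"),
    ("Drug A", "Drug C"),
    ("Drug A", "Drug D"),
    ("Drug B", "Drug C"),
]
print(average_degree(edges))  # 2 * 4 edges / 4 nodes = 2.0
```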
After looking at several different Layouts that would determine the shape the visualization took, I chose Force Atlas, and then Label Adjust. Without Label Adjust, the drug names were illegible due to overlapping.
I then applied a Range filter to Degree so that only nodes with a degree of 3 or more were included, and set the average clustering coefficient to 0.20. This helped to eliminate some drugs that were not very well integrated, such as Phenobarbital, as shown below:
Since the data listed the interaction type for every edge, it was simple to partition the edges and color-code them by interaction. When I then partitioned the nodes by modularity, the modularity classes matched up well with the clusters the layout had produced.
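The degree filter can be approximated in a few lines. This sketch assumes the degree is computed once on the full network and that an edge survives only if both of its endpoints do; that is my reading of how a degree range filter behaves, not a description of Gephi's internals. The names and edges are placeholders.

```python
from collections import Counter

def filter_by_min_degree(edges, min_degree=3):
    """Drop nodes whose degree is below min_degree, along with their edges."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    keep = {node for node, count in degree.items() if count >= min_degree}
    return [(s, t) for s, t in edges if s in keep and t in keep]

edges = [
    ("Drug A", "Drug B"), ("Drug A", "Drug C"), ("Drug A", "Drug D"),
    ("Drug B", "Drug C"), ("Drug B", "Drug D"), ("Drug C", "Drug D"),
    ("Drug D", "Drug E"),  # Drug E has degree 1 and is filtered out
]
filtered = filter_by_min_degree(edges)
print(len(edges), len(filtered))
```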
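Partitioning edges by interaction type amounts to grouping them by label and counting each group's share of the total. A minimal sketch with made-up labels and counts (not this project's figures):

```python
from collections import Counter

def partition_shares(edge_types):
    """Fraction of edges that fall under each interaction label."""
    counts = Counter(edge_types)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}

# Made-up labels: 3 cardiotoxic, 1 nephrotoxic, 1 neurotoxic.
edge_types = ["cardiotoxic"] * 3 + ["nephrotoxic"] + ["neurotoxic"]
shares = partition_shares(edge_types)
for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```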
The final visualization can be seen below:
Interpretation
Once the visualization was complete, it was very clear which drugs should hold a bad reputation for toxicity interactions, and for what type of toxicity.
Definitions:
–Cardiotoxic activities: damaging to the heart muscle
–Nephrotoxic activities: damaging to the function of the kidneys
–Hepatotoxic activities: damaging to the liver
–Neurotoxic activities: damaging to the nervous system
–Ototoxic activities: damaging to the ear (hearing or balance)
Overall, cardiotoxicity was the most common toxic event, with over 60% of the known toxicity interactions falling into this category. Represented by the cluster of pink, these connections commonly include interactions with Acetyldigitoxin and Deslanoside (224 connections each), which are both used to treat cardiac failure. Cyclophosphamide (212 connections) is a chemotherapy and immunosuppressive drug, which also had a high degree of connections. While these drugs can have therapeutic and life-saving benefits on their own, it’s obvious that they must be administered under careful supervision to prevent detrimental effects on the heart.
In contrast, the neurotoxic activities were all isolated in a cluster to the left, representing only 19 different interactions, most of which involved the drug Atomoxetine, a noradrenaline reuptake inhibitor. There is evidence in rodents that this drug can have neurodegenerative effects on the cerebellum and hippocampus, but mixing it with other chemicals provides even greater potential for damage.
Overall, the most interactive toxic drugs can be seen below:
Reflection
Working with toxicity was really interesting, but future work could involve exploring the other interactions as well. From the beginning, I was also hoping to include the common use for each drug as another dimension but was unable to find a free public database that would allow me to extract this information. While I am familiar with pharmacology to a certain degree, it is impossible for me to recognize all the drug nodes that I made use of. For future work, it would be interesting to produce a hypergraph as well based on the modularity information.