For this lab report, I wanted to explore the field of biological networks. This information visualization shows how disorders and disease genes are associated. Apart from informing the user about how genes are related to disorders, it may also support the search for potential solutions. General users can view this visualization to expand their knowledge of diseases and genes, while researchers can use it to study the topic and help in finding cures for these diseases.
Inspiration & References
I took inspiration from various sources before starting my visualization. These sources helped me understand my topic and guided me in selecting the right way to present my data.
This class lab report was my primary source of information. It provided some great visualizations and examples for approaching my own lab report. It also helped me understand the topic and gave me a clearer view of what I wanted to achieve.
This source was a great way to look at different visualizations. It helped me understand various network layouts and how to use colors responsibly in a visualization.
I explored various websites to finalize the data for my visualization. The Gephi Wiki is one of those websites, and it provided meaningful and exciting data. For this report, I selected the Biological Network data, which describes the network of disorders and disease genes linked by known disorder-gene associations, indicating the common genetic origin of many diseases.
After finalizing the topic, I used Gephi to explore different visualizations. Since my file was in .gexf format, it was easy to import into Gephi. Gephi also supports editing the data within the software. Once the data was refined and polished, I started exploring different layouts for my report. I also explored the filter options to filter the data by modularity, degree, etc. After settling on the appropriate visualization for the data, I used Illustrator to implement the final changes.
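The import and degree filter steps can also be approximated outside Gephi. The following is a minimal sketch using only the Python standard library, not a description of how Gephi works internally. The embedded GEXF text, its node labels, and the function names `read_gexf` and `degree_filter` are hypothetical stand-ins for the real dataset file and for Gephi's filter panel.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny hypothetical GEXF document standing in for the real dataset file.
SAMPLE_GEXF = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="undirected">
    <nodes>
      <node id="d1" label="Disease A"/>
      <node id="d2" label="Disease B"/>
      <node id="g1" label="GENE1"/>
      <node id="g2" label="GENE2"/>
    </nodes>
    <edges>
      <edge id="0" source="d1" target="g1"/>
      <edge id="1" source="d1" target="g2"/>
      <edge id="2" source="d2" target="g1"/>
    </edges>
  </graph>
</gexf>"""


def local_name(tag):
    """Drop the XML namespace so the code does not depend on the GEXF version."""
    return tag.rsplit("}", 1)[-1]


def read_gexf(text):
    """Return (labels, edges): node id -> label, and a list of (source, target)."""
    root = ET.fromstring(text)
    labels, edges = {}, []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "node":
            labels[elem.get("id")] = elem.get("label", elem.get("id"))
        elif name == "edge":
            edges.append((elem.get("source"), elem.get("target")))
    return labels, edges


def degree_filter(labels, edges, min_degree):
    """Keep only nodes whose number of links is at least min_degree."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return {labels[n]: d for n, d in degree.items() if d >= min_degree}


labels, edges = read_gexf(SAMPLE_GEXF)
hubs = degree_filter(labels, edges, min_degree=2)
print(hubs)
```

With a real file, the same functions could be fed the file's contents instead of `SAMPLE_GEXF`; the threshold of 2 is arbitrary and only illustrates the idea of a degree range filter.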
After saving the visualization as an SVG, I used Illustrator to make some final changes. Illustrator made it possible to adjust the layout and place the legends.
The visuals for this data follow a top-down process: the visualization tells the story from a macro to a micro level. Overall, the visuals for this report have been kept minimal and easily distinguishable. Vibrant colors have been used so that viewers find the topic more engaging. These colors stand out well from each other and against most colored backgrounds. Listed below are three visualizations created for this data.
1 – Overall Disease and Gene relation
Here I have presented how genes and diseases are related to each other. This layout has a hierarchy of degrees and classes. Pink nodes represent genes, whereas blue nodes represent diseases. The size of a disease node represents its number of links with genes: bigger nodes have more links, and smaller nodes have fewer. This visualization helps the viewer visually identify the overall links between genes and diseases.
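The idea of sizing nodes by their number of links can be sketched as a linear mapping from degree to a size range. This is a minimal illustration under stated assumptions: the links below are hypothetical (not taken from the dataset), the function `node_sizes` and its `min_size` and `max_size` defaults are my own, and for simplicity it sizes every node, not only the disease nodes.

```python
from collections import Counter

# Hypothetical disease-gene links, used only for illustration.
links = [
    ("Disease A", "GENE1"),
    ("Disease A", "GENE2"),
    ("Disease A", "GENE3"),
    ("Disease B", "GENE1"),
]


def node_sizes(edges, min_size=10.0, max_size=50.0):
    """Map each node's degree linearly onto the range [min_size, max_size]."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    if not degree:
        return {}
    lowest, highest = min(degree.values()), max(degree.values())
    span = (highest - lowest) or 1  # avoid dividing by zero when all degrees match
    return {
        node: min_size + (d - lowest) / span * (max_size - min_size)
        for node, d in degree.items()
    }


sizes = node_sizes(links)
for node, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{node}: {size:.1f}")
```

In this toy example the node with the most links gets the largest size and nodes with a single link get the smallest, which is the same reading rule the visualization asks of the viewer.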
2 – Disease association with the degree
Here the visualization informs the user about the degree of association of each disease. With this, viewers can easily see which diseases have the most associations with genes. For example, Deafness and Colon Cancer have the most associations and hence stand out from the others. This view helps the user focus only on diseases and their associations.
3 – Colon Cancer
This visualization focuses on just one disease: Colon Cancer. The idea behind this is that we never know which disease might lead to something more serious. Hence, with this visualization, I am targeting all the neighbors related to Colon Cancer. Pink links represent the neighbors of Colon Cancer, and blue links represent the neighbors of those neighbors. This visualization serves as an example for studying other diseases that may lead to colon cancer.
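The neighbor and neighbor-of-neighbor selection described above can be sketched as a two-step walk over the links. This is a minimal sketch, assuming an undirected network; the link list is hypothetical (only the name "Colon Cancer" comes from this report), and `two_hop_neighborhood` is an illustrative helper, not a Gephi function.

```python
from collections import defaultdict

# Hypothetical links; only the name "Colon Cancer" comes from the report.
links = [
    ("Colon Cancer", "GENE1"),
    ("Colon Cancer", "GENE2"),
    ("GENE1", "Disease X"),
    ("GENE2", "Disease Y"),
    ("Disease Y", "GENE3"),
]


def two_hop_neighborhood(edges, focus):
    """Return (neighbors, neighbors_of_neighbors) of the focus node."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    first = set(adjacency.get(focus, set()))
    second = set()
    for node in first:
        second |= adjacency[node]
    # Exclude the focus node and its direct neighbors from the second ring.
    second -= first | {focus}
    return first, second


first, second = two_hop_neighborhood(links, "Colon Cancer")
print("neighbors:", sorted(first))
print("neighbors of neighbors:", sorted(second))
```

Here the first set corresponds to the nodes reached by the pink links and the second set to those reached by the blue links; anything three or more steps away (GENE3 in this toy data) is left out.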
This information visualization targets general viewers and researchers. It explores the boundaries of the biological network by providing meaningful information. Viewers can look at the data as a whole or focus on specific parts of it. With this report, I tried to maintain a balance between aesthetics and information in my visualization. This data has endless possibilities, and everyone can explore it in many different ways.
This exercise taught me to explore network data and to approach it differently. I believe that with a network dataset, every outcome can be unique in itself. Gephi is a powerful tool for exploring these endless possibilities. Not only does it provide functions for adding aesthetics to a visualization, but it also helps in manipulating data. The external plugins are also helpful for pushing the creative boundaries in Gephi. I can continue working on my visualization and dataset in the future as well. For example, what if COVID-19 data were added to this dataset? We might learn how COVID-19 is related to other diseases and genes, and we might also be able to estimate risk factors and survival rates. Before starting this lab report, I believed that network visualizations were just aesthetically pleasing, but now I understand that they hold a lot of valuable information. This exercise helped me look at network visualization differently and taught me to always prioritize the information.
- Gephi Wiki “https://github.com/gephi/gephi/wiki/Datasets”
- Stanford Large Network Dataset Collection “https://snap.stanford.edu/data/”
- Visual Complexity “http://www.visualcomplexity.com/vc/”
- Castillo, Z., A network analysis of human diseases “https://studentwork.prattsi.org/infovis/labs/a-network-analysis-of-human-diseases/”
- Jing, T., Visualizing the network of disorder and disease genes “https://studentwork.prattsi.org/infovis/labs/visualizing-the-network-of-disorders-and-disease-genes/”