NETWORK ANALYSIS on the developers and the modules of the Perl language


Lab Reports, Networks

INTRODUCTION

Perl is a programming language that a family of high-level, universal, literal, and dynamic programming languages. Originally designed by Larry Wall to make reporting on UNIX more convenient, he decided to develop a generic manuscript language, which was published on December 18, 1987. CPAN is a online community contains a lot of software and its files written in Perl by the developers. Conducting a network analysis can display the relationship between the developers and the modules of the Perl language by linking them when they use the same module.

DATASET & SOFTWARE

The dataset was found in Github, created by Linkfluence in July 2009 and the original data has been lost. Fortunately, the data set was already organized so there was no need to do data cleansing.

Gephi is an open-source network analysis and visualization software was used to analyze and generate a visualization of the network data.

METHOD AND RESULT

First, I wanted to see which developers are the most popular. The degree represented the number of developers used his/her modules. The greater the degree, the more people used the modules.

Figure 1.0

When I imported the data to Gephi, I needed to run the avg.degree, network diameter, and graph density to computes network statistics and detect clusters. Then I chose rotate as the layout and the data looked like a mess( Figure 1.0 ).

I decided to set the ranking of degree between 0.5 and 10 which I only wanted to show the developers with the most modules without making the rest visible. I then dragged the highlighted data to the edge and turned it into a star shape to make it more visible. Color was also added from the Partition on the Appearance. Finally, the label and increased its size were added in the preview mode.

( Figure 1.1 )

From this graph, it is very obvious to lots of people used modules were created by Gisle Aas since it has the highest density(Figure 1.1). Then is Andy Lester, Adrian Howard, and Stevan Little.

Later, I wanted to see how many modules were used by the developers at the same time. I chose OpenOrd as the Layout to display the distance between the developers. From the graph(Figure 2.0), it shows that people would use the modules was developed by Gisle Aas, Andy Lester, Adrian Howard, Stevan Little at the same time, while few people chose to stick with Rocco Caputo and Apocalypse.

(Figure 2.0)

REFLECTION

Gephi was very fun to play with but it was also very complicated. For example, it took me a very long time to figure out how to increase the data size by ranking its degree and increasing its font size while making the rest disappear. I had to look at the online videos to see how people visualize their data and their method didn’t always work as I wished. I want to learn more about targeting the specific data and its edges on Gephi. It is the most important thing about we highlight the data so the audience will be able to understand what information we want to tell them.