A network analysis of the monopolies on international cricket

Introduction

Ever since the ‘Big Three’ model was introduced in the International Cricket Council (ICC) in 2014, there has been a heated debate about the imbalance in international cricket between the playing nations. The so-called ‘Big Three’ are India, Australia and England, the three nations that generate the most revenue from international cricket tournaments. These three nations have argued that since they generate the most revenue for the sport, they should have a bigger say in how it is governed worldwide. This led to the much-contested ‘Big Three’ restructuring of the ICC. Opponents of this model argue that a historic bias in the ICC in favor of these nations forms a self-serving basis for the imbalance. Many have argued that the high concentration of cricket within a limited number of countries has hampered the growth and development of the sport worldwide. In this network analysis, I map the disparity in the frequency of matches between the playing nations. The results suggest that since the year 2000, the Big Three have played the most matches overall and, among all of the Test playing teams, have played the fewest matches against the ‘Associate’ nations.

Some key information to know about Cricket

Test Cricket: Cricket was originally played in the five-day format. Even though this format is seen as the most prestigious form of cricket, it is no longer widely viewed. Only the top performing teams are allowed to play Test Cricket.
One-day Internationals (ODIs): A newer format of the game, introduced in the 1970s, in which a match lasts at most one day. ODIs are more commercially successful for the ICC and more widely viewed than Test cricket.
Test Playing Member: Those top-performing teams which have gained ‘Test’ status under the ICC and are allowed to play Test matches. All Test playing members also play ODIs.
Associate Member: All those member nations who are not Test playing members. These teams are not automatically included in the ICC World Cup. In each cycle, the ICC holds qualifying matches for these teams to pick two to three who get to participate in the ICC World Cup with the Test playing nations.

Dataset

I use the record of all ODI matches provided by Cricinfo. For simplicity, I only collect matches played from 2000 through 2020. I also use the list of Test playing nations to categorize the team nodes into four types. The four types are:

Big Three: India, Australia and England
Test: All other Test playing nations apart from the Big Three.
Associate: All Associate member nations
International: All ad-hoc teams like the World XI and Asia XI
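
The four-way categorization above can be sketched in a few lines of Python. This is only an illustration, not the workflow used in this post (the categorization here was done by hand). The team lists are assumptions: the post does not publish its exact lists, so the sketch uses the long-standing Test nations and treats every other national side as an Associate. Ireland and Afghanistan only gained Test status in 2017, and the Results section below groups Afghanistan with the Associate teams, so they are left out of the Test set here.

```python
# Assumed team lists for illustration; not taken from the original dataset.
BIG_THREE = {"India", "Australia", "England"}
OTHER_TEST = {
    "South Africa", "West Indies", "New Zealand", "Pakistan",
    "Sri Lanka", "Zimbabwe", "Bangladesh",
}
AD_HOC = {"World XI", "Asia XI"}  # ad-hoc sides named in the post


def categorize(team: str) -> str:
    """Return the node type for a team name."""
    if team in BIG_THREE:
        return "Big Three"
    if team in OTHER_TEST:
        return "Test"
    if team in AD_HOC:
        return "International"
    # Anything else is assumed to be an Associate member.
    return "Associate"


for name in ["India", "Bangladesh", "Kenya", "Asia XI"]:
    print(name, "->", categorize(name))
```

The resulting labels could be stored as an extra column in a node table, so that the node type is available as an attribute once the data is in Gephi.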

Other visualizations of ODI matches data

The large number of data points available for each cricket match means there are countless ways of visualizing the different dimensions of the sport. Some excellent examples include this visualization technique for comparing batting and bowling techniques between two teams and this visualization of competitiveness, which analyzes the change in rankings for different teams. However, I found few examples that used network analysis as the primary visualization technique. Two notable examples are shown below.

This visualization by Joshi (2020) uses the network approach to analyze synergies between batsmen in the Indian Premier League.

This network visualization by Hussain & others (2019) uses network analysis to visualize ODI match data.

This study builds on the visualization of ODI match data shown above. In particular, it uses statistics such as density and modularity to differentiate the teams, and it includes all of the matches played by the Associate teams.

Process

The two tools I used for this analysis are Gephi, an open source graph visualization program, and MS Excel, which I used to collect, clean and structure the data. The first step was copying the data from the Cricinfo website and pasting it into Excel. I then removed the unnecessary columns and renamed the two team columns to ‘Source’ and ‘Target’ so that Gephi could recognize them. Each row in the data represents a single match played between two teams. Each team becomes a node and each match becomes an edge. I set all of the edge types to ‘undirected’ and gave every edge an equal weight of ‘1’. This gave me the edge table that I could import into Gephi.
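
For readers who prefer a script to a spreadsheet, the same edge table could be produced roughly as follows. This is a minimal sketch rather than the process used here (which was done in Excel); the sample matches are hypothetical, and the column names follow the Source, Target, Type and Weight headers that Gephi reads from an edges table.

```python
import csv
import io

# Hypothetical sample rows: one (team, opponent) pair per match.
matches = [
    ("India", "Australia"),
    ("India", "Australia"),
    ("Kenya", "Scotland"),
]


def build_edge_rows(match_pairs):
    """Return one undirected edge row per match, keeping repeated pairings."""
    return [
        {"Source": a, "Target": b, "Type": "Undirected", "Weight": 1}
        for a, b in match_pairs
    ]


def to_csv(rows):
    """Serialize the edge rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["Source", "Target", "Type", "Weight"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


edge_csv = to_csv(build_edge_rows(matches))
print(edge_csv)
```

Writing `edge_csv` to a file would give a table with one row per match, which is the shape described above.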

An important point is that while importing the data into Gephi, I specifically chose not to merge the edges. This option is available in the advanced options area, in the ‘merge edges’ dropdown. Because a large number of edges connect the same two teams, Gephi would otherwise try to merge all of them into a single edge.
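
To see why the merge setting matters, the sketch below compares the two representations on a few hypothetical matches. It does not reproduce Gephi's import logic; it only shows that merging collapses repeated pairings into one edge per pair (with the match count as a weight), whereas the unmerged table keeps one edge per match.

```python
from collections import Counter

# Hypothetical matches; the same two teams can meet many times.
matches = [
    ("India", "Australia"),
    ("Australia", "India"),
    ("India", "Australia"),
    ("Kenya", "Scotland"),
]


def merged_edges(match_pairs):
    """Collapse parallel edges: one entry per unordered pair, weighted by count."""
    return Counter(frozenset(pair) for pair in match_pairs)


unmerged_count = len(matches)   # one edge per match
merged = merged_edges(matches)  # one edge per pair of teams

print("Unmerged edges:", unmerged_count)
print("Merged edges:", len(merged))
for pair, weight in merged.items():
    print(sorted(pair), "weight", weight)
```

Keeping the edges unmerged means that edge counts and degrees in the network correspond directly to numbers of matches.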

Within Gephi, the first step was to apply the Force Atlas 2 layout, which produces a standard force-directed network. Since I wanted to build a whole network and focus on all of the relationships between the nodes, I felt this was the best layout for this analysis. Secondly, I generated some basic statistics such as Average Degree, Network Diameter and Graph Density to get a sense of the interconnectedness of the data. Lastly, I ran the Modularity algorithm to generate clusters within the data. After this, I linked the degree of each node (the number of matches played) to its size and the modularity cluster to its color.
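
As background on what the modularity step measures, the sketch below computes the standard modularity score Q for a partition that is already given. It is not the algorithm Gephi runs to find the clusters, and the teams and groupings are hypothetical. For each cluster it takes the share of all edges that fall inside the cluster and subtracts the share expected by chance from the degrees of the cluster's nodes.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a fixed partition of an undirected multigraph.

    edges: list of (u, v) pairs, one per match (parallel edges allowed).
    community: dict mapping each node to a cluster label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the cluster
    degree_sum = defaultdict(int)  # total degree of the cluster's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical example: two pairs of teams that only play each other.
edges = [
    ("India", "Australia"), ("India", "Australia"),
    ("Kenya", "Scotland"), ("Kenya", "Scotland"),
]
two_clusters = {"India": 0, "Australia": 0, "Kenya": 1, "Scotland": 1}
one_cluster = {team: 0 for team in two_clusters}

q_split = modularity(edges, two_clusters)
q_single = modularity(edges, one_cluster)
print(q_split, q_single)
```

A partition that groups teams which mostly play each other scores higher than lumping every team into a single cluster, which is why the resulting classes tend to follow the most frequent pairings.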

Results

Some overall statistics:

Nodes (teams): 27
Edges (matches): 2730
Average Degree: 202.2 – On average each country has played a total of ~202 matches
Network Diameter: 5 – The two teams that are furthest apart in the network are 5 jumps from each other, which serves as a measure of interconnectedness
Graph Density: 7.778 – The density is above 1 because repeated matches between the same two teams are kept as separate edges: 2730 matches spread over the 351 possible pairings of 27 teams is about 7.8 matches per pairing
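
The average degree and density figures above can be checked by hand from the node and edge counts. The short sketch below does the arithmetic, assuming the usual definitions for an undirected graph: average degree is twice the number of edges divided by the number of nodes, and density is the number of edges divided by the number of possible pairs of nodes.

```python
nodes = 27    # teams
edges = 2730  # matches, with repeated pairings kept as separate edges

# Each match adds one to the degree of both teams involved.
average_degree = 2 * edges / nodes

# Number of distinct pairs of teams that could meet.
possible_pairs = nodes * (nodes - 1) // 2

# Edges per possible pair; above 1 because pairings repeat.
density = edges / possible_pairs

print(round(average_degree, 1))  # 202.2
print(possible_pairs)            # 351
print(round(density, 3))         # 7.778
```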

The overall network in Fig 1 below clearly highlights the disparity in the number of matches played by the Big Three, the other Test playing nations and the Associate teams. The Big Three are the largest nodes, having played the most matches, and sit closest together. The Test playing nations in green play the most matches against the Associate teams.

Fig 2 below colors the nodes by the Modularity class assigned by Gephi. When I compare the pattern in the classes with the frequency of matches played between two teams, I can confirm that each modularity class groups the teams that play the most matches against one another. So, Bangladesh and Zimbabwe, being the newest Test members, have played the most matches against each other. Similarly, all the Associate teams play the most matches amongst themselves.

Fig 2: The same network with nodes differentiated on Modularity

Looking solely at the network of Associate teams, it is evident that Afghanistan and Kenya have played the most matches. This correlates with the frequent qualification of these two countries for the World Cup.

Reflection

The analysis shows that the strongest teams get to play the most matches and that they also tend to play amongst themselves. However, the disparity between the Big Three and the other Test playing nations is not as wide as the disparity between the Associate teams and the rest of the world.

One of the key ways this analysis could be improved is by considering the ‘host’ and ‘away’ team for each match. This would establish a direction for each edge in the network, which might uncover an imbalance in who hosts the most matches and who plays the most away matches.
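
A rough sketch of how such a directed version might be tallied is given below. The rows are hypothetical and the (host, visitor) ordering is an assumption about how the data would be recorded. Matches at neutral venues, such as many tournament games, would need separate handling.

```python
from collections import Counter

# Hypothetical directed rows: (host, visitor) for each match.
matches = [
    ("India", "Kenya"),
    ("India", "Australia"),
    ("Australia", "India"),
]

# Number of matches each team hosted.
hosted = Counter(host for host, _ in matches)

# Number of matches each team played away.
visited = Counter(visitor for _, visitor in matches)

for team in sorted(set(hosted) | set(visited)):
    print(team, "hosted", hosted[team], "away", visited[team])
```

Comparing the two counts per team would show whether some teams mostly travel while others mostly host.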

Information Visualization

Student work at the School of Information, Pratt Institute
