Text Network Analysis For West Bank


Lab Reports, Networks

Introduction

The West Bank is a landlocked territory near the Mediterranean coast of Western Asia (Fig. 1), bordered by Jordan to the east and by the Green Line, which separates it from Israel, to the south, west, and north. The West Bank also contains a significant section of the western Dead Sea shore. Both Israel and the Palestinians have resorted to terrorism at various times during the course of their long conflict. Palestinians’ will to commit terror attacks in the West Bank is still on the rise.

This visualization analyzes six major terrorist groups that operate in the West Bank: the Al Aksa Martyrs Brigades, Al Fatah, Al Qaeda, Hamas, Hezbollah, and the Islamic Jihad. The 18 texts from which the networks were extracted were gathered from LexisNexis Academia via exact-match Boolean keyword searches for each of the groups. The media sources searched were The Economist, The Washington Post, and The New York Times. The dataset covers articles published from 2000 to 2003.

Fig. 1

Inspiration

The project that inspired this work studied what the audience interested in Russia is searching for, what its main topics of interest are, and how this information can help build a more attractive and popular information resource for that audience (Fig. 2). Its authors also used the results to prepare suggestions for the Way to Russia editors on what content should be presented within the same context to address the audience’s needs.

Fig. 2

Material

OpenRefine: a standalone open-source desktop application for cleaning up data and transforming it into other formats. For this project, I used OpenRefine to convert the XML file to CSV.

Google Sheets: spreadsheet software used to explore the data (Fig. 3).

Gephi: an open-source network analysis and visualization software package written in Java on the NetBeans platform.

west_bank_18.xml (DyNetML): the West Bank dataset. This data was collected at CASOS.

Fig. 3

Methods

After collecting the dataset I wanted to visualize in this project, I used OpenRefine to convert the data from XML to CSV and renamed the columns so the file was ready to be imported into Gephi. Even so, it took me several attempts to import the dataset into Gephi successfully.
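The conversion itself was done in OpenRefine, but the same idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample XML is a made-up stand-in (the real west_bank_18.xml may be structured differently), the node names are placeholders, and the code assumes that every element carrying both a `source` and a `target` attribute is an edge. The column names `Source`, `Target`, and `Weight` are the ones Gephi's spreadsheet importer recognizes for an edge table.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up stand-in for a DyNetML file; the real file may look different.
SAMPLE_XML = """<DynamicNetwork>
  <MetaMatrix id="sample">
    <networks>
      <graph id="sample_graph">
        <edge source="node_a" target="node_b" value="3"/>
        <edge source="node_a" target="node_c" value="1"/>
      </graph>
    </networks>
  </MetaMatrix>
</DynamicNetwork>"""


def xml_edges_to_csv(xml_text):
    """Return CSV text with Source/Target/Weight columns for Gephi."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for elem in root.iter():
        # Assumption: any element with both attributes describes an edge.
        if "source" in elem.attrib and "target" in elem.attrib:
            # Fall back to weight 1 when no "value" attribute is present.
            writer.writerow(
                [elem.get("source"), elem.get("target"), elem.get("value", "1")]
            )
    return out.getvalue()


csv_text = xml_edges_to_csv(SAMPLE_XML)
print(csv_text)
```

Naming the columns this way up front is one way to avoid renaming them by hand before the import.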

The node positions were random at first. After trying out different layouts, I settled on “Force Atlas” and set the “Repulsion strength” to 10,000 to spread the graph out. I set the node color and size after running “Avg. Path Length” in Gephi’s “Statistics” panel, which also computes betweenness centrality for each node. To color the clusters in this network, I ran the community detection algorithm, which assigns a “Modularity Class” value to each node. I then generated color sets and applied them by “Modularity Class” to make the communities easier to see. I also created a “Degree Range” filter that hides nodes, along with their edges, when their degree falls outside a chosen range.
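To make the degree filter concrete, here is a minimal Python sketch of what a degree-range filter does conceptually. It is not Gephi's implementation: the function name and the toy edge list are hypothetical, and the sketch assumes an undirected graph and a single filtering pass based on degrees in the unfiltered graph.

```python
from collections import defaultdict


def degree_range_filter(edges, min_degree, max_degree):
    """Keep nodes with min_degree <= degree <= max_degree, plus edges between them."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    kept_nodes = {n for n, d in degree.items() if min_degree <= d <= max_degree}
    # An edge survives only if both of its endpoints survive.
    kept_edges = [(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes]
    return kept_nodes, kept_edges


# Toy graph with placeholder names (not taken from the dataset).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]
nodes, kept_edges = degree_range_filter(edges, 2, 3)
print(sorted(nodes), kept_edges)
```

In this toy graph, node `d` has degree 1, so it and its only edge are hidden, while the triangle formed by `a`, `b`, and `c` remains.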

The last step was to show and adjust the labels. I went back to “Layout” and ran the “Label Adjust” layout to keep the labels from overlapping. To make the network easier to read, I enabled “Autoselect neighbors” so that the connections between nodes are easy to see.

Result

To identify the main keywords related to the six major terrorist groups in the West Bank, I used Gephi’s visualization engine to render the network. The nodes with the highest betweenness centrality (the ones that most often connect different contexts and thus have a high influence in this graph) are drawn larger (Fig. 4.1).

Fig. 4.1
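For readers unfamiliar with betweenness centrality, the sketch below computes it for a small unweighted, undirected toy graph using Brandes' algorithm and then maps the scores linearly onto node sizes. It illustrates the measure rather than reproducing Gephi's code. The graph, the function names, and the 10 to 50 size range are all assumptions made for this example.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes' algorithm on an unweighted, undirected graph {node: set_of_neighbors}."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}    # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies, farthest node first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in bc.items()}


def scale_sizes(scores, min_size=10.0, max_size=50.0):
    """Linearly map scores onto [min_size, max_size]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {v: min_size + (s - lo) / span * (max_size - min_size)
            for v, s in scores.items()}


# Toy graph with placeholder names: A-B, B-C, B-D, D-E.
adj = {"A": {"B"}, "B": {"A", "C", "D"}, "C": {"B"}, "D": {"B", "E"}, "E": {"D"}}
scores = betweenness_centrality(adj)
sizes = scale_sizes(scores)
print(scores)
print(sizes)
```

In the toy graph, `B` sits on the most shortest paths between other nodes, so it gets the largest size. `D` comes second, and the leaf nodes stay at the minimum size.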

We can see that “Israel” and “Hamas” are the most prominent nodes. The network then splits into several communities of keywords that tend to co-occur (Fig. 4.2 and Fig. 4.3).

Fig. 4.2
Fig. 4.3
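The communities come from Gephi's modularity-based community detection. As a rough illustration of the score such methods try to maximize, the sketch below computes the modularity Q of a given partition of an unweighted, undirected graph: for each community, the fraction of edges inside it minus the fraction expected from the degrees alone. It does not search for the best partition, and the toy graph and function name are assumptions made for this example.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q = sum over communities of (intra_edges / m) - (degree_sum / 2m)^2."""
    m = len(edges)
    intra = defaultdict(int)        # edges with both ends in the community
    degree_sum = defaultdict(int)   # total degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra[cu] += 1
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph with placeholder names: two triangles joined by one edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {n: 0 for n in two_groups}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(q_two, q_one)
```

Splitting the toy graph into its two triangles scores higher than lumping every node into one community, which is the sense in which densely connected keywords end up sharing a “Modularity Class”.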

Because this data shows the network of keywords related to the six terrorist groups, I made another graph that emphasizes the labels. The labels with the highest betweenness centrality are larger in the graph (Fig. 4.4).

Fig. 4.4

Reflection

When I first came across this dataset, I couldn’t see the connections between the terrorist groups from the raw data alone. The connections and hierarchy became clear once I visualized the dataset in Gephi. I was amazed by how visualization can reveal the value hidden in raw data and support further analysis and suggestions. Although Gephi is not as user-friendly as Carto, I still like how it lets me manipulate the data, and I like the freedom its wide selection of tools gives me in styling the visualization.