Background and Goals
In learning about network visualizations during Digital Humanities this semester, I often related what I was learning to my job as the archival processing assistant at the Jewish Theological Seminary (JTS). I have been working on a grant-funded project to create records in Archivists’ Toolkit and in the Library’s catalog to generate finding aids of our archival collections and to make them discoverable through the online catalog. Through this project we have uncovered many collections that had fallen into our backlog that we are now making available. Based on these new discoveries, I wanted to create a network visualization using Gephi to show the relationships between some of our archival collections. The goal is for it to be a discovery tool for our users to see previously unknown connections amongst collections, which may in turn lead to the exploration of more of what the archive has to offer.
Methods and Considerations
Since JTS has well over 300 archival collections, all at varying stages of processing, I had to make many decisions about which collections to include in my visualization. I wanted to base my visualization on the correspondence within our collections between individual rabbis, cantors, and other prominent members of the Jewish community. However, because these collections are at varying stages of processing and naturally vary in size and content, many have no list of correspondents or only a partial one. I decided to leave those collections out. I then culled the correspondence lists to include only those correspondents whose archival collections we hold in the archives. It should also be noted that correspondent lists are neither the only nor a perfect way to show relationships. For example, we have collections of rabbis who served at the same synagogues at different times and did not necessarily correspond with each other. Though this dataset excludes many archival collections and many facets of those collections, I’m hoping that it will show meaningful connections between the collections that are included. Based on these constraints, I ended up using 23 different archival collections, which resulted in 113 nodes and 249 edges. The collections I used were:
Cyrus Adler Papers
Alexander Marx Papers
Bernard G. Richards Papers
Samuel M. Cohen Papers
Simon Greenberg Papers
Herman Rubenovitz Papers
Moses Hyamson Papers
Jacob B. Agus Papers
Zadoc Kahn Papers
Simcha Kling Papers
Israel Lebendiger Papers
Israel H. Levinthal Papers
Henry Pereira Mendes Papers
Judah Nadich Papers
Menachem Ribalow Papers
Saul Lieberman Papers
Heinrich Schalit Papers
Solomon Schechter Family Collection
Morton Smith Papers
Bernard Witt Papers
Max Wohlberg Papers
Marjorie Wyler Papers
Richard Yellin Papers
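The culling step described above, keeping only correspondents whose own papers are held in the archive, can be sketched in a few lines of Python. This is a minimal illustration and not necessarily how the lists were actually prepared; the helper `cull_correspondents` and the placeholder names ("Person A" and so on) are hypothetical and do not come from the JTS inventories.

```python
def cull_correspondents(correspondents_by_collection, held_people):
    """Keep only correspondents who themselves have a collection in the archive.

    correspondents_by_collection maps the subject of a collection to the
    (possibly repetitive) list of names found in its correspondence inventory.
    """
    culled = {}
    for subject, correspondents in correspondents_by_collection.items():
        culled[subject] = sorted(
            name
            for name in set(correspondents)          # drop duplicate mentions
            if name in held_people and name != subject
        )
    return culled


# Hypothetical sample data, for illustration only.
held_people = {"Person A", "Person B", "Person C"}
correspondents_by_collection = {
    "Person A": ["Person B", "Person X", "Person C", "Person B"],
    "Person B": ["Person Y", "Person A"],
}

culled = cull_correspondents(correspondents_by_collection, held_people)
for subject, names in culled.items():
    print(subject, "<-", names)
```

In this sample, "Person X" and "Person Y" drop out because the archive holds no collection for them.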
Before importing into Gephi, I set up an edges table in Excel with the column headings Source, Target, and Type. The source was the person sending the letter, the target was the person receiving the letter (the subject of the collection), and all of the relationships were directed. I pulled the lists of correspondents from the inventory lists of the various collections. Most of these can be found online by searching the JTS Library’s catalog, but because we are still in the process of publishing them as finding aids, some may not be available just yet.
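For readers who would rather build the edges table in code than in Excel, the same Source, Target, Type layout can be written out as CSV. This is a sketch under assumptions: the project itself used Excel, the helper names `build_edge_rows` and `write_edges_csv` are invented for this example, and the people are placeholders.

```python
import csv
import io


def build_edge_rows(correspondents_by_collection):
    """One directed edge per letter writer: sender -> subject of the collection."""
    rows = []
    for subject, senders in correspondents_by_collection.items():
        for sender in senders:
            rows.append({"Source": sender, "Target": subject, "Type": "Directed"})
    return rows


def write_edges_csv(rows, handle):
    """Write the rows with the column headings used for the edges table."""
    writer = csv.DictWriter(handle, fieldnames=["Source", "Target", "Type"])
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample: collection subject -> people who wrote to them.
sample = {"Person A": ["Person B", "Person C"], "Person B": ["Person A"]}

rows = build_edge_rows(sample)
buffer = io.StringIO()          # swap for open("edges.csv", "w", newline="") to save a file
write_edges_csv(rows, buffer)
edges_csv = buffer.getvalue()
print(edges_csv)
```

Each row records one sender-to-recipient relationship, so two people who wrote to each other appear as two rows pointing in opposite directions.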
Prior to planning my project, I looked at other network visualizations of historical sources. The two projects I looked at were “Mapping the Republic of Letters,” created at Stanford, and “Visualizing Historical Networks,” created by The Center for History and Economics at Harvard. I found “Mapping the Republic of Letters” to be very inspirational and informative about the power of network visualization and the wide array of tools that can represent it. However, its massive datasets and use of various tools proved to be outside the scope of my personal project. So though this project is something to aspire to, I wanted to find a model that would be more manageable to follow; I found this in “Visualizing Historical Networks.” The Center for History and Economics at Harvard developed this site and has created network visualizations from various historical sources using Gephi. In their “Methods,” they mention a dual use of network visualizations of historical information: creating a reference of connections in a different way than traditionally used, and making patterns in the network more obvious, both patterns already understood and new ones. These are the two goals I have for my project with the JTS archival correspondence. I want this to be a resource that researchers and other users can refer to in order to see relationships between collections, and also one they can use to analyze known connections and discover new ones.
Another source I used to get a handle on approaching a network visualization project was Manuel Lima’s “Information Visualization Manifesto” (2009). In it, he discusses how visualization has become a popular trend and how, because of this, the focus has moved from data (“Information Visualization”) to aesthetics (“Information Art”). He argues that though the two are not mutually exclusive, one must consider the context, audience, and goals when designing a project. His manifesto goes on to list various principles that should be followed in order to create a meaningful information visualization. I referred to this list in determining the kind of information that would be beneficial to include in my report. The principles I found most informative were “start with a question,” “cite your source,” “the power of narrative,” and “aspire for knowledge.”
While creating and analyzing the visualization, one characteristic I was interested in exploring was modularity, which measures how strongly the network divides into communities. I ran Gephi’s modularity statistic with a resolution of 1, which resulted in a modularity score of 0.377 and 7 communities (Figure 1).
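To give a sense of what the modularity figure measures, the sketch below computes the standard (Newman) modularity score of a given partition: for each community, the share of edges that fall inside it minus the share expected if edges were placed at random with the same degrees. It is an illustration only. It does not reproduce Gephi’s community detection, it treats edges as undirected for simplicity (an assumption), and the toy graph of two triangles joined by a single edge is invented for the example.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition, treating every edge as undirected.

    Q = sum over communities c of  (L_c / m) - (d_c / (2 * m)) ** 2
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    degree_sum = defaultdict(int)  # d_c per community
    internal = defaultdict(int)    # L_c per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)


# Toy graph (hypothetical): two triangles joined by the single edge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_groups = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_group = {node: 1 for node in two_groups}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(round(q_two, 3), round(q_one, 3))
```

Splitting the toy graph into its two triangles scores well above zero, while lumping every node into one community scores zero. A higher score means the chosen communities capture more of the network’s structure than chance would.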
One community that appeared consisted mostly of cantors (Figure 2), the officials who lead the songs and prayers in the synagogue. Though it makes sense that these men would fall into the same community because of their shared profession, it was interesting to see that the data also showed this connection. A similar example is Louis Finkelstein and Marjorie Wyler falling into the same community (Figure 3, highlighted in blue). Marjorie Wyler was an employee of JTS who worked very closely with Louis Finkelstein, the chancellor of the Seminary. So again, it makes sense that these two would be in the same community, and it is really satisfying that the data also shows that connection.
Another community that looked familiar to me is the one represented in Figure 4. Based on my prior knowledge of the collections, I inferred that this community consisted of rabbis and other prominent Jewish personalities who were active in the early 1900s, and who were born and active in Europe prior to coming to the United States. In order to verify this, I had to do additional research into these collections. I was able to confirm that all of these men were born in Europe, for the most part in England, in the mid-to-late 19th century. Many came to the United States and were active at the Seminary as professors or administrators later in their lives.
It should be noted, however, that even when people fall into the same community, it does not mean those were the only people they communicated with, or that everyone in the community corresponded with each other. For example, in this community, Solomon Schechter did not correspond directly with Zadoc Kahn, but the two are connected through their correspondence with Elkan N. Adler, Hermann Adler, and Heinrich Graetz. Solomon Schechter also has connections with correspondents in four communities outside of his own. Because of these limitations, the network must be examined using other methods as well.
Another way I wanted to represent the network was by showing those with the highest degrees, the ones with the most connections (Figure 5).
I want to be clear again that the results are contingent on the data I selected. The collections with the largest degrees are therefore only the largest within this specific dataset. In Figure 5, I have colored those nodes with degrees between 71 (the highest degree) and 7. I left the nodes with a degree of less than 7 grey; the majority of these have degrees of only 1 or 2. Though one could infer how large a network is based on the correspondence lists alone, a network like this visualizes those connections and allows for the comparison of one collection to another. This visualization also shows, in some cases, how large some networks are, in particular that of Alexander Marx (Figure 6). His ego network shows 69 nodes that are connected with him. This suggests how far-reaching his influence was as Librarian and professor at JTS. Additionally, this ego network can be used by those researching Alexander Marx, as it will direct them to other collections housed in the JTS archives.
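The two measures used in this section, degree and ego network, are simple enough to compute directly from a directed edge list. The sketch below is an illustration, not Gephi’s implementation: degree is taken here as in-degree plus out-degree, the ego network is limited to direct neighbors, and the function names and placeholder people are hypothetical.

```python
from collections import Counter


def total_degree(edges):
    """In-degree plus out-degree for every node in a directed edge list."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


def ego_network(edges, ego):
    """Direct neighbors of `ego` (in either direction) and the edges among them and ego."""
    neighbors = {t for s, t in edges if s == ego} | {s for s, t in edges if t == ego}
    neighbors.discard(ego)
    members = neighbors | {ego}
    ego_edges = [(s, t) for s, t in edges if s in members and t in members]
    return neighbors, ego_edges


# Hypothetical sample: (sender, recipient) pairs.
edges = [
    ("Person B", "Person A"),
    ("Person C", "Person A"),
    ("Person A", "Person B"),
    ("Person D", "Person C"),
    ("Person E", "Person D"),
]

degree = total_degree(edges)
neighbors, ego_edges = ego_network(edges, "Person A")
print(degree.most_common(3))
print(sorted(neighbors), len(ego_edges))
```

Because letters in both directions are counted separately, a node’s degree can be larger than the number of distinct people in its ego network. In the sample, Person A has a degree of 3 but only two neighbors.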
Conclusion: Challenges and Future
I really enjoyed the challenge of discovering the various features of Gephi, especially as a beginner. There are still many features that I’m sure I missed and could have used, but I think for the purposes of this small-scale project, I was able to get the gist of the program and its capabilities. One challenge in particular that I faced was the small scope of my dataset. Because it is not representative of the JTS archives as a whole, I was concerned about how meaningful the results would be. I decided that the best thing to do was to be clear about the limitations of the dataset and what they mean for the results. Another challenge I faced was in making meaning out of the visualizations. Having worked with many of these collections, I have basic knowledge about them. However, in analysis, I learned that it would have been beneficial to have an even better understanding of the people whose collections I was working with. Though I was able to recognize relationships based on my prior knowledge, I’m sure that those more familiar with these people and their lives will be able to find even more connections and meaning in the visualizations.
In the future, I would like to expand this network to include more collections. To do so, I would want to create full correspondent lists for all collections with correspondence. I also think it would be interesting to create networks based on additional information, like the person’s role in the Jewish community, such as whether they were a rabbi, a cantor, or JTS faculty or staff. It might also be interesting to create a network of all correspondents regardless of whether JTS has the collections of those people or not. This could lead people to research elsewhere, but could also help JTS determine if there is a need that could be met. For example, if this kind of visualization were to show that many of our collections are connected with one individual in particular, it may be worth it for JTS to consider acquiring that person’s collection. Also in the future, I would like to make an interactive map available to users so they can manipulate the network based on their specific needs. For example, they could zoom in and out of the network and isolate specific nodes to reveal certain connections. And in the spirit of digital humanities, I would really like to see a collaborative project with a faculty member or a class in the analysis of this existing visualization, or in the creation of new ones. Projects like this have the potential to enhance research, scholarship, teaching, and learning, bring more attention to the archival holdings, and encourage future funding opportunities. I’m hoping this project will be utilized as a research and discovery tool, as well as a springboard for future scholarship.
Library of the Jewish Theological Seminary of America. http://www.jtsa.edu/library.
Lima, M. (2009, August 30). Information visualization manifesto. Posted to http://www.visualcomplexity.com/vc/blog/?p=644.
The Center for History and Economics at Harvard (2015). “Visualizing Historical Networks.” Retrieved from http://www.fas.harvard.edu/~histecon/visualizing/index.html.