{"id":1150,"date":"2016-05-09T21:26:37","date_gmt":"2016-05-10T01:26:37","guid":{"rendered":"http:\/\/dh.prattsils.org\/?p=1150"},"modified":"2016-05-09T21:26:37","modified_gmt":"2016-05-10T01:26:37","slug":"define-dh-a-textual-analysis","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/dh\/2016\/05\/09\/define-dh-a-textual-analysis\/","title":{"rendered":"Day of DH: A Textual Analysis"},"content":{"rendered":"<p>What is digital humanities? There is no single agreed-upon answer amongst practitioners or members of the field. Many note that it is less a unified discipline than a series of methods and practices that share common values (Spiro, 2012; Burdick, 2012; Presner, 2009). One common value amongst the digital humanities crowd is openness and contributions from many, much as in the spirit of the internet (Spiro, 2012). In that spirit of crowdsourcing and ground-up contributions, the Day of DH, an online project, \u201cexamines the state of the digital humanities through the lens of those within it\u201d (Day of DH). Participants come together one day a year to individually document their DH activities for the day, both\u00a0via social media sites such as Twitter and on the Day of DH website. Each year, participants\u00a0are also asked to answer the question \u201cHow do you define digital humanities?\u201d and asked to respond in an open-answer format. Using this data set of user-provided definitions of DH, this paper seeks to identify the methods, practices, and values that converge to form digital humanities, as well as to discover any evolutions in that composition over the five-year span of the data.<\/p>\n<p><strong>Methods<\/strong><\/p>\n<p>Data collected by Day of DH moderators in response to &#8220;How do you define digital humanities?&#8221; is provided for free at\u00a0whatisdigitalhumanities.com. This data set was downloaded in .csv format. 
First, the entire dataset was run through Voyant, the online text analysis tool, to analyze word frequency across the corpus. Because later analysis would be run on each year individually, and each year\u2019s document differed drastically in length, all word frequencies were measured as relative frequency (i.e. occurrences per 1M words, scaling each document to a common size) rather than raw count (i.e. the actual number of occurrences), rounded to the nearest whole number. A list of common English stop words was applied; the terms \u201cdigital,\u201d \u201chumanities,\u201d and \u201cdh\u201d were also removed from the dataset, as they were outliers in frequency and were deemed unrevealing, given that they most often simply referred to the thing respondents were asked to define. Generic words such as \u201cuse,\u201d \u201cwork,\u201d and \u201cway\u201d were also removed. To reflect the use of these words in their different forms across the document, terms were converted to wildcard strings where applicable. For instance, <em>technology <\/em>and <em>technologies <\/em>were merged into the string <em>technolog*<\/em>, which then also included such terms as <em>technological <\/em>and<em> technologically. <\/em>Similarly, <em>research<\/em> became <em>research*<\/em> to include <em>researching<\/em> and <em>researchers<\/em>; <em>methods<\/em> became <em>method*<\/em> to include <em>method<\/em> and <em>methodological<\/em>; <em>computing<\/em> became <em>comput*<\/em> to include <em>computers<\/em>,<em> computational<\/em>, etc.<\/p>\n<p>The top twenty words were then coded as describing one of three categories: <em>Method<\/em>, <em>Practice<\/em>, or <em>Value<\/em>. 
The categorizations are outlined in Table 1 in the Findings section, below.<\/p>\n<p>Given that the dataset specified the year of collection for each response, data were then separated into individual documents by year in order to run individual analysis on each time period. Each year\u2019s collection of definitions was analyzed, again using Voyant, using the same stop words as with the main dataset. Likewise, words in different formats and tenses were combined using appropriate strings. Word frequencies were taken for each word in each year and graphed to show the evolution of that term across the entire corpus.<\/p>\n<p><strong>Findings and Discussion<\/strong><\/p>\n<p style=\"text-align: center\">Table 1: Category, Term, and Relative Frequencies of Top 20 Terms in Corpus<\/p>\n<table style=\"height: 1443px\" width=\"598\">\n<tbody>\n<tr>\n<td width=\"67\"><strong>Category<\/strong><\/td>\n<td width=\"111\"><strong>Term<\/strong><\/td>\n<td width=\"120\"><strong>Relative Frequency (out of 1M)<\/strong><\/td>\n<\/tr>\n<tr>\n<td width=\"67\">Methods<\/td>\n<td width=\"111\">Technolog*<\/td>\n<td width=\"120\">11416<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Comput*<\/td>\n<td width=\"120\">9350<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Tools<\/td>\n<td width=\"120\">7792<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Method*<\/td>\n<td width=\"120\">7114<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Questions<\/td>\n<td width=\"120\">3794<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Media<\/td>\n<td width=\"120\">3117<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Information<\/td>\n<td width=\"120\">2033<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Data<\/td>\n<td width=\"120\">1931<\/td>\n<\/tr>\n<tr>\n<td width=\"67\">Practices<\/td>\n<td width=\"111\">Research*<\/td>\n<td width=\"120\">8876<\/td>\n<\/tr>\n<tr>\n<td 
width=\"67\"><\/td>\n<td width=\"111\">Study|Studies<\/td>\n<td width=\"120\">5895<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Scholarship<\/td>\n<td width=\"120\">3151<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Field<\/td>\n<td width=\"120\">4099<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Discipline(s)<\/td>\n<td width=\"120\">3455<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Teach*<\/td>\n<td width=\"120\">2337<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Application<\/td>\n<td width=\"120\">2066<\/td>\n<\/tr>\n<tr>\n<td width=\"67\">Values<\/td>\n<td width=\"111\">Cultur*<\/td>\n<td width=\"120\">13492<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">New<\/td>\n<td width=\"120\">8029<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Human|humanistic<\/td>\n<td width=\"120\">4980<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Traditional<\/td>\n<td width=\"120\">3442<\/td>\n<\/tr>\n<tr>\n<td width=\"67\"><\/td>\n<td width=\"111\">Knowledge<\/td>\n<td width=\"120\">2168<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p>Of the top twenty most frequently occurring terms in the corpus (see Table 1), nearly half (8 out of 20) described <em>Methods<\/em> within the field, defined here as anything that is used in the pursuit of digital humanities. 
Within this category, the two highest-ranking strings, <em>technolog* <\/em>(relative frequency of 11,416 per million words) and <em>comput* <\/em>(relative frequency of 9,350), speak to the highly digital and technological nature of the discipline.\u00a0<em>Tools <\/em>(7,792) and <em>method* <\/em>(7,114), rather general terms that can easily describe a wide range of offerings, follow as the 3<sup>rd<\/sup> and 4<sup>th<\/sup> highest ranking in <em>Methods <\/em>and suggest the breadth of ways that scholars might engage within digital humanities.<\/p>\n<p>The category of <em>Practices<\/em>, defined here as processes that make use of the methods described above, accounts for almost as many high-frequency terms (7 out of 20) as does <em>Methods<\/em>. Its top term (<em>research*<\/em>, relative frequency = 8,876) is an outlier relative to the rest of the set and anchors digital humanities in the scholarly and structured practice of research. Notably, the term <em>field<\/em> (relative frequency = 4,099) occurs more frequently than <em>discipline(s) <\/em>(relative frequency = 3,455), suggesting that digital humanities is, as Spiro (2012)\u00a0suggests, viewed as a convergence of various pursuits rather than a traditionally structured discipline. It should also be noted that <em>discipline(s) <\/em>often occurred in the context of \u201cintersect[ing] disciplines,\u201d or \u201c\u2026paving the way for dialogues to occur across disciplines,\u201d suggesting that a portion of these occurrences does not identify digital humanities as a discipline at all. 
Some respondents explicitly deny that it is a discipline, saying, \u201cI continue to regard digital humanities as a set of methods\u2026 as opposed to a full-blown discipline,\u201d and \u201c\u2026digital humanities is not a discipline, per se, but a set of beliefs, theories, practices, methods, and artifacts\u2026\u201d.<\/p>\n<p>The relatively low occurrence of terms describing <em>Values <\/em>within the top twenty terms of the corpus is of particular interest given the focus that values receive in Spiro\u2019s (2012) influential writings on the topic, and the rather emotionally charged declaration of a \u201ccall to action\u201d by Presner (2009). The values that Spiro includes in her manifesto for digital humanities (openness, collaboration, collegiality and connectedness, diversity, and experimentation) are not explicitly represented in the highest ranking terms of this corpus.<\/p>\n<p>This absence does not refute Spiro\u2019s top values, however. The project from which this corpus evolved, the very nature of the data it holds (800+ unfiltered definitions, experimental and diverse in nature), its authorship (collaborative and voluntary), and its accessibility and connectedness (free on the web) embody these values in a living way rather than making them explicitly visible through textual analysis. These values are embodied and felt, rather than plainly seen.<\/p>\n<p>Also of interest in the realm of values is the presence of the term\u00a0<em>traditional<\/em>, with a relative frequency of 3,442 per million words. This is perhaps surprising given that much of digital humanities seems to break with tradition. 
However, an analysis of the term in context found that it serves two purposes: first, to set digital humanities apart from traditional scholarship, as in &#8220;the creation and sharing of scholarship&#8230;in ways not possible in the traditional humanities&#8221;; and second, to anchor DH in a traditional history while adding a new, technological layer to these forms of scholarship, as in &#8220;capturing the passion for the traditional path and emboldens it&#8230;&#8221; or the &#8220;application of digital tools to traditional humanities research.&#8221; The duality of this term, which both sets DH apart from a history of traditional scholarship and embeds DH within that history, standing on its shoulders, suggests a complexity and dynamism to DH.<\/p>\n<div class='tableauPlaceholder' id='viz1472594335837'><a href='#'><img alt='Sheet 1 ' src='https:&#047;&#047;public.tableau.com&#047;static&#047;images&#047;Wo&#047;WordFrequenciesbyYear2DefineDH&#047;Sheet1&#047;1_rss.png' style='border: none' \/><\/a>  <\/div>\n<p>When the data are analyzed by year to examine any evolution or trend in the ways that practitioners define digital humanities, peaks marking the year in which a given term was most frequently mentioned, or valleys in which its mention sharply dropped, may provide insight into contemporary projects or 
circumstances. While further analysis would be necessary to determine any influential factors, some of the trends represented in Figure 1, above, are worth noting.<\/p>\n<p>The string <em>comput* <\/em>began as the highest ranking <em>Method<\/em> in the document of definitions from 2009, but it dropped sharply towards 2012 and ended up somewhere in the middle of the pack in 2014. Similarly, <em>technolog*<\/em>, while still ranking highest in <em>Methods <\/em>from 2011-2014, shows a decline in frequency from 2011 on. These trends might be explained by these methods becoming ingrained within the field and almost implicit within its definition. True to its status as an outlier within the <em>Practices<\/em> category of the overall corpus, the trend line for <em>research*<\/em> floats above the other strings in its category for the span of the data. While <em>Values<\/em> may account for a relatively smaller portion of high-frequency words in the overall corpus, the trend lines for terms in this category, especially <em>new <\/em>and <em>human|humanistic<\/em>, suggest that perhaps these discussions are on the rise.<\/p>\n<p><strong>Conclusion<\/strong><\/p>\n<p>Although there may not be widespread consensus as to a working definition of digital humanities, the vision\u00a0discussed in relevant literature (Spiro, 2012; Burdick, 2012; Presner, 2009), namely that digital humanities is the convergence of methodologies and practices that align under certain values, is borne out in the textual analysis of 800+ crowdsourced definitions of DH.<\/p>\n<p>Dominant terms and strings in the overall corpus (<em>cultur*, technolog*, comput*, research*, new, tools, method*<\/em>) reiterate DH&#8217;s\u00a0diversity of projects, its commitment to\u00a0technology&#8217;s ability to mine traditional fields for innovative research, and its base in traditional and structured scholarship. 
Given the nature of the Day of DH project, which seeks to expose the quotidian activities of digital humanists, these dominant terms understandably focus on activities and practices rather than values, but they do illustrate the breadth of action within the field.<\/p>\n<p>In examining the corpus by year, we are able to see technological terms move from extremely dominant to more ingrained and implicit within discussions of DH practices, while practices such as\u00a0<em>research*<\/em>, and values such as\u00a0<em>new<\/em>, remain outliers within their respective categories throughout every year. These patterns point to certain shared commitments across the diverse projects and foci of people participating in the field.<\/p>\n<p>In looking at the corpus as a whole, as well as each year&#8217;s document individually, dominant methods and practices have emerged that demonstrate both DH&#8217;s relationship to and place in traditional scholarship and its new offerings and complexities.<\/p>\n<p><strong>References<\/strong><\/p>\n<p><span style=\"font-weight: 400\">Burdick, Anne, et al. (2012). \u201cDigital Humanities Fundamentals\u201d in <\/span><i><span style=\"font-weight: 400\">Digital_Humanities,<\/span><\/i><span style=\"font-weight: 400\"> pp. 122\u201323.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Presner, Todd, et al. (2009). \u201cDigital Humanities Manifesto 2.0.\u201d <\/span><a href=\"http:\/\/www.toddpresner.com\/?p=7\"><span style=\"font-weight: 400\">http:\/\/www.toddpresner.com\/?p=7<\/span><\/a><\/p>\n<p><span style=\"font-weight: 400\">Spiro, Lisa (2012). \u201c\u2018This is Why We Fight\u2019: Defining the Values of the Digital Humanities\u201d in <\/span><i><span style=\"font-weight: 400\">Debates in the Digital Humanities.<\/span><\/i><\/p>\n","protected":false},"excerpt":{"rendered":"<p class=\"lead\">What is digital humanities? There is no single agreed-upon answer amongst practitioners or members of the field. 
Many note that it is less a unified discipline than a series of methods and practices that share common values (Spiro, 2012; Burdick, 2012; Presner, 2009). One common value amongst the digital humanities crowd is openness and contributions from many, much as in&hellip;<\/p>\n<p class=\"more-link-p\"><a class=\"btn btn-danger\" href=\"https:\/\/studentwork.prattsi.org\/dh\/2016\/05\/09\/define-dh-a-textual-analysis\/\">Read more &rarr;<\/a><\/p>\n","protected":false},"author":351,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,7],"tags":[],"class_list":["post-1150","post","type-post","status-publish","format-standard","hentry","category-projects","category-student"],"_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/dh\/wp-json\/wp\/v2\/posts\/1150","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/dh\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/dh\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/dh\/wp-json\/wp\/v2\/users\/351"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/dh\/wp-json\/wp\/v2\/comments?post=1150"}],"version-history":[{"count":0,"href":"https:\/\/studentwork.prattsi.org\/dh\/wp-json\/wp\/v2\/posts\/1150\/revisions"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/dh\/wp-json\/wp\/v2\/media?parent=1150"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/dh\/wp-json\/wp\/v2\/categories?post=1150"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/dh\/wp-json\/wp\/v2\/tags?post=1150"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}