TV Archive

To observe how fake news was discussed on television, research material was collected from the Internet Archive’s TVNews Archive. The collection holds closed-caption text from US TV news shows from 2009 to the present.

Data on newscasts discussing “fake news” were collected from the TVNews Archive using the Internet Archive’s Advanced Search function and a Ruby script. About 4,814 newscasts were collected for the period November 7, 2016 – March 7, 2017. Data were cleaned and reformatted for analysis using OpenRefine. Special attention was paid to the show, station, topic, and date fields, as well as the snippet, a short excerpt of the transcript surrounding the first occurrence of “fake news.” Daily term frequencies were calculated from the snippets using R, after stopwords were removed and terms were stemmed.
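The daily term-frequency step can be illustrated with a minimal sketch. The original calculation was done in R; the Python version below is only an approximation of the idea, not the script that was used. The stopword list is a tiny illustrative sample, `crude_stem` is a hypothetical stand-in for a real stemmer, and the example records are invented for demonstration.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stopword list (assumption); a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in",
             "is", "are", "was", "on", "that", "this", "it", "for", "about"}


def crude_stem(token):
    """Strip a few common English suffixes (a rough stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters would remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def terms_per_day(records):
    """Map each date to a Counter of stemmed, non-stopword terms.

    `records` is an iterable of (date, snippet) pairs.
    """
    counts = defaultdict(Counter)
    for date, snippet in records:
        tokens = re.findall(r"[a-z]+", snippet.lower())
        counts[date].update(crude_stem(t) for t in tokens if t not in STOPWORDS)
    return counts


# Hypothetical (date, snippet) pairs, invented for illustration only.
records = [
    ("2016-11-07", "Reports of fake news are spreading"),
    ("2016-11-07", "The spread of fake news"),
    ("2016-11-08", "Talking about fake news"),
]
daily = terms_per_day(records)
```

Because stemming folds "spreading" and "spread" into one term, the two snippets from the first day contribute to the same count. The crude suffix rule also over-stems some words (for example, "news" becomes "new"), which is one reason a proper stemmer is preferable in practice.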

The TVNews dataset was analyzed using Tableau Public. Three aspects were tracked across the time period: the total number of records, the most active stations, and the most frequently mentioned terms and topics.
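The first two of these aspects amount to simple counts over the cleaned records. The analysis itself was carried out in Tableau Public; the sketch below only shows the equivalent aggregation in Python. The field names `date` and `station` are assumed to match the cleaned dataset, and the rows and station labels are placeholders.

```python
from collections import Counter


def summarize(rows, top_n=3):
    """Return (records per day, the top_n most active stations)."""
    per_day = Counter(row["date"] for row in rows)
    per_station = Counter(row["station"] for row in rows)
    return per_day, per_station.most_common(top_n)


# Placeholder rows, invented for illustration only.
rows = [
    {"date": "2016-11-07", "station": "Station A"},
    {"date": "2016-11-07", "station": "Station B"},
    {"date": "2016-11-08", "station": "Station A"},
]
per_day, top_stations = summarize(rows)
```

Here `per_day` gives the number of records for each date, and `top_stations` lists stations in descending order of record count.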