Introduction
This project aims to visualize how the use of specific photographic processes changes over time. The question arose out of the creation of a timeline of various photographic processes, and my dissatisfaction with the level of detail it was able to provide. The goal was to investigate each individual process further, not merely to represent it as a single point on a timeline, but to understand how frequently the process was used over time. The hypothesis was that a given photographic process could be displayed as a frequency histogram, representing the beginning, height, decline, and any potential resurgence of a particular process’s popularity.
Inspiration
Design inspiration came from two primary sources. The first was the concept of data-ink and maximizing the data-ink ratio as introduced by Edward Tufte. The charts produced for this project aim to maximize the ratio of data to non-data ink and thus promote a minimal aesthetic. Because the data was analyzed and the graphs were designed with a specific question in mind, rather than as an exploratory view of a dataset, the minimal design approach is most fitting and allows the data to speak for itself. The GIF below, which was studied in class, does an excellent job of visually displaying the concept of the data-ink ratio.
The second design inspiration came from joeycloud.net. The “pianogram” visualization displayed on his website served as inspiration for how to cleanly show multiple variables that exist on the same x-axis scale. In this visualization the quantity of notes in a given song is displayed as a histogram on an axis that resembles a piano. The user can choose the song from a drop-down menu, and the graphic updates to display the new histogram. This is the same concept I used for displaying photographic processes over time in the primary chart for this project.
The concept inspiration came from identifying the shortcomings of my previous timeline project as well as from Richard Benson’s “The Printed Picture”. Benson’s work is an incredibly rich source of information, but it cannot be consumed quickly. In order to tease out the popularity or life span of a particular process in the book and accompanying website, users have to read each chapter individually. My hope was that through this visualization exercise, I could add supplemental material to Benson’s work that could be consumed quickly and lead users to further investigation of the material he has covered so well.
Materials
This project used OpenRefine and Microsoft Excel to clean and organize the dataset and Tableau Public to further group and visualize the data. The data was sourced from the Metropolitan Museum of Art’s open dataset of its collection.
Methods
To begin answering the question at hand, I first had to decide upon a metric by which I could measure the frequency of use, or popularity, of a photographic process. Because making a photographic print doesn’t inherently leave any form of data point, I decided to look at the makeup of a large photographic collection as an indicator of the use of a particular kind of process. I used the Metropolitan Museum of Art in New York City as the collection I would examine.
The museum’s entire database was downloaded as a .csv file and brought into OpenRefine for examination. The database consisted of approximately 40 columns by 500,000 rows of artwork information. I eliminated all records except those classified as “photograph” and removed all fields from these records except “medium”, “object date”, and “artist”. It is worth noting that the “object date” is the date on which the print is credited with being created, not the date it was collected by the museum. Within the “object date” column there was extreme variance in how the dates were entered, and because of format errors OpenRefine did not identify many of them as dates. I exported the .csv file from OpenRefine and opened it in Microsoft Excel. I used the formula =1*MID(A1,MIN(FIND({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),4) to return the four characters beginning at the first digit in the “object date” column, which in most entries is the four-digit year. This .csv was then saved and opened with OpenRefine again. From here the timeline facet was used to remove any records dated before 1839 (the invention of photography) or after 2016 (the most recent date not past the present). The “medium” column also had many variations and inconsistencies and was cleaned accordingly. Once this was complete, the data was again exported as a new .csv file and opened in Tableau Public.
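The year extraction and date filtering described above can also be sketched in Python. This is only an illustration of the logic, not the workflow actually used for the project (which relied on Excel and OpenRefine). The function names, the dictionary keys, and the sample strings are assumptions made for the example. Unlike the spreadsheet formula, this sketch returns nothing when fewer than four digits follow the first digit.

```python
import re

FIRST_DIGIT = re.compile(r"[0-9]")
FOUR_DIGITS = re.compile(r"[0-9]{4}")


def year_from_object_date(text):
    """Return the four digits that start at the first digit in text, or None."""
    first = FIRST_DIGIT.search(text)
    if first is None:
        return None  # no digit at all, e.g. "n.d."
    # Require four consecutive digits beginning exactly at the first digit.
    candidate = FOUR_DIGITS.match(text, first.start())
    return int(candidate.group()) if candidate else None


def keep_photographic_era(records, earliest=1839, latest=2016):
    """Keep records whose extracted year lies in [earliest, latest]."""
    kept = []
    for record in records:
        year = year_from_object_date(record["object date"])
        if year is not None and earliest <= year <= latest:
            kept.append({**record, "year": year})
    return kept


# Hypothetical sample rows, not taken from the Met dataset.
sample = [
    {"medium": "Albumen silver print", "object date": "ca. 1865"},
    {"medium": "Gelatin silver print", "object date": "1930s"},
    {"medium": "Engraving", "object date": "1790"},
    {"medium": "Unknown", "object date": "n.d."},
]
cleaned = keep_photographic_era(sample)
```

Entries such as “19th century” yield no year under this rule, so they would need separate handling.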
In Tableau, the first bar chart created was designed to address the primary question of frequency of use of a photographic process over time. The object dates were grouped by decade and placed on the x-axis. The number of records present in the data was placed on the y-axis, and the filter option was used to allow the user to select a particular photographic process to examine. In addition, the tooltip indicates the absolute number of prints in the collection for a particular process and decade, as well as this number expressed as a percentage of all prints made with that process.
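The grouping behind this chart can be approximated outside Tableau. The sketch below assumes each record is a dictionary with a “medium” string and an integer “year” (hypothetical field names), and it returns, for one process, the count per decade together with that count as a percentage of the process total.

```python
from collections import Counter


def decade_of(year):
    """Map a year to the first year of its decade, e.g. 1865 -> 1860."""
    return year - year % 10


def decade_histogram(records, process):
    """Return {decade: (count, percent_of_process_total)} for one process."""
    counts = Counter(
        decade_of(r["year"]) for r in records if r["medium"] == process
    )
    total = sum(counts.values())
    # An unknown process gives an empty Counter, so no division happens.
    return {
        decade: (n, 100.0 * n / total)
        for decade, n in sorted(counts.items())
    }


# Hypothetical sample rows, not taken from the Met dataset.
sample = [
    {"medium": "Albumen silver print", "year": 1865},
    {"medium": "Albumen silver print", "year": 1868},
    {"medium": "Albumen silver print", "year": 1872},
    {"medium": "Gelatin silver print", "year": 1935},
]
albumen = decade_histogram(sample, "Albumen silver print")
```

The selected process plays the role of the Tableau filter, and the tuple values correspond to the two figures shown in the tooltip.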
A second bar chart was produced highlighting which photographic processes are most present in the Met’s collection. In this case the x-axis contains the photographic process and the y-axis shows the number of records expressed as a percentage of the whole. The bars are presented in descending order, and the tooltip indicates the absolute number of records in each category.
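As a rough equivalent of this second chart, the share of each process can be computed and sorted in a few lines. The record layout (a “medium” key per record) is an assumption for illustration.

```python
from collections import Counter


def process_shares(records):
    """Return (process, count, percent_of_all_records), largest first."""
    counts = Counter(r["medium"] for r in records)
    total = sum(counts.values())
    # most_common() already orders the processes by descending count.
    return [(m, n, 100.0 * n / total) for m, n in counts.most_common()]


# Hypothetical sample rows, not taken from the Met dataset.
sample = [
    {"medium": "Gelatin silver print"},
    {"medium": "Gelatin silver print"},
    {"medium": "Gelatin silver print"},
    {"medium": "Albumen silver print"},
]
shares = process_shares(sample)
```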
Finally, a treemap was created showing the ten artists who have the most prints in the Met’s collection in relation to all other artists in the collection. The tooltip in this chart shows the absolute number of prints for each artist as well as this number as a percentage of the whole.
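The comparison between the top artists and everyone else amounts to a top-N count with a single remainder bucket. A minimal sketch, assuming each record carries an “artist” key and using a made-up label for the remainder:

```python
from collections import Counter


def top_artists_with_rest(records, n=10, rest_label="All other artists"):
    """Return (artist, count, percent) for the top n artists plus the rest."""
    counts = Counter(r["artist"] for r in records)
    total = sum(counts.values())
    top = counts.most_common(n)
    rest = total - sum(count for _, count in top)
    rows = [(artist, count, 100.0 * count / total) for artist, count in top]
    if rest > 0:
        # Everything outside the top n is collapsed into one bucket.
        rows.append((rest_label, rest, 100.0 * rest / total))
    return rows


# Hypothetical sample rows, not taken from the Met dataset.
sample = [{"artist": name} for name in ["A", "A", "A", "B", "B", "C"]]
rows = top_artists_with_rest(sample, n=2)
```

Summing the percentages of the top rows gives the combined share of the leading artists, which is the figure relevant to the sampling concerns raised in the results.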
Results
The primary issue with the concept of these visualizations is that they assume that the frequency distribution of the processes found in the Metropolitan Museum of Art’s photography collection is representative of the actual use of those processes for a given time period. This is potentially problematic for several reasons. First, the majority of prints made with a given process are certainly not collected by the museum, only those that are deemed worthy by the current curator. Museums such as the Metropolitan Museum of Art aren’t necessarily concerned with collecting a random sample of pieces for a given time period, but rather function on the whims of style and artistic trends. Another potential flaw in treating the Met’s collection as representative of the overall use of a process is that certain artists contribute a disproportionate percentage of the total number of prints. For example, Walker Evans alone contributes almost 8% of the entire photography collection at the Met because he was particularly prolific and highly sought after. The top 10 contributing artists make up almost half of the entire collection. The dataset itself is also questionable, as it was extraordinarily inconsistent in the way in which data was entered and was full of errors. For example, before the data was more thoroughly cleaned, it showed gelatin silver prints dating from the 1840s and 50s; this is impossible, as the process was not invented until the 1870s. Given that errors may exist in the data beyond what was caught and removed, its validity is somewhat suspect.
That being said, if we are to assume that the Met’s photography collection is at least slightly representative of the frequency distribution of a photographic process, we do see trends that appear to align with historical information. Albumen prints rise to prominence in the 1880s and 90s and all but disappear by the 1940s. Gelatin silver prints dominate the 20th century with spikes in the 30s and 60s, while inkjet prints rise steeply in the 2000s. If we look at all of the processes combined, photography as a whole had popularity spikes in the last half of the 19th century and during the 1970s.
Given the issues with extrapolating the Met’s collection to represent the actual use of a process, these visualizations don’t satisfactorily answer the primary question. They do, however, offer some insight into the makeup of the Met’s collection, and new questions specific to this data can be formulated.
Future Directions
If this project were to be expanded upon, new questions should be formulated specifically with regard to the Metropolitan Museum of Art’s collection. For example, one might ask what the geographical distribution of the photography collection is and how that compares to the distribution of non-photographic works. Another line of inquiry would be to compare the makeup of the Met’s collection to that of other well-known institutions to find which areas overlap and which particular specialities each institution contains.
If one wanted to continue trying to understand the frequency distribution of photographic processes more generally, one could attempt to isolate sales records of items such as specific types of film, paper, and cameras as an indicator of their use and plot those values for specific time periods. This could potentially be a more accurate sample of process use over time.