Lab 2, working with Tableau, was challenging but also more rewarding than working with the slideshow in our first lab. Tableau was really easy to work with, and it allows for more customization in creating your visuals. Tableau is also helpful because it does not allow you to mess up too much. When you have created a successful visualization, or one that makes sense, you can tell easily. When you have created something that doesn’t make sense, the visualization comes out looking wonky. In this way, Tableau is fool-proof when you’re making simple visuals or are working with a good, wide-spanning dataset, as this one is.
Discussion, Materials, and Methods
The first visualization I made was a bubble graph representing each film’s budget and gross. You see each of these traits by hovering over the clustered bubbles, which are sized and colored according to their respective budget and gross. The larger the bubble, the larger the budget of the film. The darker the shade of green, the more money the film grossed. Some might say that this graph is too complex. If it were solely on the page, say, in a textbook, I would agree. But I think the interactive nature of hovering over the bubbles to reveal the metadata makes it fun if you are interested in movie profits. If this is the case, you can get lost hovering over all the bubbles. In hindsight, I just wish the graph and the bubbles could be made bigger.
The time-based visual component of this lab was fulfilled by a graph mapping notable films, spanning 1927’s Metropolis to 2016’s Suicide Squad. Gross is the y-axis in this graph. The years make up the horizontal axis. With one look, we see which films sank to the bottom of the graph (the ones that weren’t as financially successful). The more successful films of each year rise to the top. With this graph, you get a comprehensive list of films from each year. You can also see them ordered, nicely, from lowest box office intake to highest.
Weaknesses in the dataset, unfortunately, become evident in this graph. John Waters’s classic, Pink Flamingos, has a gross of only about 160,000 dollars. Metropolis, from 1927, has a gross of 27,000 dollars. But a film like Gone with the Wind has a 200 million dollar gross. The data is inconsistent: sometimes it is adjusted for inflation, other times not. Whoever compiled the data seemed happy to take whatever numbers came up in a cursory search. No judgment, though.
The third graph for my lab is my favorite. For this graph, I took the Actor 1 category, which is usually an indisputable lead. This allowed for a good crop of leading actors, who are ranked based on how much ALL their movies have made. With this bar graph, you can see how much each actor’s film budgets are, but also how much those films have grossed. With this data, I created a graph that allows us to see if the actors listed are “worth” their asking prices. I’m sure people in film run similar analyses to determine whether to meet an actor’s “quote” or whatever.
This graph feels important because it touches on a lot of wage gap issues too. I didn’t have to do anything with this data but plug in the actor, budget, and gross categories, and this listing of actors and the money associated with their films comes up. There are side-by-side budget and gross graphs for each actor. This created a large and some might say unwieldy graph, but I don’t think so. If anything, it shows how silly and erroneous it is to have twenty “box office-y” male actors be on this list before the first female actor shows up. Despite this dataset’s missing values and other shortcomings, the graph this made feels super accurate, unfortunately.
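The ranking behind this graph can also be reproduced outside Tableau. The following is only a rough sketch in Python, assuming the bars show summed budget and gross per Actor 1 and that films with a missing value are left out. The rows, names, and numbers are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict

# Hypothetical rows: the column names mirror the categories used in the
# graph, and every value here is invented for illustration.
films = [
    {"Actor 1": "Performer X", "Budget": 50_000_000, "Gross": 120_000_000},
    {"Actor 1": "Performer Y", "Budget": 20_000_000, "Gross": 15_000_000},
    {"Actor 1": "Performer X", "Budget": 80_000_000, "Gross": 200_000_000},
    {"Actor 1": "Performer Z", "Budget": None, "Gross": 40_000_000},
]


def totals_by_lead(rows):
    """Sum budget and gross per lead actor, ranked by total gross (highest first)."""
    totals = defaultdict(lambda: {"Budget": 0, "Gross": 0})
    for row in rows:
        # Assumption: skip films with a missing budget or gross.
        if row["Budget"] is None or row["Gross"] is None:
            continue
        totals[row["Actor 1"]]["Budget"] += row["Budget"]
        totals[row["Actor 1"]]["Gross"] += row["Gross"]
    return sorted(totals.items(), key=lambda item: item[1]["Gross"], reverse=True)


ranking = totals_by_lead(films)
for actor, money in ranking:
    print(actor, money["Budget"], money["Gross"])
```

Comparing the two totals for each name is the same “worth the asking price” check the side-by-side bars make visually.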
My data was a fairly exhaustive list of films from the 1920s to now. Metadata for each film was categorized by year released, director, gross, budget, the three primary actors, etc. My limitations with this program became evident with this last category. Each film has an “Actor 1,” “Actor 2,” and “Actor 3” category. With thousands of names under each heading, it makes a lot of sense for those categories to be combined into one to create a more comprehensive list. I was unable to figure out how to do this, or if it was even possible. I would’ve liked to do more actor-related visuals, but the fact that I couldn’t do it without having clunky, separate “Actor 1,” “Actor 2,” and “Actor 3” columns made me shy away from the more complete list. Anytime actor data was used, I just used the first actor category.
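For reference, combining the three actor columns into one is a reshaping step that can be done before the data ever reaches Tableau. Below is a minimal sketch in Python, assuming the headings are literally “Actor 1,” “Actor 2,” and “Actor 3” and that each film has a “Title” field. The function name `unpivot_actors` and the sample rows are hypothetical, made up for illustration only.

```python
# Hypothetical rows: titles and names are placeholders, not real data.
films = [
    {"Title": "Film A", "Actor 1": "Performer X", "Actor 2": "Performer Y", "Actor 3": "Performer Z"},
    {"Title": "Film B", "Actor 1": "Performer Y", "Actor 2": "Performer X", "Actor 3": None},
]

ACTOR_COLUMNS = ["Actor 1", "Actor 2", "Actor 3"]


def unpivot_actors(rows):
    """Return one record per actor credit instead of one row per film."""
    credits = []
    for row in rows:
        for billing, column in enumerate(ACTOR_COLUMNS, start=1):
            actor = row.get(column)
            if actor:  # skip missing actor values
                credits.append({"Title": row["Title"], "Actor": actor, "Billing": billing})
    return credits


credits = unpivot_actors(films)
for credit in credits:
    print(credit)
```

With the data in this shape there is a single “Actor” column to build visuals from, and the “Billing” number keeps track of which original column each name came from.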
I like my visualizations and the pastel colors. One thing I like about this field is that it often requires simplicity. It gives you a good excuse to be smart, but lazy. I like that my simple bar graph on Gross-By-Film-Year can be a very viable way to tell the ‘story’ I wanted each viz to tell. My bubble graph is a little too big and too exhausting for anyone who doesn’t care about box office returns. But I enjoy looking at it, and would wager that film buffs might lose some time hovering over each bubble as well.
It was a great learning experience. With more time, I would definitely find a way to make the visuals more layered and appealing to the eye. While I like my simply colored visuals, there is a better way to use color tastefully in order to enhance the story. But, who knows, maybe having a green bubble graph that’s all about money is the best way to go.
This lab taught me that the exhaustive cleaning a large dataset requires can be a downside of rich, great information. There were a lot of missing values and that irked me. There were also at least a dozen figures in each metadata category that I had to exclude because they made no sense, or merely conflicted with the story I wanted to tell.
Also, the grosses were inconsistent: sometimes an amount would reflect inflation, and other times it wouldn’t. At times, the actor categories didn’t appear to measure screen time either. For example, the actor placed in Actor 2 might not be the actor in the film with the second-most screen time. It could very well be a very famous, notable cameo, like Britney Spears in Fahrenheit 9/11. These inconsistencies in the data were problematic, but I think they still led to some great visualizations that, despite the lapses in the data, all tell helpful, accurate stories.
Gross/Budget Bubble Graph
Film Gross by Year
Budget/Gross by Actor