{"id":4647,"date":"2016-06-07T17:01:29","date_gmt":"2016-06-07T21:01:29","guid":{"rendered":"http:\/\/research.prattsils.org\/?p=4647"},"modified":"2016-06-07T17:01:29","modified_gmt":"2016-06-07T21:01:29","slug":"visualizing-exploring-home-run-rates-baseball-history","status":"publish","type":"post","link":"https:\/\/studentwork.prattsi.org\/infovis\/visualization\/visualizing-exploring-home-run-rates-baseball-history\/","title":{"rendered":"Visualizing and Exploring Home Run Rates through Baseball History"},"content":{"rendered":"<p>As an avid fan of both the sport of baseball, and the statistics which underpin the study and understanding of its history, I thought it would be interesting to examine some of those historical statistics visually.\u00a0 In particular, I am interested to see how the prevalence of the home run has changed over time, and how the profile of home run hitters may have changed in different baseball eras.\u00a0 We have only recently moved beyond what is commonly known as\u00a0the \u201cSteroid Era\u201d of baseball, when the use of performance-enhancing drugs supposedly enabled many ballplayers to far exceed the home run output of previous generations, and to do\u00a0so at ages when a player\u2019s career was traditionally nearing its end.\u00a0 To gain some perspective on this topic, I thought it would be interesting to plot how different age groups have performed relative to one another over baseball history.<\/p>\n<p>In contemplating a visualization to represent this, I took note of <a href=\"http:\/\/web.colby.edu\/baseball\/files\/2016\/04\/Blog-Post-3-PDF-1.pdf\" target=\"_blank\">one study<\/a> which followed a similar premise.\u00a0 Seeking to analyze the differences between baseball eras, the group performing this study created several scatterplots representing top-10 performances in home runs, runs batted in, and on-base percentage for each year between 1900 and 2014.\u00a0 The eras are differentiated by using differently colored 
points.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/06\/Home-Run-scatterplot.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-4648 aligncenter\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/06\/Home-Run-scatterplot-620x440.png?resize=620%2C440\" alt=\"Home Run scatterplot\" width=\"620\" height=\"440\" \/><\/a><\/p>\n<p>This is visually quite accessible.\u00a0 However, I also considered another method to articulate historical eras, using \u201cregimes,\u201d or bands of color to represent the different eras.\u00a0 I noted an example of this format at <a href=\"http:\/\/blogs.sas.com\/content\/graphicallyspeaking\/2012\/01\/23\/timeseries-plots-with-regimes\/\" target=\"_blank\">this data visualization blog<\/a>, and this kind of banding appears an effective way of representing eras when using a line graph.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/blogs.sas.com\/content\/graphicallyspeaking\/files\/2012\/01\/Housing_31.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" src=\"https:\/\/i0.wp.com\/blogs.sas.com\/content\/graphicallyspeaking\/files\/2012\/01\/Housing_31.png?resize=840%2C504\" width=\"840\" height=\"504\" \/><\/a><\/p>\n<p>To add further context, I thought it might be useful to include some significant home run achievements through baseball history, and considered <a href=\"http:\/\/www.amazinavenue.com\/2012\/6\/10\/3075978\/mets-avoid-blowout-but-lose-to-yankees-again\" target=\"_blank\">another\u00a0example<\/a>, a baseball visualization created by Fangraphs, which I found to be particularly effective at demarcating both individual regimes and significant events along the timeline.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/assets.sbnation.com\/assets\/1174327\/2012-06-09-fangraphs.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" 
decoding=\"async\" class=\"aligncenter\" src=\"https:\/\/i0.wp.com\/assets.sbnation.com\/assets\/1174327\/2012-06-09-fangraphs.png?resize=590%2C375\" width=\"590\" height=\"375\" \/><\/a><\/p>\n<p>To create my visualization, I began by retrieving a CSV data table from the Fangraphs website (<a href=\"http:\/\/www.fangraphs.com\/\" target=\"_blank\">www.fangraphs.com<\/a>).\u00a0 Fangraphs allows users to customize data reports for export, and I used this feature to generate a table listing the total home runs and plate appearances, as well as the age of each individual player in each Major League Baseball season between 1871 and the season currently underway, 2016.\u00a0 I then modified the table in Excel, creating an additional column for home runs per plate appearance (HR\/PA), which calculation I applied to each row.\u00a0 I imported this CSV into Tableau Public 9.0, which I used to create the visualization.<\/p>\n<p>Rather than use home run totals for top players, as in the study above, I decided to look at the rate of home run production across the league, as represented by HR\/PA.\u00a0 This should present a more generalized picture of the home run environment throughout history, without being skewed as much by exceptional individual players, or historical differences in the length of baseball seasons.\u00a0 Because my analysis reports an average of a rate (HR\/PA), I limited my data set to \u201cqualified\u201d players in order to avoid distortions caused by players with very few plate appearances.\u00a0 <a href=\"https:\/\/en.wikipedia.org\/wiki\/Plate_appearance#Scoring\" target=\"_blank\">A qualified player<\/a> must accrue 3.1 plate appearances per team game in a season, which essentially restricts this data set to\u00a0full-time players.<\/p>\n<p>I plotted average HR\/PA rates in two separate line graphs, one representing the overall average from season to season, and one representing the average rate within four separate age groups.\u00a0 The age 
groups I selected represent what is typically considered a player\u2019s career prime (25-29), a player\u2019s post-prime (30-34), as well as very young players (under 25) and players at the end of their careers (35+).\u00a0 For the age group graph, I reported a 5-year running average, including the given year and two years on either side.\u00a0 This smoothed out some of the year-to-year variance within each age group, making it\u00a0easier to compare age groups.<\/p>\n<p>After these two graphs, I included a third line graph to indicate what percentage of the player population each age group made up in each year.<\/p>\n<p>I then added bands to each graph representing generally agreed-upon baseball eras (derived from the above study, as well as from <a href=\"http:\/\/www.netshrine.com\/era.html\" target=\"_blank\">this site<\/a>), and several lines demarcating significant single-season home run achievements.<\/p>\n<p><a href=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/06\/MLB-HR-per-PA.png\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-4652 aligncenter\" src=\"https:\/\/i0.wp.com\/studentwork.prattsi.org\/infoshow\/wp-content\/uploads\/sites\/2\/2016\/06\/MLB-HR-per-PA-620x462.png?resize=620%2C462\" alt=\"MLB HR per PA\" width=\"620\" height=\"462\" \/><\/a><\/p>\n<p>The results suggest a number of storylines.\u00a0 Most obviously, there seems to be a visible\u00a0correlation between different baseball eras and changing trends in home run rate, with the overall trend being toward more home runs.\u00a0 It does not come as a huge\u00a0surprise that home run rates correlate with eras, as those eras are rather arbitrarily defined, with one of the major predictors of their definition being shifts in home run production.\u00a0 To a large extent, then, this is a self-fulfilling prophecy, though it is gratifying to see it illustrated so clearly here, and it is 
still interesting to observe the overall changes from era to era.<\/p>\n<p>It is also interesting to note that, until World War II, younger players, particularly those under 25, showed noticeably stronger home run rates than older players, whereas after World War II this pattern gradually flips, with players under 25 consistently showing the lowest home run rates in the 2000s. \u00a0Prior to\u00a0the Live Ball Era, this may partly be because home runs reflected a player\u2019s speed as much as the ability to knock the ball over the fence, and speed is generally considered\u00a0a young person&#8217;s skill.\u00a0 One could speculate that the recent reversal may be due to better physical training available for more experienced players, as well as the effects of drug use in preserving and enhancing their strength and health.<\/p>\n<p>Finally, I note several spikes in the relative production of players 35+ compared to other age groups.\u00a0 Because the 35+ group is consistently the smallest segment among the player population, I would speculate that this is due to a small number of extraordinary individual players entering that age group at the end of their careers, thereby ballooning the average rates for the group as a whole.<\/p>\n<p>Based on some of these observations, it might be interesting to develop this visualization further, and include some interactive features.\u00a0 For one thing, the question of the impact of individual players on overall trends could be made explorable by developing a filter to add individual player rates in addition to the group rates displayed here.\u00a0 It would be interesting to see how Hank Aaron\u2019s home run production compares to his contemporaries\u2019 production, especially as he progressed through different age ranges.<\/p>\n<p>Furthermore, it would be interesting to extend this visualization to other statistical categories, to analyze similar\u00a0kinds of shifts to what\u00a0we can observe here.\u00a0 An effective display would probably be 
limited to selecting one statistical category at a time, but baseball has many to choose from, and this could make for a very interesting tool to peruse.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As an avid fan of both the sport of baseball, and the statistics which underpin the study and understanding of its history, I thought it would be interesting to examine some of those historical statistics visually.\u00a0 In particular, I am interested to see how the prevalence of the home run has changed over time, and&hellip;<\/p>\n","protected":false},"author":171,"featured_media":4652,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[1],"tags":[78,79,80,81],"coauthors":[],"class_list":["post-4647","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-visualization","tag-baseball","tag-baseball-history","tag-baseball-statistics","tag-mlb"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/paBdcV-1cX","_links":{"self":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts\/4647","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/users\/171"}],"replies":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/comments?post=4647"}],"version-history":[{"count":0,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/po
sts\/4647\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/"}],"wp:attachment":[{"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/media?parent=4647"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/categories?post=4647"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/tags?post=4647"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/studentwork.prattsi.org\/infovis\/wp-json\/wp\/v2\/coauthors?post=4647"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}