I was recently asked to take a look at a dataset on BuzzData that outlined Major League Baseball salaries for all players from 1988 to 2010. Before I get into the crux of it, BuzzData is an open source platform for people, companies, and organizations to host, upload, download, fork, and clone data, and to converse, debate, post related articles, and create visualizations on any topic, idea, or event that you can possibly think of. Beers on tap, marijuana prices, forest fires, hurricanes, ecological & environmental concerns, NASA studies, United Nations research… you name it, BuzzData awaits data from one and all. For coders and programmers, think of BuzzData as GitHub for data. For the non-GitHub, non-geek crowd, think of it as a much simpler, sexier Facebook of sorts for spreadsheets (I dislike the fact I am using Facebook to describe BuzzData given I am not a fan of them) - sorry BuzzData, please don’t hate me for that analogy but I have to get my point across. One Love.
I will profess that I am not much of a baseball expert. My heyday of watching baseball was during my punk teen years, rallying behind Canada’s then beloved Toronto Blue Jays, who managed for the first time to bring the World Series title north of the Bars & Stars, and did it two years in a row in 92 & 93 - Back to Back baby! It was an impressive feat and I will never forget the dinger out in left field by Joe Carter that sealed the deal for the back-to-back title.
One of the first things I looked at in that dataset was the highest paid player for each year, and to my surprise there was only one year where the team with the highest paid player won the World Series: the New York Yankees in 2009 with A-Rod (Alex Rodriguez). I created a simple dataset to outline & confirm this fact because I was in disbelief.
The original dataset sparked a healthy debate and some interesting visualizations from people like Artful Geek and Gary Storey. From there I wanted to dig deeper and analyze some more numbers by creating a dataset with the team payrolls from 1988 to 2010 (keep in mind there was no World Series in 94 due to a strike).
Over the 22-year period the World Series was won 7 times by the team with the highest payroll, but only two teams achieved this. The Toronto Blue Jays had the highest paid team in their back-to-back titles in 92 & 93, and the payroll king New York Yankees won with the highest paid team in 96, 98, 99, 2000 & 2009. If my math is correct, that gives the highest payroll a World Series win rate of about 32% (7 of 22) during this period. The average winning payroll ranking sits at around 7 but if you look above, no 7th ranked payroll team has won; however, there have been three 8th placed teams, and just last year the 9th placed San Francisco Giants earned the crown.
This graph sheds light on a newer trend as well. During the last decade, the highest payroll team has won only once, or 10% (Yankees 2009), and the second-highest payroll team won twice (20%: Red Sox 04 & 07). What gets really interesting is that the other 7 titles (70%) during the last decade have gone to teams ranked 8th or lower in payroll (as far down as 24th: 2003 Florida Marlins), with the ten-year average rank sitting at 9.6. That average rounds to 10th, and the 2011 payroll stats have the Detroit Tigers ranked 10th. Will they be your World Series champions for 2011? I have no idea, but I would love to get more people engaged on BuzzData to join these datasets and discussion.
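For anyone who wants to double-check the percentages, here is a minimal Python sketch of the arithmetic. It only uses the counts quoted in the text (22 World Series, 7 won by the top payroll, and the 1 / 2 / 7 split for the last decade); it does not read the BuzzData dataset, and the function and variable names are made up for illustration. Dividing 7 by 22 comes out a little under one third.

```python
def win_rate(wins, seasons):
    """Return wins as a percentage of seasons, rounded to one decimal."""
    return round(100 * wins / seasons, 1)

# Counts taken from the text, not recomputed from the payroll dataset.
SEASONS_1988_2010 = 22   # 23 seasons minus the cancelled 1994 World Series
TOP_PAYROLL_TITLES = 7   # Blue Jays 92, 93; Yankees 96, 98, 99, 2000, 2009

overall = win_rate(TOP_PAYROLL_TITLES, SEASONS_1988_2010)

# Last decade, grouped by the winner's payroll rank (hypothetical labels).
last_decade = {
    "highest payroll": 1,          # Yankees 2009
    "second-highest payroll": 2,   # Red Sox 04 & 07
    "ranked 8th or lower": 7,      # everyone else
}
decade_total = sum(last_decade.values())
decade_rates = {group: win_rate(wins, decade_total)
                for group, wins in last_decade.items()}

print(f"Top payroll win rate, 1988-2010: {overall}%")
for group, rate in decade_rates.items():
    print(f"Last decade, {group}: {rate}%")
```

The same helper could be pointed at the real payroll table once it is downloaded, by counting the seasons where the champion's payroll rank equals 1.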
I looked at some of the most influential and followed “Baseball & MLB” Twitter users, and it would be great to get them and others dialed in:
Alyssa_Milano (Yeah that Alyssa Milano. She’s a smart cookie & loves baseball)
Send these people and other baseball nuts a tweet and tell them about this awesome site.
I see @BuzzData as a place for the serious, leisurely, and humorous alike. It is a repository to store, foster, and grow ideas from data, and I believe such a platform is necessary for us to advance democracy, our environment, ideals, and at the core – our future! Imagine more governments offering up data to the public in pursuit of better solutions, a business looking for help in solving complex problems, an organization looking at ways to improve the environment. It is happening and BuzzData is setting out to further unleash this open data frenzy.
The use case for sports will be interesting as well. I believe the networks and gaming companies that truly understand this will harness platforms like these to propel their businesses. Gamification and social engagement around data have huge potential.