Wednesday, May 21, 2008

Methods and Statistics Explained

Baseball Playoffs Now's Methods and Statistics series explains how we get the results we do - and especially why our mathematical interpretations differ from public stats posted on sites like ESPN's MLB standings or stats pages.

Part I: History

History - or the concept of "fading memory" in stats - is used in almost every application on this site. Baseball Playoffs Now firmly believes that August play is more important than April play when predicting playoff contenders or outcomes. Most commentators deal with a team's Last 10 Games because that is an easy statistic to track, and we offer standings over that same period as a comparative value. However, unlike those commentators, we do not see a team's record broken into two categories: Last 10 Games (most important) and Ancient History (least important).

Baseball Playoffs Now has created a unique algorithm to rate each game in the season with a historical value, a coefficient that is included in almost every formula we use. This historical coefficient is quite simple: if today's game were the Brewers' 11th game of the season, then Game 6 - the game directly in the middle of the season so far - would have a value of 1.0 (that is, no adjustment whatsoever). Game 1 would have a value around 0.97 and Game 11 would have a value around 1.03. These bottom and top values (0.97 and 1.03) move farther apart as more games are played.

After yesterday's games, 687 games have been played across the league this season, an average of 45.8 games per team. That means that Game 1 of this season is rated 85.8% as important as the middle game, and Game 46 of this season (from yesterday) is rated 114.2% as important as the middle game.
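As a rough illustration of the sliding scale, here is a minimal Python sketch. It assumes the scale is a straight line centered on the middle game, and it uses a hypothetical per-game step of 0.0063, a value chosen only because it reproduces the figures quoted above (about 0.97 and 1.03 after 11 games, 85.8% and 114.2% after 46). The function name `history_weight` and the constant `STEP_PER_GAME` are illustrative and are not the site's actual formula.

```python
# Assumption: the history value is a straight line centered on the middle
# game of the season so far. STEP_PER_GAME is a hypothetical slope fitted
# to the figures quoted in the post; it is not a published constant.
STEP_PER_GAME = 0.0063


def history_weight(game_number, games_played, step=STEP_PER_GAME):
    """Return the history value for a 1-based game number.

    The middle game of the season so far gets exactly 1.0; earlier games
    get less than 1.0 and later games get more.
    """
    midpoint = (games_played + 1) / 2  # Game 6 when 11 games have been played
    return 1.0 + (game_number - midpoint) * step


if __name__ == "__main__":
    for n in (11, 46):
        first = history_weight(1, n)
        last = history_weight(n, n)
        print(f"After {n} games: Game 1 = {first:.4f}, Game {n} = {last:.4f}")
```

Because the line is symmetric around the middle game, the weights in this sketch always average 1.0, so weighted season totals stay on the same scale as the raw totals.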

Now, history becomes important when we look at statistics like runs scored and runs allowed. The Yankees have scored 181 runs and allowed 209 runs this season. But multiplying the runs in each of their games by that game's history value allows us to see the Yankee trend: their historically weighted totals are 178.6 runs scored to 209.2 runs allowed. The interpretation is clear: New York has been doing worse recently than its season totals suggest. And this realization that New York is on a slide must be taken into account when making predictions and rankings.
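The weighting step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the site's actual code: the history value is assumed to be a straight line centered on the middle game, the per-game step of 0.0063 is a hypothetical value fitted to the figures quoted in this post, and the game scores in `sample_games` are made up to show a team whose offense is fading.

```python
# Assumptions: linear history scale, hypothetical slope, made-up scores.
STEP_PER_GAME = 0.0063


def history_weight(game_number, games_played, step=STEP_PER_GAME):
    """History value for a 1-based game number; the middle game gets 1.0."""
    midpoint = (games_played + 1) / 2
    return 1.0 + (game_number - midpoint) * step


def weighted_run_totals(games, step=STEP_PER_GAME):
    """Return (weighted runs scored, weighted runs allowed).

    `games` is a list of (runs_scored, runs_allowed) tuples in the order
    the games were played.
    """
    n = len(games)
    scored = 0.0
    allowed = 0.0
    for number, (runs_scored, runs_allowed) in enumerate(games, start=1):
        weight = history_weight(number, n, step)
        scored += runs_scored * weight
        allowed += runs_allowed * weight
    return scored, allowed


# Made-up scores: strong early, weak lately.
sample_games = [(7, 3), (6, 4), (5, 4), (3, 5), (2, 6)]

raw_scored = sum(rs for rs, _ in sample_games)
raw_allowed = sum(ra for _, ra in sample_games)
w_scored, w_allowed = weighted_run_totals(sample_games)

if __name__ == "__main__":
    print(f"Raw:      {raw_scored} scored, {raw_allowed} allowed, "
          f"ratio {raw_scored / raw_allowed:.3f}")
    print(f"Weighted: {w_scored:.2f} scored, {w_allowed:.2f} allowed, "
          f"ratio {w_scored / w_allowed:.3f}")
```

For a team on a slide, the weighted runs scored come out below the raw total and the weighted runs allowed come out above it, which is the same pattern as the Yankees' numbers above. Dividing one weighted total by the other gives a weighted run ratio.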

Our Last 10 Games stat also bears the marks of history; we do not weight the last 10 games equally but rather give each of them its normal history weight on the same sliding scale as every other game. The absolute best way to look at a team's recent play is to look at the season as a whole, with every game weighted for its historical importance. But in order to make comparisons with the well-known Last 10 stat, we also offer a look at a team's season limited to its most recent 10 games and rank the teams on that measure as well.

So if you're checking our run ratios and wondering why we just quoted Atlanta at 1.310 runs scored for every 1 allowed (when their 217 runs scored and 166 runs allowed clearly show a ratio of 1.307), the answer is that Atlanta's excellent recent play has bumped up their weighted run ratio by a slight amount. It doesn't sound like a lot, but the difference between the #4 Red Sox at 1.195 and the #5 White Sox at 1.190 is only 0.005 in run ratio. Statistics calculated over many low-scoring games show very small differences between teams; this is why a value for history is so important, because it clearly illuminates the better team (by one measure) between nominally equal clubs.
