Winning March Means Nothing In October: Comparison of Spring Training and Regular Season MLB Performance

It's March, which means the weather will be even more Indiana-y (1), the variety of seafood specials at restaurants will increase, and baseball will be in Spring Training. MLB's preseason, in which the 30 teams play around 30 games each (2), is longer than the 4-game NFL preseason or the 8-game NBA and NHL preseasons. That raises the question: since the sample size is larger, might it be more predictive than the NFL's preseason? Sparked partially by the following Facebook post by my brother-in-law, I decided to determine if that 7-0 record is a reason for optimism about the upcoming season (if you are an Angels fan).

Using ESPN data for the 2003 to 2016 Spring Trainings (3) and Lahman’s Baseball Database (4) for the 2003-2016 regular seasons, I compared each team's Spring Training and Regular Season Pythagorean Expectations. Because it is built from runs scored and runs allowed rather than wins and losses, the Pythagorean Expectation gives a better sense of a team's quality than the win-loss record, with less variability due to chance.
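
For readers who have not seen the formula, here is a minimal sketch of the calculation. It assumes the classic form with an exponent of 2; the post does not say which exponent was used, so treat that (and the function name) as my assumption. The run totals in the usage line are the commonly cited 2016 Cubs figures (808 scored, 556 allowed), typed in by hand rather than pulled from the dataset.

```python
def pythagorean_expectation(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the classic form; it is an assumption here,
    not something the post states.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    if scored + allowed == 0:
        raise ValueError("at least one run total must be positive")
    return scored / (scored + allowed)


# Hand-entered 2016 Cubs regular-season totals (assumed, see above).
cubs_2016 = pythagorean_expectation(808, 556)
print(f"{cubs_2016:.2%}")
```

With an exponent of 2, those totals come out to roughly the same 67.87% quoted later in the post, which suggests (but does not prove) that this is the form used.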

The results confirmed the conventional wisdom that Spring Training is not predictive of the Regular Season. The correlation coefficient between the Regular Season and Spring Training Pythagorean Expectations is .2272, indicating a weak correlation. For example, the 2016 Chicago Cubs were under .500 by Pythagorean Expectation (48.53%) in Spring Training but had a Regular Season Pythagorean Expectation of 67.87%; the latter would rank 7th since integration (the 1947 season), a sharp contrast with a quite middling Spring Training. While there are teams that were great in both or terrible in both, examples of mixed records abound as well. Some of this is because primary players tend to be limited in Spring Training to prevent injury, giving the lineups a different look than in the regular season, and some is due to luck. So it would be unwise to put much stock in Spring Training performance; what happens in March stays in March, and everyone is 0-0 come April.
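
The correlation coefficient here is the ordinary Pearson r. A rough sketch of how it could be computed on paired team-seasons follows; the function name and the five sample values are made up for illustration and are not rows from the ESPN or Lahman data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sum((x - mean_x) ** 2 for x in xs)
    dev_y = sum((y - mean_y) ** 2 for y in ys)
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cross / sqrt(dev_x * dev_y)


# Hypothetical Pythagorean Expectations, one pair per team-season.
spring = [0.55, 0.48, 0.61, 0.43, 0.50]
regular = [0.52, 0.51, 0.49, 0.47, 0.56]
print(round(pearson_r(spring, regular), 4))
```

In the real analysis each list would hold one entry per team per year, 2003 through 2016.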

Similar to the NFL analysis, there was a strong correlation (in this case, -.7267) between a team's Spring Training Pythagorean Expectation and the difference between its Spring Training and Regular Season Pythagorean Expectations. This is not too surprising; outliers (high or low) in Pythagorean Expectation are unusual in a league where most teams are relatively close to 50%, and as sample size increases there should be some regression to the mean.
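
A strongly negative number here is close to built in whenever the two seasons are only weakly correlated. The sketch below works that out from the standard covariance identities, assuming the difference is taken as Regular Season minus Spring Training (the sign of the reported figure suggests that, but the post does not spell it out). The function and parameter names are hypothetical.

```python
from math import sqrt


def corr_spring_vs_change(rho, sd_spring, sd_regular):
    """Correlation between spring value S and the change (R - S).

    rho is corr(S, R); sd_spring and sd_regular are the standard
    deviations of S and R. Uses:
      cov(S, R - S) = rho*sd_S*sd_R - sd_S**2
      var(R - S)    = sd_R**2 + sd_S**2 - 2*rho*sd_S*sd_R
    """
    cov = rho * sd_spring * sd_regular - sd_spring ** 2
    var_change = sd_regular ** 2 + sd_spring ** 2 - 2 * rho * sd_spring * sd_regular
    if sd_spring <= 0 or var_change <= 0:
        raise ValueError("spring spread and the change must both vary")
    return cov / (sd_spring * sqrt(var_change))


# Equal spreads in both seasons, using the post's r of .2272.
print(round(corr_spring_vs_change(0.2272, 1.0, 1.0), 4))
```

With equal spreads and r = .2272 this comes out to about -.62. A more negative value, such as the -.7267 found here, would be consistent with Spring Training results being more spread out than Regular Season results, which is what a roughly 30-game sample would tend to produce.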

What was surprising in the data was the comparison between the results from the NFL preseason and MLB preseason. Despite more games in Spring Training, the NFL's preseason was more predictive, with a correlation coefficient of .2618 compared to MLB's .2272; the correlation coefficients between the preseason and the difference between the preseason and regular season were -.6843 for the NFL and -.7267 for MLB. There are two factors that likely contribute to this. The first is that although there are more games in Spring Training than in the NFL preseason, they represent a smaller number of games relative to the length of the regular season: 25% for the NFL compared to 18.41% for MLB. Given the greater rotation among key players in baseball than in football (pitchers rotate; QBs do not), that factor has an even greater impact in producing small samples. The second is that in the NFL, the separation of team performance has been greater; the standard deviation of Pythagorean Expectations was 16.23% in the NFL compared to 6.882% for MLB. Mostly because the longer season causes performance to group around the mean, MLB teams tend to be less exceptional; the 2016 Chicago Cubs' Pythagorean Expectation was 7th in MLB since 1947, but would only rank 47th in the NFL between 2006 and 2015. This grouping around the center means that the difference between great and mediocre teams tends to be smaller in MLB than in the NFL.
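
To give a feel for how much season length alone tightens the grouping, here is a back-of-the-envelope sketch. It treats every game as an independent coin flip for a true .500 team and looks at winning percentage rather than Pythagorean Expectation, so it is a simplified illustration under my own assumptions and not a reproduction of the standard deviations above.

```python
from math import sqrt


def chance_sd_of_win_pct(n_games, true_win_prob=0.5):
    """Standard deviation of winning percentage due to chance alone.

    Assumes each game is an independent trial with the same win
    probability (a binomial model), which real seasons only approximate.
    """
    if n_games <= 0:
        raise ValueError("n_games must be positive")
    if not 0.0 <= true_win_prob <= 1.0:
        raise ValueError("true_win_prob must be between 0 and 1")
    return sqrt(true_win_prob * (1.0 - true_win_prob) / n_games)


# 16-game NFL season versus 162-game MLB season.
for league, games in (("NFL", 16), ("MLB", 162)):
    print(league, f"{chance_sd_of_win_pct(games):.2%}")
```

Under this model, luck alone moves a .500 team's winning percentage by about 12.5 points over 16 games but only about 3.9 points over 162, so a long season leaves far less room for extreme records.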

While Spring Training (and other preseasons) may not be predictive of the Regular Season, that doesn't mean they aren't important. They give teams opportunities to evaluate talent and get players into shape, and they give managers a chance to experiment with lineups. Most importantly for baseball fans, Spring Training is baseball, and a sign that Opening Day is coming soon.

Footnotes and Citations

(1) Indiana-y weather is characterized as containing extreme swings of temperature, wind, cloud cover, and/or precipitation over the course of a short period of time, including within a single day. There is also the potential for extreme events including violent storms, tornadoes, massive quantities of snow, and very low and very high temperatures.

(2) The average number of Spring Training games per team, calculated from data available from ESPN for 2003-2016, is 29.86.

(3) http://www.espn.com/mlb/standings/_/group/league/season/2016/seasontype/pre

(4) http://www.seanlahman.com/baseball-archive/statistics/

Speak With Data
