I always stress taking the long-term view on the season. 162 games is a long way to go, and getting caught up in little things like getting swept at this stage shouldn’t mean much. After all, we just have to win one more game than the Diamondbacks every two months to make up the difference. That doesn’t seem like much of a task, does it? But when I got to thinking about it, there’s a really good chance that the Dodgers and Diamondbacks fight each other all the way down the stretch, and it seems very likely we will finish within three games of them. Looking at it that way, this sweep can make a huge difference. The question is, how huge?
PECOTA came into this season ranking the Dodgers and Diamondbacks as equals: both teams would finish with 87 wins, and we would have a one-game playoff to determine who took the NL West. Because of this, the two teams came into the series with nearly equal chances to make the playoffs. Arizona had about a half-percentage-point lead, most likely because they haven’t gotten to play the Giants yet. After the Diamondbacks took all three games, though, that half-point gap became a swing of about five percentage points in each direction. The sweep means the Diamondbacks are now about 10 percentage points more likely than the Dodgers to make the playoffs. Now, the important question: does the playoff odds report mean anything at this point in the season? First, let Clay Davenport describe how the odds are calculated.
As the title says, the post-season odds report was compiled by running a Monte Carlo simulation of the rest of the season one million times. Current wins, losses and expected winning percentages are taken from the Adjusted Standings Report. The expected winning percentage (EWP) for each team starts with their W3 and L3 from the Adjusted Standings. A regression is applied to derive the EWP for the rest of the season, which is going to be between the current winning percentage and .500. To allow for uncertainty in the EWP, a normal distribution centered on the EWP is randomly sampled, and that value is used for the remainder of the season in that iteration. To simulate the normal 4% home-field advantage, the home team gets a .020 point bonus, while the visitors take a .020 penalty. The likelihood of winning each game is determined by the log5 method.
At least, that's how it works in the regular version. For this PECOTA-adjusted version, the regression isn't done to the mean (.500), but rather to the PECOTA projection made at the beginning of the year (numbers which can be found here). The Yankees were projected to be a .580 team, so the EWP described above will nudge their record towards .580, not .500.
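To make the per-game step a little more concrete, here is a rough Python sketch of the log5 calculation with the home-field adjustment described above. The function names are mine, the way the .020 bonus and penalty are applied before log5 is my reading of the description, and the `spread` used when sampling the EWP is a made-up placeholder, since the write-up doesn't say how wide that normal distribution is.

```python
import random

HOME_EDGE = 0.020  # from the description: home team +.020, visitors -.020


def log5(p_a: float, p_b: float) -> float:
    """Bill James's log5 estimate that team A beats team B,
    given each team's winning percentage against a .500 league."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


def home_win_probability(home_ewp: float, away_ewp: float) -> float:
    """Chance the home team wins one game.
    Assumption: the bonus and penalty are applied to the EWPs before log5."""
    return log5(home_ewp + HOME_EDGE, away_ewp - HOME_EDGE)


def sample_ewp(ewp: float, spread: float = 0.03, rng=random) -> float:
    """Draw one season-long EWP from a normal distribution centered on ewp.
    `spread` is a hypothetical standard deviation, not a published value."""
    draw = rng.gauss(ewp, spread)
    return min(max(draw, 0.05), 0.95)  # keep the draw a usable probability
```

With two .500 teams, `home_win_probability(0.5, 0.5)` comes out to roughly .540, which lines up with the "normal 4% home-field advantage" in the description.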
More succinctly, at this point in the season, the system basically assumes that each team will perform at its PECOTA-projected winning percentage, then simulates the season one million times to generate the odds. The best interpretation of this data is that if you take two evenly matched teams and give one a three-game head start with 153 games left, the team with the head start will make the playoffs about 10 percentage points more often.
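If you want to play with the head-start idea yourself, here is a toy Python version. It is a big simplification and not the real odds report: it assumes every remaining game is an independent coin flip at an 87-win pace, ignores head-to-head games, the schedule, and every other team, and only asks who finishes ahead in a two-team race, with ties settled by a coin flip to stand in for the one-game playoff. The function name and its parameters are my own.

```python
import random


def head_start_edge(head_start: int = 3, games_left: int = 153,
                    win_pct: float = 87 / 162, trials: int = 20_000,
                    seed: int = 0) -> float:
    """Fraction of simulated seasons in which the team holding the head
    start finishes ahead of an equally talented rival."""
    rng = random.Random(seed)
    leader_ahead = 0.0
    for _ in range(trials):
        # Each team plays out its remaining games as independent coin flips.
        leader = head_start + sum(rng.random() < win_pct
                                  for _ in range(games_left))
        trailer = sum(rng.random() < win_pct for _ in range(games_left))
        if leader > trailer:
            leader_ahead += 1
        elif leader == trailer:
            leader_ahead += 0.5  # tie: treat the one-game playoff as 50/50
    return leader_ahead / trials


if __name__ == "__main__":
    print(f"Leader finishes ahead in {head_start_edge():.1%} of seasons")
```

Because this only measures who finishes ahead of whom, not who actually reaches the playoffs, it won't reproduce the figure from the odds report. It is still a handy way to see how much a few games in April can tilt a race between equals.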
Now, there are lots of people who are willing to throw this data away for several very valid reasons. PECOTA projections aren’t gospel: they can’t predict injuries, sudden breakouts and collapses, or the distribution of playing time. All of these are very true, and the 10 percent number isn’t the important part of this. Whether or not you agree with PECOTA, I think the majority of people will agree that the Dodgers and the Diamondbacks are very evenly matched teams. The important part is that when two teams are very likely to fight tooth and nail through the entire season, every game counts. I’ve complained about the length of the season in the past, saying that these games just don’t mean much, but as the numbers show, they certainly do. These games mean just as much as the ones in September, and simply winning an extra game every two months over an equally talented team isn’t quite as easy as it sounds.