There's been a lot of clamor over luck recently: What is it? How does it manifest itself? To what degree does it affect a game, a playoff series, a season? How do we quantify it? Some of those questions are irrelevant, but others carry great meaning to even the most casual fan (or should, anyway). Some of them can't be answered directly using anything resembling this approach. Regardless, I hope to shed a definitive quantitative light on the basics of NHL seasonal point variance. Some people might not like my approach; if you are one of those people, I encourage you to follow the groundwork laid in this trio of articles by the brilliant Brian Burke at Advanced NFL Stats. He presents a different but equally valid approach that might be easier to swallow even if it's significantly more complicated.
My method is rather simple. I assume the closing line at Pinnacle Sports, with overtime and shootouts included, is the most accurate representation of a team's chance of winning a hockey game. If you do not accept this, I cannot do much to help you believe it except urge you to put your money in play. Closing lines, even early in the season, are remarkably efficient predictors on a game-by-game basis (barring, I don't know, an early in-game injury or a literal game-time scratch of a high-GVT player?). Anyway, after converting the juiced lines to strict odds for all 1,230 games, I run a Monte Carlo simulation of the entire 2009-10 season 100,000 times. The frequency of three-point games is modeled linearly with respect to the larger of the two teams' win probabilities in each game (R^2 of .91 or thereabouts). The point distributions are collected, and as a bonus feature I also check division and conference position according to the 2009-10 tie-breaking procedures (goal differential is a little nastier and a lot more time-consuming to model, so any tie that came down to GD was decided at random).
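For readers who want to see the mechanics, here is a minimal Python sketch of the steps described above. It is not my actual code. The vig is removed by simple proportional rescaling, which is only one common way to do it. The overtime coefficients `OT_INTERCEPT` and `OT_SLOPE` are made-up placeholders rather than the fitted values, the overtime draw is assumed to be independent of who wins, and the three-team schedule and its moneylines are invented purely for illustration.

```python
import random

def implied_prob(moneyline):
    """Raw implied probability from an American moneyline (vig still included)."""
    if moneyline < 0:
        return -moneyline / (-moneyline + 100.0)
    return 100.0 / (moneyline + 100.0)

def no_vig_probs(home_line, away_line):
    """Strip the juice by rescaling both implied probabilities to sum to 1."""
    h, a = implied_prob(home_line), implied_prob(away_line)
    total = h + a
    return h / total, a / total

# ASSUMPTION: placeholder coefficients, not the fitted values from the article.
OT_INTERCEPT = 0.30
OT_SLOPE = -0.15

def ot_prob(p_home):
    """Linear model: chance of a three-point game given the favorite's win probability."""
    return OT_INTERCEPT + OT_SLOPE * max(p_home, 1.0 - p_home)

def simulate_season(games, rng):
    """games: list of (home, away, p_home). Returns {team: points} for one simulated season."""
    points = {}
    for home, away, p_home in games:
        winner, loser = (home, away) if rng.random() < p_home else (away, home)
        points[winner] = points.get(winner, 0) + 2
        # ASSUMPTION: the loser point is drawn independently of who won.
        loser_point = 1 if rng.random() < ot_prob(p_home) else 0
        points[loser] = points.get(loser, 0) + loser_point
    return points

def simulate_many(games, n_sims, seed=0):
    """Collect each team's point total across n_sims simulated seasons."""
    rng = random.Random(seed)
    totals = {}
    for _ in range(n_sims):
        for team, pts in simulate_season(games, rng).items():
            totals.setdefault(team, []).append(pts)
    return totals

if __name__ == "__main__":
    # Hypothetical toy schedule: (home, away, home moneyline, away moneyline).
    lines = [("COL", "EDM", -150, 130), ("EDM", "TOR", -110, -110), ("TOR", "COL", 120, -140)]
    schedule = [(h, a, no_vig_probs(hl, al)[0]) for h, a, hl, al in lines]
    results = simulate_many(schedule, n_sims=1000)
    for team, pts in sorted(results.items()):
        print(team, sum(pts) / len(pts))
```

Swapping the toy schedule for all 1,230 games and raising `n_sims` to 100,000 gives the shape of the full experiment; the standings and tie-breaker checks are left out here.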
There are a few caveats. The final mean does not represent true team strength by any stretch: teams suffer injuries, lineup changes, and especially differing schedules, all of which non-trivially affect the odds of winning any game. Therefore I don't think the means are comparable, especially across conferences. I really intended this to be a team-by-team study of luck to examine its wide-ranging effect on an NHL season. Also, my means add up to about 28 points fewer than were actually awarded last year (2,733 simulated versus 2,761 actual). Take that as you will. Now for the fun.
The average standard deviation on points is 8.33. This means a team that averages 92 points a season, roughly a bubble team for the playoffs, will finish outside the range of 87 (surely out) to 98 (surely in) over half the time. The 90% confidence interval extends from about 79 to 106. Although this next statement is pretty misleading, according to the means, that's roughly the difference between the best and worst teams in the NHL last year. Given that 30 teams compete in the NHL, we should expect to see that kind of variance, in both directions, every year. And, as hard as it might be for some to believe, we do. Our perception of team strength can be seriously marred by faulty data like wins and points. Washington wasn't 121 points good (although the Presidents' Trophy was really no fluke, honest), Phoenix wasn't 107 points good, Toronto wasn't 74 points bad, and Edmonton wasn't 62 points bad (they, quite remarkably, won their division almost 0.9% of the time in the simulations). Seasons like those could happily define variance in sports, and they account for roughly an eighth of the NHL last year. While it is human to attribute (what I would call) metaphysical reasons to it, there's not much we can do from the sidelines but attribute luck to luck, variance to variance.
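The bubble-team figures above can be roughly checked with a few lines of Python. This is only a sketch under a normal approximation with mean 92 and standard deviation 8.33; the simulated point totals are discrete and need not be exactly normal, so the output should land near the quoted numbers rather than match them exactly. The function names are just illustrative.

```python
from statistics import NormalDist

MEAN_POINTS = 92.0   # the hypothetical bubble team from the text
STD_POINTS = 8.33    # average standard deviation on points
N_TEAMS = 30

# ASSUMPTION: season point totals are approximated by a normal distribution.
season = NormalDist(mu=MEAN_POINTS, sigma=STD_POINTS)

def prob_outside(dist, low, high):
    """Probability of finishing below `low` or above `high`."""
    return dist.cdf(low) + (1.0 - dist.cdf(high))

def central_interval(dist, coverage):
    """Symmetric interval containing `coverage` of the probability mass."""
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)

p_out = prob_outside(season, 87, 98)
low90, high90 = central_interval(season, 0.90)

# Expected number of teams landing outside their own 90% interval in one season.
# Linearity of expectation means this needs no independence between teams.
expected_outliers = N_TEAMS * (1.0 - 0.90)

print(f"P(outside 87-98): {p_out:.3f}")
print(f"90% interval: {low90:.1f} to {high90:.1f}")
print(f"Expected teams outside their 90% interval: {expected_outliers:.1f}")
```

Under these assumptions the chance of finishing outside 87 to 98 comes out just over one half, and on average about three of the 30 teams fall outside their own 90% interval in any given season, which is the sense in which we should expect to see outlier seasons every year.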
So here is a fun Excel spreadsheet for you to look at and play with. The "Prob" column is the probability, according to the Monte Carlo simulations, that the team would earn at least its actual 2009-10 point total. Everything else (Mean, Median, Mode, Std Dev, Division Win %, Playoff %) is hopefully self-explanatory. To the right of the table there is a drop-down box for you to pick a team; below the table a graph will auto-populate with probability densities for a range of points (I use a normal distribution because it's much easier than storing all the data from the simulations). Presented without comment (that's your job), here are our friends from Colorado to whet your appetite:
If anybody wants my full data set or is interested in similar applications, let me know.
Update: New Excel spreadsheet with larger sample size available here.