When shot quality was first analyzed several years ago by Alan Ryder and Ken Krzywicki, it was hailed as a big breakthrough in the numerical analysis of hockey. Sadly, biases in the way the NHL records shot distance caused the data to be less reliable than it should have been, and we’ve even discovered that there may be bias in the recording of shot totals (Objective NHL had a great series on this here, here and here; see also my analysis here). In recent years, we have come full circle: Corsi numbers have become a popular way of rating players, and many members of the Corsi crowd have called into question the very existence and validity of shot quality. I think it is time for this to stop. Shot quality exists, it is measurable and it is verifiable. Now it’s time to convince you of this.
The factors affecting shot quality
Let’s be clear on one thing: nobody denies that “individual” shot quality, meaning the level of danger of an individual shot, exists. In Ken Krzywicki’s shot quality model, he identified 5 factors that were statistically significant that affected a shot’s level of danger: the distance from which the shot was taken, whether or not the shot was a rebound (defined as a shot taken at less than 25 feet within 2 seconds of a previous shot), the manpower situation (even-strength, power-play, short-handed), whether the shot was taken immediately after a turnover and the shot type. Of these 5 factors, the first 3 were very important, while the last two were marginal.
One of the big recent discoveries has been that a fourth major factor, the game score, has a significant effect on shooting percentages (see here and here), which is what shot quality tries to measure.
Most of what I have described here is the accepted wisdom of the puckmetric community. Given that the power-play skews percentages, the point of debate is even-strength percentages. The argument is: “Even-strength shooting percentage varies randomly from team to team”, or the even more narrow variant “Even-strength shooting percentage with the score tied varies randomly from team to team”. I disagree.
Is aggregate shot quality significant?
To observe the magnitude of shot quality, I have constructed a shot quality model that estimates the probability of each shot going in, based on the same factors as Krzywicki’s model plus the game score. This then allowed me to calculate, for the 90 team-seasons of the last 3 years, the expected goal differential based on shot differential and the expected goal differential based on shot quality. Because I am only calculating shot quality differential, arena biases should not have a large impact on my result.
Aggregate numbers for my sample were:
- An average of 1785 even-strength shots for/against per team
- Average shooting percentage per shot of 8.0% with standard deviation of 7%
This means that, simply by luck, we would expect the standard deviation of expected goals of a team due to shot quality to be sqrt(1785 * 2) * 7% = 4.2 goals, and we would expect the luckiest of the teams in our sample to have somewhere in the order of +11 goals (almost 3 standard deviations), and the unluckiest -11 goals. It turns out to be a lot more than that:
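The luck-only estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes, as the calculation in the text does, that shots are independent and that the 7% per-shot standard deviation applies equally to shots for and against. The function and variable names are illustrative.

```python
import math

# Figures quoted in the article.
shots_per_side = 1785   # average even-strength shots for (and against) per team
per_shot_sd = 0.07      # per-shot standard deviation quoted in the article


def luck_sd(shots_for, shots_against, sd_per_shot):
    """Standard deviation of a sum of independent per-shot deviations.

    Assumes every shot, for or against, contributes the same
    independent spread, so variances add across all shots.
    """
    return math.sqrt(shots_for + shots_against) * sd_per_shot


sd_goals = luck_sd(shots_per_side, shots_per_side, per_shot_sd)
print(f"luck-only standard deviation: {sd_goals:.1f} goals")
print(f"+11 goals is {11 / sd_goals:.1f} standard deviations out")
```

With the article's inputs this gives a standard deviation of about 4.2 goals, so +11 goals sits roughly 2.6 standard deviations from zero.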
Best shot quality teams 5-on-5
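| Team | Year | GF | GA | expGF | expGA | SF | SA | Shot Volume | Shot Quality | Goal Differential |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Penguins | 2008 | 142 | 122 | 146 | 142 | 1580 | 1812 | -19 | 23 | 20 |
| Avalanche | 2010 | 154 | 146 | 148 | 153 | 1722 | 1977 | -20 | 16 | 8 |
| Sabres | 2009 | 152 | 144 | 145 | 131 | 1838 | 1837 | 0 | 14 | 8 |
| Sharks | 2008 | 131 | 120 | 141 | 102 | 1734 | 1403 | 26 | 12 | 11 |
| Avalanche | 2008 | 156 | 139 | 150 | 134 | 1742 | 1680 | 5 | 11 | 17 |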
Nobody who has looked at shot volume stats should be surprised to find the 2008 Pittsburgh Penguins on this list. In the colorful language of JLikens, the Penguins were getting “murdered” on Corsi, yet still managed to outscore their opponents by 20 goals at even-strength. This would have been quite a feat had they been taking shots of similar quality to their opponents’, since they scored 33% more goals per shot than their opponents and beat their expectations by 39 goals. Adding shot quality into the equation, things become clearer: they only beat expectations by 16 goals.
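The Penguins arithmetic can be checked directly from their row in the table. The sketch below rests on my reading of the columns, not on a stated definition: Shot Volume is taken to be the shot differential times the 8.0% average shooting percentage, and Shot Quality is whatever remains of the expected goal differential (expGF minus expGA) once volume is removed. The function name and dictionary keys are illustrative.

```python
LEAGUE_SH_PCT = 0.08  # average even-strength shooting percentage from the article


def decompose(gf, ga, exp_gf, exp_ga, sf, sa):
    """Split a team's goal differential into volume, quality and residual.

    ASSUMPTION: volume = shot differential * league shooting percentage,
    quality = (expGF - expGA) - volume. This matches the table rows to
    within rounding but is inferred, not quoted.
    """
    volume = (sf - sa) * LEAGUE_SH_PCT
    quality = (exp_gf - exp_ga) - volume
    actual = gf - ga
    return {
        "volume": volume,
        "quality": quality,
        "beat_volume_only": actual - volume,
        "beat_with_quality": actual - (exp_gf - exp_ga),
        "goals_per_shot_ratio": (gf / sf) / (ga / sa),
    }


pens_2008 = decompose(gf=142, ga=122, exp_gf=146, exp_ga=142, sf=1580, sa=1812)
for name, value in pens_2008.items():
    print(f"{name}: {value:+.2f}")
```

Rounded to whole goals, this gives -19 for volume, 23 for quality, 39 goals above a volume-only expectation and 16 above the expectation that includes quality, with a goals-per-shot ratio of about 1.33.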
The second team on this list is probably the most interesting for current readers, since it involves the most controversial team of the season, the Colorado Avalanche. Much has been written about the Avalanche and their disastrous shot differentials (for the most insightful of these articles, read Gabriel Desjardins). But when you take shot quality into account, suddenly the Avalanche look like an average team, rather than the insanely lucky pretenders they have been masquerading as all season.
What about the other end of the spectrum?
Worst shot quality teams 5-on-5
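| Team | Year | GF | GA | expGF | expGA | SF | SA | Shot Volume | Shot Quality | Goal Differential |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hurricanes | 2009 | 135 | 137 | 152 | 149 | 1964 | 1767 | 16 | -13 | -2 |
| Maple Leafs | 2008 | 147 | 154 | 135 | 136 | 1843 | 1686 | 13 | -13 | -7 |
| Red Wings | 2008 | 147 | 103 | 139 | 107 | 1958 | 1380 | 46 | -14 | 44 |
| Thrashers | 2010 | 161 | 168 | 144 | 174 | 1795 | 1994 | -16 | -14 | -7 |
| Red Wings | 2009 | 167 | 141 | 150 | 134 | 2089 | 1621 | 37 | -21 | 26 |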
The Detroit Red Wings? Blasphemy! In fact, this matches the anecdotal evidence on Detroit, that they have traditionally generated high shot totals with many shots by defensemen, using the first shot as a spark to generate traffic, rebounds and further scoring chances.
Overall, the variance of shot quality is roughly 3 times what would be expected purely from luck, which means that the shot quality effect is easily statistically significant. As an explanatory variable, it is 4 to 5 times less important than shot differential. I’m not saying that shot quality is the be-all-and-end-all. For outlier teams that excel or are terrible at it, it is important to take into account. For most teams, it can be neglected. But to neglect it broadly is to fail in our fundamental task to understand and explain hockey.
If shot quality is so significant, why did so many intelligent people who were looking for it fail to find it? I can see three main reasons. The first is analysis technique. For example, I could very well have taken my shot quality metric, correlated it to goal differential, and found that the correlation coefficient is -0.004, while the correlation coefficient of shot differential is a cool 0.500. Thus, I could have concluded triumphantly that “Shot quality is a myth and has no correlation with observed goal differential.” I could also have summed shot differential and shot quality and correlated THAT with goal differential, getting 0.515, and concluded “Shot quality has only a marginal explanatory power over and above shot differential”. I could also have done a number of other analyses, each with their own conclusions. None of these techniques are wrong per se; they just hide the explanatory variable in a cloud of noise, such that the results are inconclusive.
The second problem is that shot quality correlates negatively with shot volume. We can see this easily in our leaderboard: the best shot quality teams (2008 Penguins, 2010 Avalanche) tend to get outshot significantly, while the worst shot quality teams (every team in the bottom 5 except 2010 Atlanta) are all shot-positive. This is why shot quality alone has no strong correlation with goal differential: the Avalanche’s shot quality advantage doesn’t even bring them back to break-even in terms of expected goals!
The last frequently encountered problem is sample size. Ever since the discovery of the leading/trailing bias, it has become popular to analyze data while tied at even-strength. NHL games are tied roughly 1/3 of the time, so this means that over an entire season, an average team will have 600 shots for, 600 shots against, 48 goals for and 48 goals against while tied at even-strength. This is a minuscule amount of data, and any analysis you perform over that time period will yield the same result: random. It will be worse yet if you restrict yourself to the road, to eliminate arena bias. No wonder no effect other than shot differential can be teased out of the data!
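To put a rough number on how small the score-tied sample is, the sketch below redoes the arithmetic and adds a binomial estimate of the noise. The binomial model (each shot scoring independently with the same 8% probability) is my simplifying assumption, not something the analysis above relies on.

```python
import math

# Figures quoted in the article.
shots_per_season = 1785   # even-strength shots for per team
tied_fraction = 1 / 3     # share of game time with the score tied
sh_pct = 0.08             # average even-strength shooting percentage

tied_shots = shots_per_season * tied_fraction
tied_goals = tied_shots * sh_pct

# ASSUMPTION: binomial noise, every shot an independent 8% chance.
goal_sd = math.sqrt(tied_shots * sh_pct * (1 - sh_pct))
pct_sd = goal_sd / tied_shots

print(f"score-tied shots per season: {tied_shots:.0f}")
print(f"score-tied goals per season: {tied_goals:.0f}")
print(f"luck-only sd of goals: {goal_sd:.1f}")
print(f"luck-only sd of shooting percentage: {pct_sd * 100:.1f} points")
```

Under that assumption, luck alone moves a team's score-tied goal total by about 6.6 goals (one standard deviation) on a base of roughly 48, or a little over one percentage point of shooting percentage.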
Is it sustainable?
The last question that often comes up when we observe an effect like this: is it sustainable? The most obvious way I found to answer this question was to correlate year-over-year shot quality differential team-per-team. Ideally, I would have compared even and odd games, but I didn’t have the data, and besides it would have cut my sample size in two. In some cases, teams change strategies or personnel, as was the case for the aforementioned Penguins who completely changed their approach when Dan Bylsma became coach and are now the NHL’s #2 shot differential team. For the most part, however, comparing year-on-year is valid and if there is any sustainability it should show up there.
The good news: the year-over-year correlation coefficient is 0.298, which means that shot quality will regress 70% year-over-year. The actual sustainability is a bit higher, for all the reasons mentioned above, but this is a good ballpark. This value is statistically significant, so sustainability does exist, as it should if my above analysis (that shot quality is not all luck but is at least in part due to skill) is true. This conclusion is not surprising, given the earlier results we have seen on how shooting percentages are affected by the game score. If teams can modify their strategies in response to the scoreboard, why can they not modify them in response to other factors, the most important one being the skill set of their own players? We already see this on individual players: while a good shooter like Ilya Kovalchuk will take shots from anywhere, confident that he can beat the goalie, a poor shooter like Ryan Smyth will wait until he is in point-blank range to shoot.
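The regression and significance claims can be sanity-checked as follows. The number of year-over-year pairs is not stated above; the sketch assumes 60 (30 teams, two consecutive-season pairs in a three-year sample), treats the pairs as independent, and uses the standard t-test for a correlation coefficient. The 2010 Avalanche figure of 16 goals comes from the shot quality table.

```python
import math

r = 0.298      # year-over-year correlation quoted in the article
n_pairs = 60   # ASSUMPTION: 30 teams x 2 consecutive-season pairs

# Share of a team's shot quality differential expected to vanish next year.
regression_to_mean = 1 - r

# t statistic for testing whether a correlation differs from zero.
# For 58 degrees of freedom, the two-sided 5% critical value is about 2.00.
t_stat = r * math.sqrt(n_pairs - 2) / math.sqrt(1 - r * r)


def expected_next_year(this_year):
    """Naive projection: keep only the fraction r of this year's value."""
    return r * this_year


print(f"regression to the mean: {regression_to_mean:.0%}")
print(f"t statistic: {t_stat:.2f}")
print(f"2010 Avalanche, +16 this year -> {expected_next_year(16):+.1f} next year")
```

With 60 pairs the t statistic comes out near 2.4, above the 5% threshold, and the naive projection leaves the Avalanche with roughly +5 goals of shot quality rather than zero.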
The quick conclusion? Don’t expect the Avalanche’s percentages to return completely to league average next season, although obviously given the variance they could fully regress or even undershoot. Shall the betting begin?
Tom Awad is an author of Hockey Prospectus.