
A Look at Shot Quality


I wanted to take a crack at shot quality. So I looked at shots recorded by ESPN's GameCast system for the 2009-10 and 2010-11 regular seasons. Only shots and goals were considered (so missed shots were excluded). Penalty shots and empty net shots were also excluded. There were 74300 shots analyzed from 2009-10 and 73774 shots analyzed from 2010-11.

 

In these data, for each shot we know the team taking the shot, the team and the goalie defending the shot, the strength of the team taking the shot (Even, Powerplay, or Shorthanded), the type of shot (slap, snap, wrist, wraparound, deflection, backhand, tip-in), the recorded x-coordinate, and the recorded y-coordinate. As Ken Krzywicki has done (link and link), I mapped all shots to a single offensive zone.
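Mapping every shot to a single offensive zone can be sketched as below. This is a minimal illustration, not the code used for the post: it assumes coordinates in feet with center ice at the origin, which may not match the GameCast convention, and it assumes every shot was taken from the attacking half of the ice. The function name `to_single_zone` is made up for the example.

```python
def to_single_zone(x, y):
    """Rotate a shot 180 degrees about center ice when it was recorded
    on the negative-x half, so every shot attacks the positive-x net.

    Assumes (not confirmed by the post) that center ice is (0, 0).
    """
    if x < 0:
        # A 180-degree rotation flips both coordinates, which keeps the
        # shooter's left/right side of the net consistent.
        return -x, -y
    return x, y


# Two shots from mirror-image spots at opposite ends land on the same point.
print(to_single_zone(-60.0, 10.0))
print(to_single_zone(60.0, -10.0))
```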

 

 


 

 

Adjustments were made for shots taken at Madison Square Garden. I did this by mapping the cumulative shot distribution of the originally recorded x-coordinates at MSG to the cumulative shot distribution of all shots not taken at MSG. (Formally it is a probability integral transform, but nobody knows what I mean when I write that.) I did the same for the y-coordinates.
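The MSG adjustment amounts to quantile mapping: find where a coordinate sits in the MSG distribution, then read off the value at that same quantile in the non-MSG distribution. The sketch below shows the idea on empirical distributions using only the standard library. The function name and the handling of ties and rounding are assumptions for illustration; the post does not say how those details were handled.

```python
from bisect import bisect_right


def quantile_map(values, reference):
    """Map each value to the reference value at the same empirical quantile."""
    src = sorted(values)
    ref = sorted(reference)
    n, m = len(src), len(ref)
    mapped = []
    for v in values:
        # Rank of v in its own sample: how many source values are <= v.
        rank = bisect_right(src, v)
        # Index of the matching quantile in the reference sample,
        # ceil(rank * m / n) - 1, computed in integers to avoid rounding.
        idx = (rank * m + n - 1) // n - 1
        mapped.append(ref[idx])
    return mapped


# Toy numbers only: "MSG" x-coordinates rescaled onto a wider reference set.
msg_x = [1, 2, 3, 4]
other_x = [10, 20, 30, 40, 50, 60, 70, 80]
print(quantile_map(msg_x, other_x))
```

The same function would be applied separately to the y-coordinates.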

 

I then followed the same methodology for getting at the average shot quality as I did for the Defense Independent Goalie Rating (DIGR) that I developed previously. Here is the link to that paper:

DIGR Paper

and to a media article with some critiques of that paper:

Article

 

The basic idea is that I fit a nonparametric smooth surface to the ice with the response being a probability of a goal like the graph below.  (This particular graph is for Martin Brodeur for slap shots taken at even strength during the 2009-10 regular season.)  Red indicates high probability of a goal, while blue indicates lower probability of a goal.    

Shotprobmap_medium
I fit such a model for each type of shot and for each shooting team strength, so that there are 21 total maps/models (7 shot types times 3 strengths). In the DIGR paper I made a map for each goalie and then predicted from those maps the expected save percentage if each goalie faced every shot in the league for a given season. Here I created a map for the league (based on all of the shots in a given year) and then predicted the probability of a goal for a given team's set of shots. There is a nice mathematical framework in the DIGR paper that justifies how to do this. The nonparametric part is that our probability model does not assume a particular form (e.g. linear, log-linear or quadratic) for the relationship between probability of a goal and the x- and y-coordinates.  The data determine the form of the relationship.  Having run each team's shots against through our model, I average these predicted values to get an average predicted shot probability or shot quality against. Call this SQA.
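To make the SQA calculation concrete, here is a minimal sketch for a single shot type and strength. It swaps in a simple Gaussian kernel smoother as a stand-in for the nonparametric surface; that is not necessarily the smoother used in the DIGR paper. The function names, the `bandwidth` parameter and the toy shots are all assumptions for illustration.

```python
import math


def smooth_goal_prob(x, y, league_shots, bandwidth=10.0):
    """Kernel-weighted share of goals among league shots near (x, y).

    league_shots is a list of (x, y, is_goal) tuples with is_goal in {0, 1}.
    """
    num = 0.0
    den = 0.0
    for sx, sy, is_goal in league_shots:
        d2 = (sx - x) ** 2 + (sy - y) ** 2
        w = math.exp(-d2 / (2.0 * bandwidth ** 2))  # Gaussian weight
        num += w * is_goal
        den += w
    # If every weight underflows to zero there is no nearby information.
    return num / den if den > 0 else 0.0


def sqa(team_shots_against, league_shots, bandwidth=10.0):
    """Average predicted goal probability over the shots a team faced."""
    probs = [smooth_goal_prob(x, y, league_shots, bandwidth)
             for x, y in team_shots_against]
    return sum(probs) / len(probs)


# Toy league: shots from the slot score half the time, point shots never.
league = [(80, 0, 1), (80, 0, 0), (40, 20, 0), (40, 20, 0)]
print(sqa([(80, 0), (40, 20)], league, bandwidth=1.0))
```

In the full analysis this would be repeated for each of the 21 shot type and strength combinations, using the league map that matches each shot.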

 

I calculated the SQA for both the 2009-10 and 2010-11 seasons with a league average mapping that was different for each year. The following graph shows these values for all 30 teams for shots taken at even strength. It is clear from this graph that there is a relationship from year to year between these two SQA values at the team level. The correlation here is 0.75.

 

Shotqual09vshotqual10-even_medium

The correlation, 0.75, is pretty high.  Though it might be hard to read, each team's location is plotted with an abbreviation for that team.  Minnesota (min) sits at the upper right of this graph and Chicago (chi) actually improved over last year. Tampa Bay (tam) was also a big mover here. Some of these teams are not surprising: New Jersey (njd) and Boston (bos) on the high end, the Islanders (nyi) and St. Louis (stl) on the lower end.  Others are surprising: Dallas (dal) and Calgary (cgy).  This correlation does seem high to me.  So there are two possible problem areas: a problem with the analysis or a problem with the data.
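For reference, the year-to-year figure is an ordinary Pearson correlation across the 30 teams. A minimal sketch follows, with made-up numbers in place of the actual team SQA values, which are not listed in the post.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Placeholder values only; one entry per team in each season.
sqa_2009_10 = [0.910, 0.915, 0.920, 0.925]
sqa_2010_11 = [0.912, 0.914, 0.923, 0.921]
print(pearson_r(sqa_2009_10, sqa_2010_11))
```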

 

Let's start with the analysis.  I went back and looked at Ken Krzywicki's model from his analysis of SQA for the 2009-10 season. I was able to reconstruct some of the additional variables that Ken used by creating angle values and distance values from the x- and y-coordinates. I don't have whether or not a shot was a rebound in my data. This is a weakness, but I was able to fit the regression without the rebound indicator and without the indicator of whether or not the shot came after a giveaway by the opposing team. So my logistic model was not identical to Ken's, but it was of the same basic form and it had 4 of the 6 predictors that Ken's model had. My logistic model had distance, absolute value of angle, strength of the shooting team and shot type as predictors. Having fit that model, I did the same thing as I had done for the nonparametric one: I predicted the probability of a goal for each team's shots based upon a league average model of shot probability.  Below is a graph that shows the SQA(logit) for each team for the two seasons in question. There are some changes in how the teams perform between the SQA and the SQA(logit). That is to be expected as we have different models. (Personally, I think that there is a strong case for the SQA being a superior model to the SQA(logit), and when I get a chance this evaluation can be done empirically, e.g. by looking at residual deviance.)
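The distance and angle predictors can be derived from the coordinates roughly as follows. This is a sketch under assumptions: the net location `NET_X, NET_Y` assumes coordinates in feet with center ice at the origin and shots already mapped to one zone, which may not match the GameCast coordinate system. No fitted coefficients are shown because the post does not report them; `goal_probability` only shows the logistic form.

```python
import math

# Assumed net location (feet, center ice at the origin); not from the post.
NET_X, NET_Y = 89.0, 0.0


def shot_features(x, y):
    """Return (distance to the net, absolute shot angle in degrees)."""
    dx = NET_X - x
    dy = y - NET_Y
    distance = math.hypot(dx, dy)
    # 0 degrees is straight out from the net; the sign is dropped because
    # the model uses the absolute value of the angle.
    abs_angle = abs(math.degrees(math.atan2(dy, dx)))
    return distance, abs_angle


def goal_probability(linear_predictor):
    """Logistic link: turn a fitted linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


print(shot_features(59.0, 30.0))
```

The linear predictor would combine these two features with indicator terms for strength and shot type.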

 

Shotqual09v10logit_medium

 

The correlation between SQA(logit) for the 2009-10 and the 2010-11 regular seasons was 0.82.  Many of the teams are in roughly the same region of this graph as they were for the SQA.

 

So it is not the model.  It could be the data.

 

Well, ESPN GameCast totals and NHL Play-by-Play totals are different and the NHL ones are to be taken as the ground truth, no doubt. So some of the shots could be fishy. Previously I had looked at distributions of x- and y-coordinates for the 2009-10 season and only found that the MSG games were anomalous, hence the MSG correction mentioned above. That was not it. I looked at counts of shots faced by team. In the NHL data for 2009-10, the teams facing the fewest shots per game were the Blackhawks, Devils and Kings. In the GameCast data it was also the Blackhawks, Devils and Kings. Staying with that same season, the teams facing the most shots were Florida, Edmonton and Anaheim in both the GameCast data and the NHL data. Similar results held for the 2010-11 season. The number of shots faced per team was reasonable.  I've previously done other quality control checks on both of these years' worth of data when I was getting them into the proper format for analysis in the DIGR paper. The data are accurate in the sense that they represent the values that the ESPN GameCast presented.

 

So what are some of the other possibilities?

 

One. This was a fluke. The relationship between the shot quality measures for 2009-10 and 2010-11 is anomalous. An outlier.

 

Two. The metrics that we are using for shot quality are flawed. No doubt they are imperfect.  (Since they are expected values essentially, they are collapsing distributions into single values.  At some juncture looking at the shot intensity maps would be a good idea.)  There are possible predictors that are missing, such as speed of a shot, whether or not the goalie was screened, or the game score differential. This analysis distinguishes only Even, Powerplay and Shorthanded, not specifics such as 5v4 versus 5v3. That is a limitation.

 

Three. The data is correct in the files but not accurate. We know that there are issues with the humans recording the shot. The NHL produces highly imperfect data. The x- and y- coordinates are no doubt approximate at best.

 

These are three pretty important concerns, yet the results suggest a moderate to strong correlation, r=0.75, for SQA from one year to the next.  That does not suggest a fluke.  But this is testable.  (Some time in the future I'll get the 08-09 data into the proper format and add it to this analysis.)  Certainly both SQA metrics are not without their flaws.  But I think that they are generally good.  AND they come to roughly the same conclusion.  Likewise with the data.  It is incomplete.  There is measurement error.  But it is the best we currently have.

 

In the end, based upon this analysis, I think that shot quality can be impacted by a team.  I'll leave its prediction for another day.  But is that impact meaningful?  Well, we can say that about 56% (r^2) of the variability in even strength 2010-11 team SQA can be explained by knowing the 2009-10 even strength SQA.  The difference in even strength SQA between the top team and the bottom team for both years is about 1.5%.  Over about 2000 shots, which is about the number of even strength shots that a team faces, the difference from top to bottom is about 30 goals over a season.  That's 10 points in the standings.  That would seem meaningful.
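The back-of-the-envelope numbers in that paragraph can be checked in a few lines. The conversion of about 3 goals per standings point is not stated in the post; it is the rate implied by going from 30 goals to 10 points, and it is labeled as an assumption here.

```python
r = 0.75                 # year-to-year correlation of even strength SQA
r_squared = r ** 2       # share of variability explained

sqa_gap = 0.015          # top-to-bottom difference in SQA (about 1.5%)
shots_faced = 2000       # approximate even strength shots faced per season
goal_gap = sqa_gap * shots_faced

goals_per_point = 3.0    # assumption implied by the post's 30 goals -> 10 points
point_gap = goal_gap / goals_per_point

print(f"r^2 = {r_squared:.4f}")
print(f"goal gap = {goal_gap:.0f}, point gap = {point_gap:.0f}")
```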

 

Postscript:

 

Some other correlations that I found:

Between SQA Even Strength and Raw Save Pct: 2009-10 (r=0.11), 2010-11 (r=0.12).

 

Between SQA Even Strength and SQA Powerplay: 2009-10 (r=0.46), 2010-11 (r=0.49).

 

Between SQA Even Strength and Total Shots Faced: 2009-10 (r= -0.07), 2010-11 (r= -0.24).

 

Between Total Shots Faced 2009-10 and Total Shots Faced 2010-11 (r=0.35).

 

 


Comments


Interesting work

Is there a particular reason why you grab ESPN’s x & y data for shots rather than the NHL’s?

Managing Editor of On the Forecheck, SB Nation's blog covering the Nashville Predators, and HockeyGearHQ, a site devoted to news, reviews, and deals on hockey equipment and accessories. Catch me on Twitter, or join OTF on Facebook!

by Dirk Hoag on Jun 16, 2011 10:19 AM EDT

It’s easy to grab the x & y’s from the ESPN site since they are in xml. I’m pretty sure that that is also what Ken K does.

by schuckers on Jun 16, 2011 12:09 PM EDT

Care to point me in the right direction? I’m combing through the source of a game page but don’t see any appropriate XML references. If you’re willing and would prefer to share that info privately, email me at the.forechecker@gmail.com.


by Dirk Hoag on Jun 16, 2011 1:46 PM EDT

Great job

"Though we do run the risk of one of the pucks generating human-like emotions, and yearning for a better life outside of its cold, violent existence…" -Ben

by ThrashersRecaps on Jun 16, 2011 11:00 AM EDT

Very interesting stuff

One thing I would also look at is which teams made major improvements over that one year. I’m especially looking at the first graph for that.

Tampa Bay, Los Angeles and Calgary seem to have taken major steps.

For Tampa, with the coaching, management and player changes that they have made, it would seem possible that their new “system” gives up fewer quality chances than the last one. On the other hand, is it possible that the graph looks like that because they have gotten better results this year?

Another thing I was wondering, is it possible to run Shot Quality For?

by Simon Lamarche on Jun 16, 2011 12:45 PM EDT

Good stuff here, but that inter-year correlation is very large, given the nature of the statistic with which we’re dealing.

Recording bias is the obvious explanation – I’d take a second look. The scorers in MIN, T.B and N.J have overestimated shot distance in the past, and all three of those teams do well with respect to SQA.

I’d be interested in knowing if the inter-year correlation for Road SQA is lower than for Home SQA.

by JLikens on Jun 16, 2011 7:20 PM EDT

home v away

Okay, so I kept the same league average model (i.e. I didn’t make separate models for home and away) and I calculated SQA for home and away for both years, i.e. shots faced by a team at home and on the road. Correlation for away drops to r=0.59 while for home it is r=0.73. The graph has the new away SQAs.

The correlations are certainly lower than in my original post, but this does still, I think, suggest that teams have some ability to control average shot difficulty.

by schuckers on Jun 17, 2011 10:48 AM EDT

Thanks for running the numbers. Just to clarify – this is for overall SQA, and not just for even strength?

The correlation is still fairly strong, especially considering the sample has been halved in each year. I can’t think of any explanation other than the one you propose (i.e. SQA being repeatable).

by JLikens on Jun 17, 2011 2:41 PM EDT

Just even strength. Sorry that is not clear.

by schuckers on Jun 17, 2011 4:17 PM EDT

i would agree. very good work.

by Ahmad Bradshaw on Jun 17, 2011 5:55 PM EDT

The range just got a lot tighter.

I can’t tell the details on SQA on this, but it looks like the lowest for 09-10 is about .917. (Carolina). In the above chart, it looks like the lowest for 09-10 was about .910 (Chicago).

Initially you had mentioned that the variation would lead to approximately 10 points over a season between top and bottom . Now that the range has gotten smaller and the predictive value has decreased as well, what is the corresponding difference in points between best and worst?

by Bourque77 on Jun 17, 2011 3:33 PM EDT

Sorry – the .910 is from the charts in the article and the .917 is from the one in the comments. That may not have been clear in my initial post.

by Bourque77 on Jun 17, 2011 3:35 PM EDT

The range did get tighter, so it means that the difference is probably about 8 to 9 points now.

by schuckers on Jun 21, 2011 2:04 PM EDT

Wow, you did a lot of leg work, and I applaud your effort. I think your research is quite a stepping stone to some important points. But it also brings up some questions…
How do you rectify this with Sh% regression to the mean? Also, have you run inter-season correlation? I think this may be even more powerful/convincing. Really great work overall though. Love to hear what Gabe/ Derek say about this.

by SnarkSD on Jun 17, 2011 12:21 AM EDT

yes, outstanding work. very impressed

by Ahmad Bradshaw on Jun 17, 2011 5:56 PM EDT
