
Slight Statistical Improvement: Corsi Qualcomp

Sunny Mehta suggested a very good change to the Quality of Competition algorithm. Instead of using relative +/- (which counts only the goals scored while a player is on or off the ice), it would be better if Qualcomp used the total shot volume while players were on the ice. Why? We already know that shot differential (aka Corsi) is a better predictor of future goal differential than goal differential itself is. Corsi also includes a much higher number of events than simple +/- does - approximately 25x.
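A minimal sketch of the idea, assuming Qualcomp is an ice-time-weighted average of each opponent's rating (the post does not spell out the formula). The function name, the input shapes, and every number below are made up for illustration; the only change between the two versions is which rating gets averaged.

```python
def qualcomp(opponent_minutes, rating):
    """Ice-time-weighted average rating of a player's opponents.

    opponent_minutes: {opponent: minutes faced} (hypothetical input shape)
    rating: {opponent: rating}, e.g. relative +/- or relative Corsi
    """
    total = sum(opponent_minutes.values())
    if total == 0:
        return 0.0
    weighted = sum(mins * rating[opp] for opp, mins in opponent_minutes.items())
    return weighted / total


# Made-up numbers, for illustration only.
minutes_faced = {"A": 30.0, "B": 10.0}
rel_plus_minus = {"A": 0.5, "B": -0.5}
rel_corsi = {"A": 8.0, "B": -4.0}

old_qoc = qualcomp(minutes_faced, rel_plus_minus)  # goal-based version
corsi_qoc = qualcomp(minutes_faced, rel_corsi)     # shot-attempt-based version
```

Because the opponents' ratings are the only input that changes, the two versions can be compared player by player.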

The relationship between Qualcomp and Corsi Qualcomp is still very strong (the overall R^2 = 0.54), indicating that we were on the right track initially. You can see the results here - Brent Seabrook and Duncan Keith are #1 and #2. I've only done this for 5v5 for the 2009-10 season; because of the genius way I wrote my database code, I have to make the same changes nine times in order to get this into the entire site, which I'll do eventually.

I think of all the NHL teams, people out there pay closest attention to the Oilers' matchups (even when the team is awful), so perhaps someone can give an opinion as to whether the Corsi Qualcomp is an improvement over regular Qualcomp:

Player              Corsi QoC  Corsi Rank  Qualcomp  Qualcomp Rank
SHAWN HORCOFF           1.092           1     0.017              1
SHELDON SOURAY          0.965           2    -0.009              5
GILBERT BRULE           0.763           3    -0.055             16
RYAN STONE              0.717           4    -0.024              8
PATRICK O'SULLIVAN      0.677           5    -0.011              7
STEVE STAIOS            0.600           6    -0.009              5
J-F JACQUES             0.502           7    -0.002              4
TOM GILBERT             0.447           8     0.002              3
DUSTIN PENNER           0.439           9    -0.045             14
ALES HEMSKY             0.306          10     0.007              2
LADISLAV SMID           0.281          11    -0.037             11
SAM GAGNER              0.273          12    -0.047             15
LUBOMIR VISNOVSKY       0.097          13    -0.072             18
RYAN POTULNY            0.039          14    -0.034             10
DENIS GREBESHKOV        0.022          15    -0.025              9
ROBERT NILSSON          0.017          16    -0.062             17
ANDREW COGLIANO        -0.277          17    -0.040             12
JASON STRUDWICK        -0.342          18    -0.109             20
ETHAN MOREAU           -0.444          19    -0.041             13
ZACHERY STORTINI       -0.807          20    -0.092             19

There are some rather significant differences - I'm interested to hear if this works out better.
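For anyone who wants to quantify how closely the two measures agree for the Oilers alone, here is a small sketch that computes the Pearson correlation from the table. The value pairs are copied from the table; `pearson_r` is a helper written for this example, and a 20-player sample will not necessarily match the league-wide R^2 quoted earlier.

```python
from math import sqrt

# (Corsi QoC, Qualcomp) pairs copied from the Oilers table, top to bottom.
oilers = [
    (1.092, 0.017), (0.965, -0.009), (0.763, -0.055), (0.717, -0.024),
    (0.677, -0.011), (0.600, -0.009), (0.502, -0.002), (0.447, 0.002),
    (0.439, -0.045), (0.306, 0.007), (0.281, -0.037), (0.273, -0.047),
    (0.097, -0.072), (0.039, -0.034), (0.022, -0.025), (0.017, -0.062),
    (-0.277, -0.040), (-0.342, -0.109), (-0.444, -0.041), (-0.807, -0.092),
]


def pearson_r(pairs):
    """Plain Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


r = pearson_r(oilers)
r_squared = r ** 2
print(f"Oilers only: r = {r:.2f}, R^2 = {r_squared:.2f}")
```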

Comments

For once (rarely!) I’m going to disagree with you. Corsi is a better measurement than +/- for individual players because it is less noisy, and you’ll get no argument from me on that one. However, many elite offensive players, in particular like the ones on your previous list of best shooters, tend to outplay their corsi. Look at the +/- and Corsi ratings for Ovechkin, Kovalchuk, Malkin, Thornton, etc. These guys have a +/- that is out of whack with their Corsi because they generate a lot of goals on the shots that are taken with them on the ice. Kovalchuk, for example, has a Rating of 0.92 but a CorsiRel of -5.1 this year, so Corsi QualComp would imply he is below-average competition. He’s an extreme example, but the whole point of QualComp is to see who plays against the first line. Maybe I’m wrong and the extreme cases don’t matter so much.

Also, the noise factor doesn’t matter much in QualComp. While a single player’s +/- is based on only 100 goals or so, QualComp by its very nature is a thin slice of hundreds of other players, so the noise factor is less relevant. Even if one of your opponents got lucky on his +/-, the impact on your overall QualComp is negligible.

by Tom Awad on Jan 15, 2010 10:07 AM EST
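The averaging argument in the comment above can be illustrated with a quick simulation. This is a sketch under strong simplifying assumptions (every opponent weighted equally, true ratings of zero, independent Gaussian luck); real Qualcomp weights opponents by ice time, and the function name and parameters are invented for the example.

```python
import random
import statistics


def qualcomp_noise_sd(n_opponents, noise_sd=1.0, trials=2000, seed=0):
    """Spread of an equal-weight average of n independent noisy ratings.

    Assumes each opponent's true rating is 0 and the only variation is luck.
    """
    rng = random.Random(seed)
    averages = [
        sum(rng.gauss(0.0, noise_sd) for _ in range(n_opponents)) / n_opponents
        for _ in range(trials)
    ]
    return statistics.pstdev(averages)


sd_single = qualcomp_noise_sd(1)    # one opponent: the full noise comes through
sd_many = qualcomp_noise_sd(200)    # 200 opponents: roughly noise_sd / sqrt(200)
```

Under these assumptions the luck in any single opponent's rating shrinks by about the square root of the number of opponents averaged.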

“many elite offensive players, in particular like the ones on your previous list of best shooters, tend to outplay their corsi”

Define “many”, “elite”, and “outplay.” It sounds like you’re basically talking about a forward’s ability to influence on-ice shooting percentage. I’m not saying it’s not possible, but I’m skeptical about its overall effect and significance in the population. To my knowledge, no one has done a comprehensive study showing a large effect. And in fact, the few quick studies done show the opposite. I think Tyler did one where he showed that the top 25 or 50 players in Sh% in time x as a group regress almost completely back to the mean in time y.

Most of the times I’ve personally looked into it I’ve found the same.

Imo the point of all this is to figure out actual value. You bring up Kovalchuk, so let’s use him as an example. According to your GVT, he’s the 11th best forward in the league. I assume that’s in large part due to the number of goals he scores. But if we look beyond the counting numbers, here’s what I see…

ASSIGNMENTS: I don’t watch a ton of Thrasher games, so perhaps someone else can comment about how Kovy is specifically used. I assume he captures the attention of other teams’ top defensemen, and Gabe’s CorsiQoC has him third on the Thrashers. But he’s way down the list in terms of all teams. A quick look at the TOI matchups show that in his last home game against the Rangers, he did not play against Gaborik. The home game against the Caps before that he did play a bit against Ovechkin, though the Thrash got blown out 8-1 so we may have to take any of those stats with a grain of salt. The road game before that against the Pens he was matched against Jordan Staal and co, though I assume that was Bylsma’s doing. Nevertheless, let’s assume he has typical “first line” matchups.

TERRITORIAL: Kovalchuk’s on-ice Corsi/60 checks in at -5.84, good for 8th amongst forwards on his own team. Take from it what you will, to me it seems to signal that, whatever his matchups are, he’s spending a ton more time in the bad end of the ice than in the good. Further, he’s 9th on his team amongst forwards in terms of defensive zone faceoffs, so it’s not like he’s being forced to start a lot of his shifts in the tough end of the rink.

SCORING: Most would probably say “who cares though, the guy scores goals, and that’s what matters!”. And by the counting numbers, indeed, Kovy leads his team in Goals/60, Primary Assists/60, and Points/60 – and by a fairly wide margin. He’s also 5th in the NHL in Pts/60. But explain to me then how it’s also possible that such an “elite” player can also have a Goals Against/60 that’s third worst on his team and a plus-minus/60 that’s 126th in the league (!) amongst forwards.

I don’t know, man. Imo all signs point to Ilya Kovalchuk being pretty heavily overrated. If I were an NHL GM I certainly wouldn’t value him as “elite”, let alone the 11th best forward in the league. You could make a case for him not having good teammates, but we don’t really see this effect with other elite players on bad teams throughout the league (heck we don’t even see it with some of his teammates), and this is not something confined to one season either.

There is no catch-all stat, but imo shot attempts are still the bread and butter of what we have now (keeping in mind assignments, of course). Ilya Kovalchuk is the outlier of all outliers in terms of shooting percentage, and even then it’s not enough to make him more than a perhaps-slightly-above-average forward.

by sunnymehta.com on Jan 15, 2010 4:46 PM EST

On-ice shooting percentage regresses about 75% to the mean for players who switch teams year-to-year, so there’s some skill at work there (Tyler found a very little bit of it in shooting percentage). I think Corsi captures 50-70% of what we’re trying to figure out, so it’s a big improvement on relative +/-.

by Hawerchuk on Jan 15, 2010 5:36 PM EST
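To make the regression figure concrete, here is a minimal sketch of what "regresses about 75% to the mean" would imply for a projection. The observed and league-average values are made-up numbers, and `project` is a hypothetical helper, not anything from the site's code.

```python
def project(observed, league_mean, regression=0.75):
    """Shrink an observed rate toward the league mean by the given fraction."""
    return league_mean + (1.0 - regression) * (observed - league_mean)


# Hypothetical: 11% on-ice shooting observed, 8% league average.
next_year = project(0.11, 0.08)  # keeps only 25% of the gap above average
```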

yeah I guess I was (mistakenly?) thinking that individual shooting percentage would correlate highly with on-ice shooting percentage

by sunnymehta.com on Jan 16, 2010 4:11 PM EST

Hi Sunny,

Just to clear up, I mostly agree with you that Kovalchuk is overrated. My point was simply that he’s surely not below-average competition, which Corsi would imply.

As for who can outplay their Corsi, here is a list of the players whose goals for on ice have most exceeded their shots on ice, over the last 3 seasons (on the defensive side, it’s mostly a goalie issue):
Heatley +56
Ovechkin +50
Kovalchuk +44
Backstrom (WSH) +42
Datsyuk +42
Iginla +38
Arnott +38
Getzlaf +37

then Perry, Spezza, Robidas (playing in front of Ribeiro), Mike Green, Alfredsson, Tobias Enstrom (Kovalchuk).

It’s not an exhaustive study, but the names don’t quite seem to be pulled out of a hat, do they? I’m not saying this is true for all players. For 95% of players, Corsi is a better measure than +/-. I don’t know if the benefit of “better” characterizing the elite outweighs the benefit of better characterizing the masses, but it should at least be considered.

by Tom Awad on Jan 15, 2010 10:26 PM EST

There are also plenty of times when +/- doesn’t do nearly as good a job as Corsi even for very good players. For the Oilers, Shawn Horcoff is rated as one of the easiest forwards on the team to play against and that’s likely just not true. I think people just need to be aware that QC and especially QT are more estimates than they are gospel.

by Scott Reynolds on Jan 16, 2010 12:57 PM EST

Also, I think there’s a bug in your algorithm. The Blackhawks in general seem to all have very high Corsi QualComp, which as far as I know is impossible, and in fact the league average is far above 0. Team and league averages are supposed to be close to 0, right?

by Tom Awad on Jan 15, 2010 10:13 AM EST

The Hawks have 11 guys above zero, one guy at zero, and 6 guys below. That doesn’t seem like it’s out-of-whack for half a season. If you just use +/- Qualcomp, you get much weirder results – an entire team will have negative Qualcomp over the course of a season.

Your point about Kovalchuk et al is a good one. Part of that could be resolved with some shot location information, but I don’t know how to capture talent that allows a player to out-perform his Corsi. Maybe scoring chances +/- is a better metric.

by Hawerchuk on Jan 15, 2010 11:12 AM EST

Great improvement. I agree, scoring chances might be the best indicator, but are scoring chances recorded around the league? I’d love to get my hands on a scoring chances database because it’s probably the best measure of a player. +/- has too little data, Corsi doesn’t account for shot quality, but the scoring chances porridge is juuuuust right.

by ThrashersRecaps on Jan 15, 2010 2:55 PM EST

Well, if every team’s blogger tracked scoring chances (hint, hint) then we’d know :)

by Hawerchuk on Jan 15, 2010 4:38 PM EST

Would this justification also apply to QoT?

by mepex on Jan 15, 2010 11:28 AM EST

yeah, should be the same.

by Hawerchuk on Jan 15, 2010 12:18 PM EST

From having watched the Oilers this season the guy who should absolutely be at the top of the list is Shawn Horcoff and, unsurprisingly, he is in both systems. Beyond that there have been a variety of players that have been getting “tougher” minutes but more often because of the choices of the opposing coach with Pat Quinn mostly rolling lines. As such, I’d expect guys like Penner, Gagner and Hemsky to rank higher. Both Penner and Gagner get a bit of a bump using the Corsi metric while Hemsky takes a big fall. All of that said, at least for the forwards, the “top eight” and the “bottom five” look right to me with Nilsson poised to make some gains. On defence the picture is quite similar with Quinn mostly rolling the pairs one after the other without specifically looking for matchups. I’m a bit surprised that the difference between the six guys is as big as it is but certainly Strudwick belongs on the bottom.

by Scott Reynolds on Jan 15, 2010 12:45 PM EST

Not all shots are created equally, so I like your idea of including shot quality, if that’s at all possible. That ought to address Tom’s concern.

As long as we’re opening up the hood, how about breaking down QualComp into an offensive and defensive component?

I’d also find a home/road breakdown interesting, since your coach gets to match lines in home situations, but it’s up to the opposing coach in road situations.

by Rob Vollman on Jan 15, 2010 1:31 PM EST

I would like to see one Qual Comp for Corsi, one for Shot Quality, and one that combines them (like an estimated goals/60 or something). The reason is that SQ has some very big flaws but is still a very useful stat. Because of its flaws, I’d like to have the option of separating or including it.

by Moneypuck on Jan 15, 2010 1:35 PM EST

Staios’ qualcomp rank should be 6th

by Tommelot on Jan 15, 2010 1:38 PM EST

he’s tied with Souray

by Hawerchuk on Jan 15, 2010 2:13 PM EST

Islanders Corsi QualComp

The Islanders’ Corsi QualComp appears to be superior to their regular QualComp. For example, the Isles’ typical 4th line, an atrocious line of Richard Park, Tim Jackman, and Nate Thompson, is considered by Scott Gordon (incorrectly) to be his best pure defensive line, and thus it starts the game and is on the ice when the puck is in the D zone vs good forwards.

But for the most part, the ratings appear similar really.

by garik16 on Jan 15, 2010 5:03 PM EST

Gabe

Does the behindthenet.ca data still include empty net goals? Those really do add a lot of randomness to some of the stats imo.

Just generally, Corsi is a terrific proxy for scoring chance +/- at evens. Of course both are strongly indicative of a player’s ability to drive the play forward, i.e. to create meaningful possession and meaningful territorial advantage.

Corsi and scoring chances results for players are mired in context, and the ability to finish (shooting%) is soaked in luck. Like Keenan says, you need players who outchance and have a historical ability to finish. Those are close to his exact words by my memory.

If you can outchance at a good clip, you’ll have a good NHL career. But if you can’t finish, it will probably be as a checker. If you can bury your chances … you’ll probably get to play with guys who are good players, guys who drive scoring chance rates. And it’s probably worth you taking some risks with possession (cherry picking or trying to beat guys at the blue lines). If you give too much back the other way though … you’d better be able to finish to make up for it.

And if you’re that kind of player (prototypical eastern conference star player) who gives a lot back the other way (which has only a little to do with being good defensively, of course), your coach probably isn’t going to be too interested in running you out there against guys who can outchance AND finish their chances (the prototypical western conference star player), because it probably won’t end well.

by Vic Ferrari on Jan 15, 2010 7:02 PM EST

Vic,

Yeah, it still has ENG in it. Fixing that in my unwieldy scripts has now been on my to-do list for years.

Anyways, I think Sunny had a good idea here.

by Hawerchuk on Jan 15, 2010 7:25 PM EST

Tom

Corsi is useful because it is a measure of territorial advantage. Sample size has little to do with it. Also it removes much of the scorer bias (see corsi vs home-shots vs road-shots). It is a terrific proxy for scoring chances, nothing more or less.

So take just blocked and missed shots (on-ice). That will have a powerful positive relationship to scoring chances at EV. I haven’t checked with all the guys counting chances, but that will be the case, I’m sure.

Just using blocked and missed shots (on-ice) for PK results shows a powerful negative relationship to scoring chances. The opposite of what one would expect if sample size were Corsi’s most virtuous quality.

by Vic Ferrari on Jan 15, 2010 7:14 PM EST

That is an awesome result. Why the negative relationship? Would the theory be that the total number of shots a team can generate on a PP is more or less fixed, and therefore the more shots are blocked/missed, the fewer are left to go to the net?

by Tom Awad on Jan 15, 2010 10:31 PM EST

We’ll see when there is more data. For the Oilers so far the relationship between shots against on the PK and scoring chances against is overwhelmingly high. If shots against beget scoring chances against beget goals against … even modelling that, I don’t think we’d quite end up there, so I doubt we’ll climb any higher.

At even strength the territorial advantage is a principal driver of results. That’s what corsi seems to do a better job of measuring than shots or zone time.

On the PK territorial advantage is a bit of a moot point. I suppose no matter how good you are, you’ll still end up playing the vast majority of your PK shift in your own end. Get in the shooting and passing lanes, that’s the lesson I think. 5v4 shots against rate seems to be the single measure of PK ability.

Again, I’ll wait to see the scoring chance data for all five teams being followed, but at this point it appears to be the case. The ability of players to affect shot quality, relative to teammates, seems like a vanishingly small quality. Same difference as EV that way, so I guess we shouldn’t be shocked.

by Vic Ferrari on Jan 16, 2010 12:28 AM EST

that should read “ability of players to affect on-ice shots AGAINST quality” appears to be tiny.

The population of forwards in the league show a significant difference in ability to affect EV on-ice shots FOR shooting%, though not shot quality (chances/corsi). At least not for the Oilers last season. So presumably the same applies to PP guys. I’ll guess that within the population there is a fairly wide spread in terms of the ability to affect on-ice shooting% rates on the PP, and a very narrow range of ability to create more scoring chances per shot. That’s just a guess, but that seems most likely. It’s just hockey. We’ll see.

by Vic Ferrari on Jan 16, 2010 12:35 AM EST

The population of forwards in the league show a significant difference in ability to affect EV on-ice shots FOR shooting%, though not shot quality (chances/corsi).

Vic, can you explain what you mean here? Are you saying that certain forwards can create more goals per shot attempt than other forwards, but they don’t get more scoring chances per shot attempt? I’m having a little trouble wrapping my mind around that.

The way I think of it, perhaps incorrectly, is that almost all shot attempts are scoring chances in the sense that they have some p > 0 of going in. Different shot attempts have a higher p-value than others. How wide the spread of p-values is, and how much control different forwards have over that spread, that’s what we’re talking about, right? But doesn’t G/Corsi (i.e. Corsi Shooting%) pretty much capture all of that?

by sunnymehta.com on Jan 16, 2010 4:03 PM EST

Like you, I think that territorial advantage is paramount, reflected by corsi. And I don’t think that the “p” is being affected by the players on the ice that much. The bayesian prior that creates the best fit to affecting chances per corsi … it’s pretty much a vertical line, about the same as a player’s ability to score more game winning goals than chance expects.

It looks like it comes down to the shooter. A particular shot fired at the net with a screen in front may have a probability of going into the net about 5% of the time in parallel universes … but that goes up if Lidstrom is taking the shot and Holmstrom is in front, and goes down if Smid is taking the shot and Jacques is in front.

Last year Zach Stortini and Strudwick averaged far more on-ice chances per shot than Hemsky or Gagner. Does that mean they’re more skilled? The correlation of chances-per-shot, for random 35 games to another random 35 games … essentially nil. And as I say, the alternative method yields the same result. So the only sensible conclusion is that it happened due to luck alone.

Again though, that’s just one season for one team, but at the moment, to my mind … the arrows are pointing strongly in one direction.

So I suspect that on-ice shooting% goes up almost exactly as much as you would expect it to, knowing the historical finishing ability of the guys on the ice (career shooting% nicked towards average a bit). Not much else will be left. I haven’t run those numbers or anything, but that’s surely the way it will roll out.

You disagree?

You disagree.

by Vic Ferrari on Jan 16, 2010 4:37 PM EST
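The split-half check described in the comment above (one random set of 35 games against another) might be sketched as follows. The input shape, the function name, and the synthetic data are all assumptions made for the example; the fake players are given identical true chance rates, so any correlation between halves here is luck by construction.

```python
import random
from math import sqrt


def split_half_r(games_by_player, half_size=35, seed=0):
    """Correlate chances-per-shot in one random set of games with another.

    games_by_player: {player: [(chances, shots), ...]}, one tuple per game
    (hypothetical input shape; each player needs at least 2 * half_size games).
    """
    rng = random.Random(seed)
    first, second = [], []
    for games in games_by_player.values():
        sample = rng.sample(games, 2 * half_size)
        for half, out in ((sample[:half_size], first), (sample[half_size:], second)):
            chances = sum(c for c, _ in half)
            shots = sum(s for _, s in half)
            out.append(chances / shots)
    n = len(first)
    mean_x, mean_y = sum(first) / n, sum(second) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(first, second))
    var_x = sum((x - mean_x) ** 2 for x in first)
    var_y = sum((y - mean_y) ** 2 for y in second)
    return cov / sqrt(var_x * var_y)


# Synthetic data: 20 players, 82 games each, identical 30% chance rate per shot.
data_rng = random.Random(1)
fake = {}
for player in range(20):
    games = []
    for _ in range(82):
        shots = data_rng.randint(8, 20)
        chances = sum(data_rng.random() < 0.3 for _ in range(shots))
        games.append((chances, shots))
    fake[player] = games

r_half = split_half_r(fake)
```

With real game logs in place of `fake`, a value near zero would be consistent with the "essentially nil" result reported in the comment.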

i agree with basically all of that

by sunnymehta.com on Jan 16, 2010 5:11 PM EST

Also sunny, take the on-ice shooting% from Gabe’s site (the missed shot inclusion washes out some of the scorer bias, but loses some ability as well, and the empty netter can really bugger it up for some guys … still, near enough). Grab them for guys who were on the ice for a decent number of shots-for.

Use a beta distribution with alpha=K*sp and beta=K*(1-sp).

Guesstimate K as 300 to start, and use that as your ability distribution for this group of players. Run a simulated season and see how the actual results compare to your simulated season. Rinse and repeat a bunch of times. Correct K as necessary, try again.

Eventually you have a K that works best … try using that same K value and sim other seasons. At that point you’ll probably want to sit back and feel impressed with yourself :D

That’s not quite cricket, but near enough for what we’re doing now.

And if you use your new found ability distribution to calc the likelihood of on-ice shooting% for any player in any season, the less information you have on the player from previous seasons … the wider the range of likelihood. But you will have no excuse for not outguessing the messageboard fan consistently.

Makes sense, no?

by Vic Ferrari on Jan 16, 2010 4:51 PM EST
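One possible reading of the recipe in the comment above, as a rough sketch rather than the commenter's actual script. It assumes `sp` is the league-average on-ice shooting percentage (the mean of a beta distribution with those parameters), and it compares simulated and actual seasons only by the spread of shooting% across players, which is a simplification chosen for the example. The function names, the candidate K values, and the demo inputs are all made up.

```python
import random
import statistics


def simulated_spread(shots_per_player, sp, K, n_seasons=50, seed=0):
    """Average stdev of simulated on-ice shooting% across players for a given K."""
    rng = random.Random(seed)
    alpha, beta = K * sp, K * (1 - sp)
    spreads = []
    for _ in range(n_seasons):
        season = []
        for shots in shots_per_player:
            talent = rng.betavariate(alpha, beta)  # draw a true ability
            goals = sum(rng.random() < talent for _ in range(shots))  # add luck
            season.append(goals / shots)
        spreads.append(statistics.pstdev(season))
    return statistics.fmean(spreads)


def fit_K(observed_pct, shots_per_player, sp, candidates=(50, 100, 300, 1000, 3000)):
    """Pick the candidate K whose simulated spread is closest to the observed one."""
    target = statistics.pstdev(observed_pct)
    return min(
        candidates,
        key=lambda K: abs(simulated_spread(shots_per_player, sp, K) - target),
    )


# Made-up inputs: 30 players with 300 on-ice shots each, 9% league average.
shots = [300] * 30
demo_rng = random.Random(42)
observed = [sum(demo_rng.random() < 0.09 for _ in range(n)) / n for n in shots]
best_K = fit_K(observed, shots, sp=0.09)
```

A smaller K means a wider spread of true ability, so the simulated spread shrinks as K grows; a finer grid of candidates, or a comparison of the full distribution rather than only its standard deviation, would be closer to "rinse and repeat".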

thanks, man. i’m gonna soak that up.

by sunnymehta.com on Jan 16, 2010 5:21 PM EST

Yeah, definitely a good idea. Gives us some sense of who was playing against good outchancers more. That would affect their own ability to outchance (and therefore presumably affect their own chance and corsi numbers).

Playing against Lecavalier, Kovalchuk, Heatley, Spezza, Malkin etc isn’t always a bad gig, as long as you’ve got some game yourself. Not as an opportunist, but as an outchancer. They give a lot back the other way. They can punish you when they get their chances, though.

The same thing would be terrific for QUALTEAM. Part of the reason the original QUALcomp works well is because, as Tom pointed out, it’s weighed against hundreds of other players, all of whom have different levels of luck in a season. The same can’t be said for QUALteam.

So Smid being awful without Visnovsky, but mad lucky … that makes him a high quality opponent for Iginla. But it’s only a small slice of the pie for Iggy’s QUALcomp. And of course Smid, and other similarly +/- lucky players are probably evened out by other unlucky +/- guys in Iggy’s opponent list.

The same can’t be said for teammates though. Smid is a fringe NHLer at best imo, but +/- makes him look like a star this year. The Corsi and +/- QUALteams would vary wildly I suspect, and the former would have 10x the value of the latter.

On the ENG thing … it’s a killer. It can really skew all the numbers that have goals as a component, which is a bunch. And it widens and right-skews the skill distribution for shooting% in a significant way. I should change my scripts to incorporate player positions … but my scripts are all uncommented and written with a heavy hand. So I’m hoping for you to do it at BTN. Laziness begets patience for me in this case. :D
Props for all the work you do on this stuff, and the insight you bring here as well.

by Vic Ferrari on Jan 15, 2010 9:21 PM EST

D’Oh! I just realized the Quality of teammate stuff is there as well. Awesome stuff, guys.

Just going by my sense of it from watching the Oilers.

For the old system the Oilers numbers for Qualteam were crazy bad.

For the new system the Oilers numbers for Qualteam look very realistic, a completely new set of numbers.

by Vic Ferrari on Jan 16, 2010 2:29 PM EST

The older QT stuff was screwed up bad by Horcoff’s +/-. He was considered one of the worst possible teammates on the club.

by Scott Reynolds on Jan 16, 2010 3:43 PM EST

Yeah, Scott, no doubt, but a bunch of other guys that seemed way out of whack, but rarely play with Horcoff … they moved as well. Moreau and Cogliano fell from high to low, Vis fell from high to middling, Smid stayed high, and the rest of the D are now in middling range. Potulny dropped to near the bottom, Stone moved from dead last to above average …

All these things make perfect sense to all of us, I think.

by Vic Ferrari on Jan 16, 2010 4:01 PM EST
