## waP/60: A new approach to looking at points

While running a correlation study for another project, I decided to quickly measure how well many individual statistics correlate with team points percentage (P%) and with even-strength individual points per 60 minutes of ice time (P/60) inter-year, that is, from one season to the next. Looking over the list, I noticed two interesting things. First, goals (EV G/60) and first assists (EV A1/60) were well correlated with points, at 0.859 and 0.819 respectively. Second assists (EV A2/60), however, lagged well behind, limping in at 0.556. Given how much emphasis everyone around the NHL places on the number of points a player earns, I began to wonder whether anyone had thought about how much each of these stats (G/60, A1/60, A2/60) should be weighted. Traditional points treat all three the same, but even intuitively a goal scored by a player says much more about that player's quality than a second assist does. In the next few paragraphs I outline the methods I used; if you prefer, you can skip ahead to the conclusions at the bottom.

Methods

Common abbreviations used

G/60 - even-strength goals per 60 minutes of ice time

A1/60 - even-strength first assists per 60 minutes of ice time

A2/60 - even-strength second assists per 60 minutes of ice time

P/60 - even-strength points per 60 minutes of ice time

Adj P/60 = G/60 + A1/60 - even-strength goals and first assists per 60 minutes of ice time

waP/60 = 1.44 × G/60 + 1.32 × A1/60 + 0.24 × A2/60 - even-strength goals, first assists, and second assists per 60 minutes of ice time, weighted by their predictive power
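As a quick illustration of the two derived stats defined above, here is a minimal Python sketch. The function and variable names (`wap60`, `adj_p60`, `WEIGHTS`) are my own and purely illustrative; only the weights themselves come from the formula above.

```python
# Weights as published in the waP/60 formula above.
WEIGHTS = {"g60": 1.44, "a1_60": 1.32, "a2_60": 0.24}


def wap60(g60: float, a1_60: float, a2_60: float) -> float:
    """Weighted points per 60: each rate scaled by its published weight."""
    return (WEIGHTS["g60"] * g60
            + WEIGHTS["a1_60"] * a1_60
            + WEIGHTS["a2_60"] * a2_60)


def adj_p60(g60: float, a1_60: float) -> float:
    """Adj P/60: goals plus first assists, with second assists dropped."""
    return g60 + a1_60


# Sidney Crosby's 2007-2008 rates, taken from the table later in the post.
crosby = wap60(1.41, 1.49, 0.47)
print(round(crosby, 2))
```

With the rounded rates this comes out at about 4.11, against 4.106 in the table; the small gap is presumably because the table was computed from unrounded inputs.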

I began with the hypothesis that totaling goals and first assists, without second assists, would be a better predictor of future points than totaling goals, first assists, and second assists. I knew that split-half season reliability (à la JLikens, using even- and odd-numbered games to correlate two or more variables) would be the best method, but with limited time I decided to use inter-year data from the 2006-2007 season through this year (2010-2011), with a minimum of 20 games played. This came to 2040 player-seasons of data, which I thought would be adequate. The sample may skew the results, since players capable of four consecutive years of 20+ games played are probably above-average NHL players.
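The inter-year setup described above can be sketched roughly as follows. This is not the spreadsheet actually used for the study; the row layout, the key names (`player`, `season`, `gp`) and the toy numbers are all made up to show the mechanics of pairing a stat in one season with points in the next, subject to the 20-game minimum.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def inter_year_pairs(rows, stat, target, min_gp=20):
    """Pair each player's `stat` in season t with `target` in season t + 1.

    Seasons with fewer than `min_gp` games played are dropped first, so a
    pair only exists when both adjacent seasons qualify.
    """
    kept = {(r["player"], r["season"]): r for r in rows if r["gp"] >= min_gp}
    xs, ys = [], []
    for (player, season), row in kept.items():
        following = kept.get((player, season + 1))
        if following is not None:
            xs.append(row[stat])
            ys.append(following[target])
    return xs, ys


# Made-up toy rows (season = starting year), only to show the mechanics.
rows = [
    {"player": "A", "season": 2006, "gp": 82, "adj_p60": 1.0, "p60": 1.5},
    {"player": "A", "season": 2007, "gp": 80, "adj_p60": 1.2, "p60": 2.0},
    {"player": "B", "season": 2006, "gp": 60, "adj_p60": 2.0, "p60": 2.4},
    {"player": "B", "season": 2007, "gp": 75, "adj_p60": 1.8, "p60": 3.0},
    {"player": "C", "season": 2006, "gp": 70, "adj_p60": 1.5, "p60": 2.1},
    {"player": "C", "season": 2007, "gp": 66, "adj_p60": 1.4, "p60": 2.2},
    {"player": "D", "season": 2006, "gp": 74, "adj_p60": 1.1, "p60": 1.6},
    {"player": "D", "season": 2007, "gp": 12, "adj_p60": 0.9, "p60": 1.1},  # under 20 GP
]

xs, ys = inter_year_pairs(rows, stat="adj_p60", target="p60")
r = pearson_r(xs, ys)
print(len(xs), round(r, 3))
```

A split-half version would look the same, except that the two halves would be a player's even- and odd-numbered games within one season rather than two adjacent seasons.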

My first thought was to throw out second assists completely and see how the result correlated with the previous and next year's points. As became obvious when I looked at the data, it was hard to compare this new stat, which I call Adj P/60 (adjusted by subtracting second assists), with points, because dropping second assists reduces the totals. I therefore needed a fudge factor to put the two on the same scale, and at the same time I decided to run a least-squares regression with second assists thrown back in to see how much variance they accounted for. This gave me a new look at the data, with some interesting results, shown below.

Regression Statistics (predictors: G/60, A1/60, A2/60)

| Statistic | Value |
|---|---|
| Multiple R | 0.761289354 |
| R Square | 0.57956148 |
| Adjusted R Square | 0.578941973 |
| Standard Error | 0.452955371 |
| Observations | 2040 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 3 | 575.818501 | 191.9395003 | 935.5209835 | 0 |
| Residual | 2036 | 417.7232041 | 0.205168568 | | |
| Total | 2039 | 993.5417051 | | | |

Coefficients

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 0.462361512 | 0.024887892 | 18.57776918 | 2.72626E-71 | 0.413553124 | 0.511169899 |
| G/60 | 0.866300528 | 0.03124301 | 27.72781923 | 7.8447E-144 | 0.80502893 | 0.927572127 |
| A1/60 | 0.819312773 | 0.041444731 | 19.76880413 | 1.03059E-79 | 0.738034275 | 0.900591272 |
| A2/60 | 0.201343384 | 0.055619987 | 3.619982582 | 0.000301818 | 0.092265369 | 0.3104214 |

What I found really interesting is the coefficient column, which describes, in general terms, how much each variable influences points. If we ran the same regression on same-year (intra-year) data, the coefficient of each variable (goals, first assists, second assists) would equal 1, because every goal or assist you record increases your points by 1. Here, however, that simply isn't the case: goals and first assists are much more predictive of points than second assists.
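To make the contrast concrete, here is a small sketch that plugs the fitted coefficients into a prediction function and sets it next to the same-season identity. It assumes the dependent variable in the regression is a player's P/60 in an adjacent season, which is how I read the setup; the function names are illustrative only.

```python
# Intercept and coefficients copied from the three-variable regression output.
INTERCEPT = 0.462361512
COEF = {"g60": 0.866300528, "a1_60": 0.819312773, "a2_60": 0.201343384}


def fitted_p60(g60: float, a1_60: float, a2_60: float) -> float:
    """P/60 in the adjacent season, as implied by the fitted line."""
    return (INTERCEPT
            + COEF["g60"] * g60
            + COEF["a1_60"] * a1_60
            + COEF["a2_60"] * a2_60)


def same_season_p60(g60: float, a1_60: float, a2_60: float) -> float:
    """Within a single season every coefficient is exactly 1 by definition."""
    return g60 + a1_60 + a2_60


# How much the fitted P/60 moves for an extra 0.10 per 60 in each category.
extra = 0.10
gain = {name: coef * extra for name, coef in COEF.items()}

# Example input: Sidney Crosby's 2007-2008 rates from the table further down.
print(round(fitted_p60(1.41, 1.49, 0.47), 2))
print({name: round(value, 3) for name, value in gain.items()})
```

In this fit, an extra goal per 60 moves the fitted value a little more than four times as far as an extra second assist per 60 does (0.866 against 0.201).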

Following this, I decided to come up with a different stat. Instead of throwing out second assists completely, I would weight each variable to give the best R^2 against points. Thus waP/60 (weight-adjusted points per 60 minutes of ice time) was born. Below are the regression statistics for its predictive power; for completeness I have also included Adj P/60 and points itself.

Regression Statistics (predictor: waP/60)

| Statistic | Value |
|---|---|
| Multiple R | 0.761146137 |
| R Square | 0.579343443 |
| Adjusted R Square | 0.579137036 |
| Standard Error | 0.452850439 |
| Observations | 2040 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 575.6018717 | 575.6018717 | 2806.807394 | 0 |
| Residual | 2038 | 417.9398334 | 0.20507352 | | |
| Total | 2039 | 993.5417051 | | | |

Coefficients

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 0.477112683 | 0.020184502 | 23.63757518 | 2.2774E-109 | 0.437528277 | 0.516697089 |
| waP/60 | 0.614606873 | 0.011600885 | 52.979311 | 0 | 0.591856045 | 0.6373577 |

Regression Statistics (predictor: Adj P/60)

| Statistic | Value |
|---|---|
| Multiple R | 0.759490712 |
| R Square | 0.576826141 |
| Adjusted R Square | 0.576618499 |
| Standard Error | 0.454203396 |
| Observations | 2040 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 573.1008278 | 573.1008278 | 2777.987466 | 0 |
| Residual | 2038 | 420.4408773 | 0.206300725 | | |
| Total | 2039 | 993.5417051 | | | |

Coefficients

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 0.514341646 | 0.019667806 | 26.15144984 | 3.0988E-130 | 0.475770548 | 0.552912745 |
| Adj P/60 | 0.867287228 | 0.016454997 | 52.70661691 | 0 | 0.835016861 | 0.899557595 |

Regression Statistics (predictor: P/60)

| Statistic | Value |
|---|---|
| Multiple R | 0.746660459 |
| R Square | 0.55750184 |
| Adjusted R Square | 0.557284717 |
| Standard Error | 0.464458265 |
| Observations | 2040 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 1 | 553.901329 | 553.901329 | 2567.668872 | 0 |
| Residual | 2038 | 439.6403761 | 0.21572148 | | |
| Total | 2039 | 993.5417051 | | | |

Coefficients

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 0.355996698 | 0.023118934 | 15.3984912 | 1.0701E-50 | 0.310657494 | 0.401335902 |
| P/60 | 0.746660459 | 0.014735119 | 50.67217059 | 0 | 0.717762994 | 0.775557923 |
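For anyone who wants to check the four outputs against each other, the adjusted R^2 figures can be recomputed from R^2, the number of observations, and the number of predictors. The sketch below uses the standard adjusted R^2 formula; the model labels and the `adjusted_r2` helper are my own naming, and the inputs are copied from the regression outputs above.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for n observations and k predictors plus an intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


N = 2040  # observations reported in every regression above

# (R^2, number of predictors) copied from the four regression outputs.
models = {
    "G/60 + A1/60 + A2/60": (0.57956148, 3),
    "waP/60": (0.579343443, 1),
    "Adj P/60": (0.576826141, 1),
    "P/60": (0.55750184, 1),
}

adj = {name: adjusted_r2(r2, N, k) for name, (r2, k) in models.items()}
best = max(adj, key=adj.get)

for name, value in sorted(adj.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:22s} adjusted R^2 = {value:.6f}")
```

The recomputed values agree with the reported ones. On adjusted R^2, the single-variable waP/60 model edges out the three-variable regression, because it spends only one degree of freedom, while plain P/60 trails all of the alternatives.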

You may notice that the first regression shows a slightly better correlation coefficient than waP/60 does, and you would be right. I also wanted the stat to be reliable (that is, to correlate well with itself from year to year), so I modified the formula slightly, again comparing R^2 values to derive the best weights. This adjustment did not change waP/60's power to predict future points very much; it just boosted its reliability as a stat, which I considered very important.
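One way to see how far the final weights moved from the raw regression is to rescale the three fitted coefficients so that they sum to 3, the same total that traditional points give (1 + 1 + 1), and compare them with the published weights. The sum-to-3 rescaling is my assumption about a sensible common scale, not a description of the exact procedure used; the published weights do happen to sum to 3.

```python
# Raw coefficients from the three-variable regression.
RAW = {"G/60": 0.866300528, "A1/60": 0.819312773, "A2/60": 0.201343384}

# Final weights from the waP/60 definition.
PUBLISHED = {"G/60": 1.44, "A1/60": 1.32, "A2/60": 0.24}

# Assumption: put the raw coefficients on the same total as traditional points.
TARGET_SUM = 3.0

scale = TARGET_SUM / sum(RAW.values())
rescaled = {name: coef * scale for name, coef in RAW.items()}

for name in RAW:
    print(f"{name:6s} rescaled {rescaled[name]:.2f}   published {PUBLISHED[name]:.2f}")
```

On this scale the raw regression would give roughly 1.38, 1.30, and 0.32, so the reliability adjustment shifted a little more weight toward goals and first assists and away from second assists.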

I next correlated this data with team points percentage, that is, how likely players with a high waP/60 are to be on good teams. My initial hope was for a number better correlated than P/60. When the data showed that it wasn't, I was a bit disappointed and worried that I might be stumbling into garbage-in, garbage-out analysis.

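| Statistic | Correlation with Team P% |
|---|---|
| Team P% | 1.000 |
| +-ON/60 | 0.38955 |
| CORSI ON | 0.389077 |
| Ozone% | 0.224368 |
| PDO | 0.224032 |
| Fin Ozone% | 0.210028 |
| Sv% | 0.207835 |
| P/60 | 0.099613 |
| Sh% | 0.095569 |
| GP | 0.092267 |
| A2/60 | 0.089503 |
| waP/60 | 0.08742 |
| A1/60 | 0.077553 |
| G/60 | 0.070729 |
| CORSI REL | -0.00046 |
| RATING | -0.00118 |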

However, I noticed that A2/60 correlates more strongly with team P% than either goals or first assists do. That made me realize that this stat is again providing misleading information. It seems that the better the team a player plays for, the more likely that player is to record second assists; that is to say, second assists are more correlated with team goals and are thus highly influenced by the team a player plays for.

 wP/60 Team GF 0.117348 0.128756 0.155427 0.167829 0.145492

Discussion

Now that we have a stat that predicts future points, reduces team influence to a minor degree, and has improved reliability, we can look at some data and draw some interesting conclusions. I decided to look at how players fared under the two stats, so I ranked them by each for every year. Here is the data.

Top 25 waP/60 players for 2007-2008

| Name | G/60 | A1/60 | A2/60 | P/60 | P/60 rank | Adj P/60 | waP/60 | waP/60 rank | waP/60 rank chg | waP/60 - P/60 |
|---|---|---|---|---|---|---|---|---|---|---|
| SIDNEY CROSBY | 1.41 | 1.49 | 0.47 | 3.38 | 1 | 2.9 | 4.106 | 1 | 0 | 0.726 |
| EVGENI MALKIN | 1.48 | 1.18 | 0.54 | 3.2 | 2 | 2.66 | 3.814 | 2 | 0 | 0.614 |
| ALEXANDER OVECHKIN | 1.68 | 0.97 | 0.35 | 3 | 5 | 2.65 | 3.777 | 3 | 2 | 0.777 |
| JASON SPEZZA | 1.13 | 1.39 | 0.41 | 2.93 | 7 | 2.52 | 3.557 | 4 | 3 | 0.627 |
| MARIAN GABORIK | 1.36 | 1.09 | 0.33 | 2.77 | 17 | 2.45 | 3.472 | 5 | 12 | 0.702 |
| DANIEL ALFREDSSON | 1.36 | 1 | 0.77 | 3.12 | 3 | 2.36 | 3.46 | 6 | -3 | 0.34 |
| ILYA KOVALCHUK | 1.67 | 0.73 | 0.31 | 2.72 | 22 | 2.4 | 3.436 | 7 | 15 | 0.716 |
| JAROME IGINLA | 1.52 | 0.83 | 0.51 | 2.85 | 12 | 2.35 | 3.402 | 8 | 4 | 0.552 |
| JEAN-PIERRE DUMONT | 1.09 | 1.3 | 0.49 | 2.88 | 8 | 2.39 | 3.401 | 9 | -1 | 0.521 |
| MAREK SVATOS | 1.87 | 0.45 | 0.27 | 2.59 | 25 | 2.32 | 3.344 | 10 | 15 | 0.754 |
| ALEXANDER RADULOV | 1.2 | 1.09 | 0.57 | 2.86 | 9 | 2.29 | 3.301 | 11 | -2 | 0.441 |
| ALEXANDER FROLOV | 1.06 | 1.23 | 0.47 | 2.76 | 18 | 2.29 | 3.26 | 12 | 6 | 0.5 |
| DANY HEATLEY | 1.37 | 0.84 | 0.74 | 2.95 | 6 | 2.21 | 3.256 | 13 | -7 | 0.306 |
| MATS SUNDIN | 1.22 | 1.05 | 0.47 | 2.74 | 21 | 2.27 | 3.252 | 14 | 7 | 0.512 |
| JOE THORNTON | 0.87 | 1.4 | 0.58 | 2.84 | 13 | 2.27 | 3.239 | 15 | -2 | 0.399 |
| MIKE RIBEIRO | 1.09 | 1.15 | 0.63 | 2.86 | 10 | 2.24 | 3.237 | 16 | -6 | 0.377 |
| JASON POMINVILLE | 1.29 | 0.95 | 0.5 | 2.75 | 20 | 2.24 | 3.228 | 17 | 3 | 0.478 |
| PAVEL DATSYUK | 0.92 | 1.33 | 0.51 | 2.76 | 19 | 2.25 | 3.201 | 18 | 1 | 0.441 |
| DEREK ROY | 1.23 | 0.95 | 0.67 | 2.86 | 11 | 2.18 | 3.183 | 19 | -8 | 0.323 |
| HENRIK ZETTERBERG | 1.44 | 0.72 | 0.67 | 2.83 | 15 | 2.16 | 3.181 | 20 | -5 | 0.351 |
| DREW STAFFORD | 1.25 | 0.91 | 0.66 | 2.83 | 16 | 2.16 | 3.157 | 21 | -5 | 0.327 |
| PAUL STASTNY | 1.34 | 0.7 | 1.08 | 3.12 | 4 | 2.04 | 3.111 | 22 | -18 | -0.01 |
| JUSTIN WILLIAMS | 0.99 | 0.99 | 0.86 | 2.84 | 14 | 1.98 | 2.938 | 23 | -9 | 0.098 |
| BRAD BOYES | 1.64 | 0.38 | 0.33 | 2.35 | 36 | 2.02 | 2.936 | 24 | 12 | 0.586 |

There is not too much movement in the top 25, though the waP/60 rank change column (P/60 rank minus waP/60 rank, so a positive number means a player moves up the rankings when looking at waP/60 instead of P/60) shows that certain players are considerably overvalued while others are undervalued. For example, Marian Gaborik and Ilya Kovalchuk jumped 12 and 15 spots respectively, clearly indicating they were undervalued, while Dany Heatley and Paul Stastny were a bit overvalued.
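The rank change column can be reproduced with a few lines of Python. This is a minimal sketch using five players from the 2007-2008 table; the ranks here are computed within that five-player subset only, so they will not match the league-wide ranks in the table. The helper name `rank_desc` is made up for the example, and the sign convention follows the values in the table (P/60 rank minus waP/60 rank, so positive means moving up).

```python
def rank_desc(values):
    """Map each name to its rank, where rank 1 is the highest value."""
    ordered = sorted(values, key=values.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# (P/60, waP/60) for five players copied from the 2007-2008 table.
players = {
    "Sidney Crosby": (3.38, 4.106),
    "Evgeni Malkin": (3.20, 3.814),
    "Alexander Ovechkin": (3.00, 3.777),
    "Jason Spezza": (2.93, 3.557),
    "Daniel Alfredsson": (3.12, 3.460),
}

p60_rank = rank_desc({name: p60 for name, (p60, _) in players.items()})
wap_rank = rank_desc({name: wap for name, (_, wap) in players.items()})

# Positive = ranked higher by waP/60 than by P/60.
rank_change = {name: p60_rank[name] - wap_rank[name] for name in players}

for name, change in rank_change.items():
    print(f"{name:20s} P/60 #{p60_rank[name]}  waP/60 #{wap_rank[name]}  change {change:+d}")
```

Even in this tiny subset the pattern from the full table shows up: the goal-heavy Ovechkin moves up, while Alfredsson, whose total leans more on second assists, moves down.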

Top 25 waP/60 players in 2008-2009

| Name | G/60 | A1/60 | A2/60 | P/60 | P/60 rank | Adj P/60 | waP/60 | waP/60 rank | waP/60 rank chg | waP/60 - P/60 |
|---|---|---|---|---|---|---|---|---|---|---|
| ALEXANDER SEMIN | 1.76 | 1.25 | 0.15 | 3.16 | 3 | 3.01 | 4.213 | 1 | 2 | 1.053 |
| PHIL KESSEL | 1.73 | 0.83 | 0.26 | 2.82 | 13 | 2.56 | 3.642 | 2 | 11 | 0.822 |
| RENE BOURQUE | 1.69 | 0.76 | 0.76 | 3.2 | 2 | 2.45 | 3.614 | 3 | -1 | 0.414 |
| SIDNEY CROSBY | 1.16 | 1.31 | 0.53 | 3 | 5 | 2.47 | 3.524 | 4 | 1 | 0.524 |
| EVGENI MALKIN | 0.78 | 1.66 | 0.63 | 3.07 | 4 | 2.44 | 3.465 | 5 | -1 | 0.395 |
| ALEXANDER OVECHKIN | 1.57 | 0.81 | 0.48 | 2.86 | 12 | 2.38 | 3.44 | 6 | 6 | 0.58 |
| DANIEL SEDIN | 1.19 | 1.19 | 0.59 | 2.97 | 7 | 2.38 | 3.423 | 7 | 0 | 0.453 |
| MARTIN HAVLAT | 1.15 | 1.2 | 0.55 | 2.89 | 11 | 2.35 | 3.369 | 8 | 3 | 0.479 |
| RICK NASH | 1.39 | 0.96 | 0.32 | 2.67 | 19 | 2.35 | 3.34 | 9 | 10 | 0.67 |
| DERICK BRASSARD | 1.29 | 0.92 | 1.1 | 3.3 | 1 | 2.21 | 3.335 | 10 | -9 | 0.035 |
| JAMIE LANGENBRUNNER | 1.11 | 1.23 | 0.35 | 2.7 | 18 | 2.34 | 3.303 | 11 | 7 | 0.603 |
| MARIAN HOSSA | 1.69 | 0.56 | 0.5 | 2.75 | 15 | 2.25 | 3.287 | 12 | 3 | 0.537 |
| ILYA KOVALCHUK | 1.48 | 0.82 | 0.31 | 2.61 | 24 | 2.3 | 3.282 | 13 | 11 | 0.672 |
| MARC SAVARD | 0.86 | 1.39 | 0.75 | 2.99 | 6 | 2.25 | 3.253 | 14 | -8 | 0.263 |
| ZACH PARISE | 1.47 | 0.73 | 0.73 | 2.93 | 9 | 2.2 | 3.252 | 15 | -6 | 0.322 |
| PAVEL DATSYUK | 0.99 | 1.15 | 0.77 | 2.91 | 10 | 2.14 | 3.127 | 16 | -6 | 0.217 |
| ALEXEI PONIKAROVSKY | 1.05 | 1.11 | 0.58 | 2.74 | 16 | 2.16 | 3.114 | 17 | -1 | 0.374 |
| JEFF CARTER | 1.4 | 0.72 | 0.52 | 2.64 | 21 | 2.12 | 3.087 | 18 | 3 | 0.447 |
| TIM CONNOLLY | 1.12 | 1.02 | 0.51 | 2.64 | 20 | 2.14 | 3.079 | 19 | 1 | 0.439 |
| DANIEL BRIERE | 1.23 | 0.88 | 0.53 | 2.63 | 22 | 2.11 | 3.057 | 20 | 2 | 0.427 |
| COREY PERRY | 1.15 | 0.99 | 0.36 | 2.5 | 31 | 2.14 | 3.045 | 21 | 10 | 0.545 |
| DAVID KREJCI | 0.9 | 1.15 | 0.9 | 2.95 | 8 | 2.05 | 3.03 | 22 | -14 | 0.08 |
| JAROME IGINLA | 1.07 | 1.02 | 0.37 | 2.46 | 39 | 2.09 | 2.973 | 23 | 16 | 0.513 |
| DAVID BOOTH | 1.12 | 0.94 | 0.5 | 2.56 | 28 | 2.06 | 2.971 | 24 | 4 | 0.411 |

Rick Nash, Phil Kessel, Jarome Iginla, and again Ilya Kovalchuk seem to benefit the most from this analysis, with David Krejci and Derick Brassard coming up well overvalued.

Top 25 waP/60 players in 2009-2010

| Name | G/60 | A1/60 | A2/60 | P/60 | P/60 rank | Adj P/60 | waP/60 | waP/60 rank | waP/60 rank chg | waP/60 - P/60 |
|---|---|---|---|---|---|---|---|---|---|---|
| ALEX OVECHKIN | 1.87 | 1.3 | 0.52 | 3.7 | 3 | 3.17 | 4.527 | 1 | 2 | 0.827 |
| DANIEL SEDIN | 1.37 | 1.76 | 0.91 | 4.04 | 1 | 3.13 | 4.512 | 2 | -1 | 0.472 |
| SIDNEY CROSBY | 1.68 | 1.14 | 0.59 | 3.41 | 4 | 2.82 | 4.06 | 3 | 1 | 0.65 |
| HENRIK SEDIN | 1.09 | 1.53 | 1.34 | 3.96 | 2 | 2.62 | 3.912 | 4 | -2 | -0.05 |
| ALEXANDER SEMIN | 1.74 | 0.93 | 0.52 | 3.19 | 5 | 2.67 | 3.852 | 5 | 0 | 0.662 |
| ILYA KOVALCHUK | 1.46 | 1.05 | 0.4 | 2.91 | 7 | 2.51 | 3.579 | 6 | 1 | 0.669 |
| MARIAN GABORIK | 1.35 | 1.15 | 0.4 | 2.9 | 8 | 2.5 | 3.554 | 7 | 1 | 0.654 |
| JOFFREY LUPUL | 2.07 | 0.41 | 0 | 2.48 | 31 | 2.48 | 3.512 | 8 | 23 | 1.032 |
| CHRIS STEWART | 1.36 | 0.91 | 0.57 | 2.84 | 10 | 2.27 | 3.293 | 9 | 1 | 0.453 |
| NICKLAS BACKSTROM | 1.03 | 1.17 | 0.83 | 3.03 | 6 | 2.2 | 3.226 | 10 | -4 | 0.196 |
| ERIC FEHR | 1.48 | 0.74 | 0.49 | 2.71 | 15 | 2.22 | 3.221 | 11 | 4 | 0.511 |
| MIKE KNUBLE | 1.48 | 0.77 | 0.32 | 2.57 | 25 | 2.25 | 3.219 | 12 | 13 | 0.649 |
| PATRIK ELIAS | 1.15 | 1.07 | 0.54 | 2.75 | 14 | 2.22 | 3.195 | 13 | 1 | 0.445 |
| SCOTTIE UPSHALL | 1.54 | 0.58 | 0.58 | 2.7 | 16 | 2.12 | 3.118 | 14 | 2 | 0.418 |
| FRAZER MCLAREN | 0.44 | 1.77 | 0.44 | 2.65 | 18 | 2.21 | 3.076 | 15 | 3 | 0.426 |
| WOJTEK WOLSKI | 1.05 | 1.05 | 0.68 | 2.78 | 11 | 2.1 | 3.059 | 16 | -5 | 0.279 |
| ALEX BURROWS | 1.34 | 0.75 | 0.59 | 2.68 | 17 | 2.09 | 3.057 | 17 | 0 | 0.377 |
| PATRICK MARLEAU | 1.31 | 0.75 | 0.55 | 2.61 | 20 | 2.06 | 3.005 | 18 | 2 | 0.395 |
| JOE PAVELSKI | 1.4 | 0.67 | 0.27 | 2.33 | 41 | 2.07 | 2.96 | 19 | 22 | 0.63 |
| DUSTIN PENNER | 1.26 | 0.77 | 0.38 | 2.4 | 36 | 2.03 | 2.918 | 20 | 16 | 0.518 |
| BRAD RICHARDS | 0.57 | 1.49 | 0.51 | 2.57 | 26 | 2.06 | 2.91 | 21 | 5 | 0.34 |
| ZACH PARISE | 1.3 | 0.67 | 0.62 | 2.59 | 21 | 1.97 | 2.902 | 22 | -1 | 0.312 |
| STEVEN STAMKOS | 1.23 | 0.75 | 0.59 | 2.56 | 27 | 1.98 | 2.9 | 23 | 4 | 0.34 |
| PATRICK KANE | 1.02 | 0.97 | 0.58 | 2.58 | 23 | 1.99 | 2.886 | 24 | -1 | 0.306 |

There is more movement this year, with a lot more jumping around. Joffrey Lupul and Joe Pavelski come in as the most undervalued. There is not much in the way of overvalued players this year.

Top 25 waP/60 players in 2010-2011

| Name | G/60 | A1/60 | A2/60 | P/60 | P/60 rank | Adj P/60 | waP/60 | waP/60 rank | waP/60 rank chg | waP/60 - P/60 |
|---|---|---|---|---|---|---|---|---|---|---|
| SIDNEY CROSBY | 1.94 | 1.36 | 0.68 | 3.98 | 1 | 3.3 | 4.746 | 1 | 0 | 0.766 |
| DANIEL SEDIN | 1.24 | 1.24 | 0.7 | 3.17 | 2 | 2.48 | 3.588 | 2 | 0 | 0.418 |
| STEVEN STAMKOS | 1.39 | 1.01 | 0.43 | 2.82 | 8 | 2.4 | 3.433 | 3 | 5 | 0.613 |
| ALES HEMSKY | 1.11 | 1.3 | 0.46 | 2.88 | 5 | 2.41 | 3.422 | 4 | 1 | 0.542 |
| PAVEL DATSYUK | 1.09 | 1.17 | 0.62 | 2.89 | 3 | 2.26 | 3.261 | 5 | -2 | 0.371 |
| RICK NASH | 1.42 | 0.79 | 0.62 | 2.84 | 7 | 2.21 | 3.232 | 6 | 1 | 0.392 |
| DAVID KREJCI | 0.71 | 1.54 | 0.65 | 2.89 | 4 | 2.25 | 3.211 | 7 | -3 | 0.321 |
| DEREK ROY | 1.01 | 1.27 | 0.25 | 2.53 | 22 | 2.28 | 3.187 | 8 | 14 | 0.657 |
| DANIEL CLEARY | 1.36 | 0.86 | 0.29 | 2.5 | 24 | 2.22 | 3.158 | 9 | 15 | 0.658 |
| MILAN LUCIC | 1.46 | 0.64 | 0.64 | 2.75 | 11 | 2.1 | 3.097 | 10 | 1 | 0.347 |
| DANIEL BRIERE | 1.42 | 0.71 | 0.49 | 2.62 | 14 | 2.13 | 3.095 | 11 | 3 | 0.475 |
| CLAUDE GIROUX | 0.79 | 1.36 | 0.62 | 2.77 | 10 | 2.15 | 3.081 | 12 | -2 | 0.311 |
| JONATHAN TOEWS | 1.06 | 1.11 | 0.37 | 2.54 | 20 | 2.17 | 3.077 | 13 | 7 | 0.537 |
| MATT CALVERT | 1.44 | 0.64 | 0.48 | 2.56 | 19 | 2.08 | 3.029 | 14 | 5 | 0.469 |
| ALEX OVECHKIN | 1 | 1.1 | 0.5 | 2.59 | 16 | 2.1 | 3.01 | 15 | 1 | 0.42 |
| MARTIN ST. LOUIS | 1.28 | 0.77 | 0.61 | 2.66 | 13 | 2.05 | 3.003 | 16 | -3 | 0.343 |
| ALEXANDER SEMIN | 1.36 | 0.72 | 0.29 | 2.37 | 37 | 2.08 | 2.973 | 17 | 20 | 0.603 |
| MICHAEL GRABNER | 1.44 | 0.62 | 0.34 | 2.4 | 35 | 2.06 | 2.968 | 18 | 17 | 0.568 |
| HENRIK SEDIN | 0.51 | 1.54 | 0.82 | 2.87 | 6 | 2.05 | 2.966 | 19 | -13 | 0.096 |
| ANZE KOPITAR | 0.89 | 1.16 | 0.63 | 2.68 | 12 | 2.05 | 2.963 | 20 | -8 | 0.283 |
| JEFF CARTER | 1.48 | 0.51 | 0.62 | 2.61 | 15 | 1.99 | 2.949 | 21 | -6 | 0.339 |
| TOMAS FLEISCHMANN | 0.81 | 1.31 | 0.2 | 2.32 | 45 | 2.12 | 2.941 | 22 | 23 | 0.621 |
| MARTIN HAVLAT | 1.01 | 1.06 | 0.37 | 2.45 | 30 | 2.07 | 2.939 | 23 | 7 | 0.489 |
| BOBBY RYAN | 1.21 | 0.82 | 0.48 | 2.51 | 23 | 2.03 | 2.936 | 24 | -1 | 0.426 |

Again we see results similar to previous years. Big movers this year include Semin, Roy, and Grabner. Interestingly, Henrik Sedin seems to be very overvalued.

Conclusions

Although there is not a massive amount of movement, I still feel waP/60 is a better metric when looking at points. Subsequent studies using a larger data set, including intra-year correlations, might very well tease out more of this information and produce a better-weighted formula for points. For now I think this is a useful observation and a long-overdue adjustment.
