# Beating save percentage to death: Simulation

Using a pseudo-random number generator, I simulated up to 5 seasons of goalie save data. Results after the jump.

Each season has 30 goalies. In the first scenario, I specified that all 30 goalies have an intrinsic save percentage (ISP) of 0.920. In the second scenario, I specified that 3 goalies have an ISP of 0.940, 7 have an ISP of 0.930, 10 have an ISP of 0.920, 7 have an ISP of 0.910, and 3 have an ISP of 0.900. In the third scenario, I specified that 5 goalies have an ISP of 0.930, 5 have an ISP of 0.925, 10 have an ISP of 0.920, 5 have an ISP of 0.915, and 5 have an ISP of 0.910.

For each scenario, I simulated 10,000 seasons, once with 1000 shots per goalie per season and once with 500. I ran a series of one-way ANOVAs testing for a difference in number of saves between goalies. I tested after 2 seasons, 3 seasons, 4 seasons, and all 5 seasons.
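The setup above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the results in this post: it assumes each shot is an independent save with probability equal to the goalie's ISP, and that the ANOVA treats each goalie as a group with one save total per season. The function names (`simulate_saves`, `one_way_f`, `simulate_f`) are made up for this example.

```python
import random

def simulate_saves(isp, shots, rng):
    """Saves out of `shots`, each stopped independently with probability `isp`."""
    return sum(rng.random() < isp for _ in range(shots))

def one_way_f(groups):
    """One-way ANOVA F statistic for a list of equal-length groups."""
    k = len(groups)       # number of groups (goalies)
    n = len(groups[0])    # observations per group (seasons)
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]
    ss_between = n * sum((m - grand_mean) ** 2 for m in means)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (k * (n - 1)))

def simulate_f(isps, shots, seasons, rng):
    """Simulate `seasons` save totals per goalie and return the ANOVA F."""
    groups = [[simulate_saves(p, shots, rng) for _ in range(seasons)]
              for p in isps]
    return one_way_f(groups)

# The three scenarios described above, 30 goalies each.
SCENARIO_1 = [0.920] * 30
SCENARIO_2 = [0.940] * 3 + [0.930] * 7 + [0.920] * 10 + [0.910] * 7 + [0.900] * 3
SCENARIO_3 = [0.930] * 5 + [0.925] * 5 + [0.920] * 10 + [0.915] * 5 + [0.910] * 5

if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, only for repeatability
    for name, isps in [("1", SCENARIO_1), ("2", SCENARIO_2), ("3", SCENARIO_3)]:
        # A handful of runs per scenario; the post uses 10,000.
        fs = [simulate_f(isps, 1000, 5, rng) for _ in range(5)]
        print("Scenario", name, [round(f, 2) for f in fs])
```

Counting how many of the simulated F values exceed the cutoff for a given number of seasons gives the "# > F" column in the tables below.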

Scenario 1 – Equal Goalies

1000 shots per season

| Seasons | F cutoff | # > F | Minimum F | Maximum F |
|---|---|---|---|---|
| 2 | 1.85 | 480 | 0.2206 | 3.6850 |
| 3 | 1.66 | 463 | 0.2491 | 3.0979 |
| 4 | 1.59 | 531 | 0.2272 | 3.0312 |
| 5 | 1.56 | 538 | 0.2675 | 2.4660 |

500 shots per season

| Seasons | F cutoff | # > F | Minimum F | Maximum F |
|---|---|---|---|---|
| 2 | 1.85 | 488 | 0.2357 | 3.6358 |
| 3 | 1.66 | 480 | 0.2789 | 3.0623 |
| 4 | 1.59 | 509 | 0.2632 | 2.7501 |
| 5 | 1.56 | 518 | 0.3115 | 2.6902 |

I chose an alpha of 0.05. Appropriately, about 5% of the test statistics exceeded the cutoff. None exceeded it by much, and the maximum calculated F goes down as I add subsequent seasons.
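For reference, the test statistic and its degrees of freedom can be written out as follows. This assumes a one-way layout with $k = 30$ goalies as groups and one save total per goalie per season, so $n$ is the number of seasons analyzed; the layout is an assumption rather than something stated above, though the tabulated cutoffs appear consistent with it.

```latex
F = \frac{SS_{\text{between}} / (k - 1)}{SS_{\text{within}} / \bigl(k(n - 1)\bigr)},
\qquad k = 30,\quad n \in \{2, 3, 4, 5\}

% Under the null hypothesis of equal goalies, F follows an F distribution with
% (k - 1,\; k(n - 1)) = (29, 30),\ (29, 60),\ (29, 90),\ (29, 120)
% degrees of freedom for n = 2, 3, 4, 5 seasons.
```

The cutoff is the 95th percentile of that distribution. Under this layout it shrinks as the within-group degrees of freedom grow, which matches the cutoffs falling from 1.85 to 1.56 as seasons are added.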

Scenario 2 – Unequal Goalies

1000 shots per season

| Seasons | F cutoff | # > F | Minimum F | Maximum F |
|---|---|---|---|---|
| 2 | 1.85 | 9990 | 1.3799 | 14.7428 |
| 3 | 1.66 | 10000 | 2.8665 | 15.1415 |
| 4 | 1.59 | 10000 | 4.0280 | 21.2827 |
| 5 | 1.56 | 10000 | 4.8464 | 19.0096 |

500 shots per season

| Seasons | F cutoff | # > F | Minimum F | Maximum F |
|---|---|---|---|---|
| 2 | 1.85 | 8994 | 0.8113 | 11.5604 |
| 3 | 1.66 | 9986 | 1.3541 | 11.3162 |
| 4 | 1.59 | 10000 | 1.9170 | 10.1930 |
| 5 | 1.56 | 10000 | 2.2067 | 12.8249 |

In the second scenario, with 1000 shots, essentially all the test statistics exceed the cutoff. Compared to scenario 1, there is little overlap by 3 seasons and none by 4 seasons. Even with only 500 shots in a season, about 90% of the test statistics exceed the cutoff after 2 seasons, and essentially all of them do from 3 seasons on. There is little overlap with scenario 1 by 3 seasons, although complete separation is not reached by 5 seasons. In contrast to scenario 1, the magnitude of the test statistics tends to go up as we add seasons.

Scenario 3 – Unequal Goalies

1000 shots per season

| Seasons | F cutoff | # > F | Minimum F | Maximum F |
|---|---|---|---|---|
| 2 | 1.85 | 6825 | 0.4595 | 7.6507 |
| 3 | 1.66 | 9619 | 0.9553 | 8.3440 |
| 4 | 1.59 | 9980 | 1.1465 | 9.6530 |
| 5 | 1.56 | 9998 | 1.3629 | 8.5005 |

500 shots per season

| Seasons | F cutoff | # > F | Minimum F | Maximum F |
|---|---|---|---|---|
| 2 | 1.85 | 3266 | 0.3604 | 6.2699 |
| 3 | 1.66 | 6576 | 0.6075 | 6.8692 |
| 4 | 1.59 | 8635 | 0.7319 | 5.3931 |
| 5 | 1.56 | 9561 | 0.8393 | 6.7881 |

In the third scenario, with 1000 shots, it takes 3 seasons of analysis before essentially all the test statistics exceed the cutoff (about 96%). With only 500 shots in a season, it takes 5 seasons of analysis to reach the same level. Complete separation from scenario 1 is not reached by 5 seasons. Compared to scenario 2, the trend in the magnitude of the test statistics as we add seasons is less apparent.

Conclusions

Obviously, more data is better. More shots per season and more seasons allow better discrimination of a goalie effect. The real world data has 267 goalie seasons with at least 1000 even strength shots, and 514 with at least 500 shots. Effects of the magnitude that I built into the models ought to be detectable in the real world data, but it might take 3-5 years of data.
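A rough way to see why shots and seasons matter so much: if each shot is treated as an independent trial (an assumption of this sketch, as in the simulation), the standard error of an observed save percentage is the binomial one, sqrt(p(1 - p) / shots). The helper name below is hypothetical.

```python
import math

def save_pct_standard_error(isp, shots_per_season, seasons):
    """Binomial standard error of observed save percentage over all shots."""
    total_shots = shots_per_season * seasons
    return math.sqrt(isp * (1 - isp) / total_shots)

if __name__ == "__main__":
    for shots in (1000, 500):
        for seasons in (1, 2, 3, 4, 5):
            se = save_pct_standard_error(0.920, shots, seasons)
            print(f"{shots} shots x {seasons} seasons: SE = {se:.4f}")
```

For a 0.920 goalie, one 1000-shot season gives a standard error of about 0.0086, which is larger than the 0.005 steps between goalie tiers in scenario 3. That is in line with scenario 3 needing several seasons before the goalie effect shows up reliably.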