Beating save percentage to death: Simulation

Using a pseudo-random number generator, I simulated up to 5 seasons of goalie save data. Results after the jump.

 

Each season has 30 goalies. In the first scenario, all 30 goalies have an intrinsic save percentage (ISP) of 0.920. In the second scenario, 3 goalies have an ISP of 0.940, 7 have 0.930, 10 have 0.920, 7 have 0.910, and 3 have 0.900. In the third scenario, 5 goalies have an ISP of 0.930, 5 have 0.925, 10 have 0.920, 5 have 0.915, and 5 have 0.910.

For each scenario, I ran the simulation 10,000 times, once with 1000 shots per season and once with 500. I then ran a series of one-way ANOVAs testing for a difference in number of saves between goalies. I tested after 2 seasons, 3 seasons, 4 seasons, and all 5 seasons.
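A minimal sketch of this setup in Python, using only the standard library. The function names, the seed, and the choice to treat each goalie-season's save total as one ANOVA observation are assumptions made for illustration; this is not the code that produced the results in this post.

```python
import random

# Intrinsic save percentages (ISP) for the 30 goalies in each scenario,
# taken from the description above.
SCENARIOS = {
    1: [0.920] * 30,
    2: [0.940] * 3 + [0.930] * 7 + [0.920] * 10 + [0.910] * 7 + [0.900] * 3,
    3: [0.930] * 5 + [0.925] * 5 + [0.920] * 10 + [0.915] * 5 + [0.910] * 5,
}


def simulate_saves(isp, shots, rng):
    """Draw one season's save total: each shot is saved with probability isp."""
    return sum(rng.random() < isp for _ in range(shots))


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA; groups is a list of lists of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def simulate_f(isps, shots, seasons, rng):
    """Simulate one replicate (assumed layout: one group per goalie,
    one observation per season) and return its F statistic."""
    groups = [[simulate_saves(p, shots, rng) for _ in range(seasons)] for p in isps]
    return one_way_anova_f(groups)


rng = random.Random(2024)  # arbitrary seed, for reproducibility only
for scenario, isps in SCENARIOS.items():
    print(scenario, round(simulate_f(isps, shots=1000, seasons=5, rng=rng), 2))
```

Repeating `simulate_f` many times and counting how often the result exceeds the critical F value gives the kind of tally reported in the tables.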


Scenario 1 – Equal Goalies

1000 shots per season

Seasons   F cutoff   # > F   Minimum F   Maximum F
2         1.85       480     0.2206      3.6850
3         1.66       463     0.2491      3.0979
4         1.59       531     0.2272      3.0312
5         1.56       538     0.2675      2.4660


500 shots per season

Seasons   F cutoff   # > F   Minimum F   Maximum F
2         1.85       488     0.2357      3.6358
3         1.66       480     0.2789      3.0623
4         1.59       509     0.2632      2.7501
5         1.56       518     0.3115      2.6902


I chose an alpha of 0.05. Appropriately, about 5% of the 10,000 test statistics exceeded the cutoff in every row (between 463 and 538). None exceeded it by much, and the maximum calculated F goes down as I add seasons.
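The F cutoffs in the tables are consistent with a one-way ANOVA in which each of the 30 goalies is a group and each season's save total is one observation, giving 29 and 30 × (seasons − 1) degrees of freedom. That reading is an assumption, not something stated outright. The sketch below checks it by Monte Carlo with only the Python standard library, in place of a table lookup or a statistics package; the function name and the number of draws are illustrative.

```python
import random


def f_cutoff_mc(df_between, df_within, alpha=0.05, draws=100_000, seed=0):
    """Approximate the upper-alpha critical value of an F distribution.

    An F variate is a ratio of two independent chi-square variates, each
    divided by its degrees of freedom; chi-square(d) is Gamma(d/2, scale=2).
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        num = rng.gammavariate(df_between / 2, 2.0) / df_between
        den = rng.gammavariate(df_within / 2, 2.0) / df_within
        samples.append(num / den)
    samples.sort()
    return samples[int((1 - alpha) * draws)]


GOALIES = 30  # from the setup described above
for seasons in (2, 3, 4, 5):
    df_between = GOALIES - 1              # assumed: one group per goalie
    df_within = GOALIES * (seasons - 1)   # assumed: one observation per season
    print(seasons, df_between, df_within,
          round(f_cutoff_mc(df_between, df_within), 2))
```

If the assumed degrees of freedom are right, the printed values should land close to the 1.85, 1.66, 1.59, and 1.56 cutoffs used above, up to simulation error.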


Scenario 2 – Unequal Goalies

1000 shots per season

Seasons   F cutoff   # > F   Minimum F   Maximum F
2         1.85       9990    1.3799      14.7428
3         1.66       10000   2.8665      15.1415
4         1.59       10000   4.0280      21.2827
5         1.56       10000   4.8464      19.0096


500 shots per season

Seasons   F cutoff   # > F   Minimum F   Maximum F
2         1.85       8994    0.8113      11.5604
3         1.66       9986    1.3541      11.3162
4         1.59       10000   1.9170      10.1930
5         1.56       10000   2.2067      12.8249


In the second scenario, with 1000 shots, essentially all the test statistics exceed the cutoff: 9,990 of 10,000 after 2 seasons, and every one from 3 seasons on. Compared to scenario 1, there is little overlap in the range of F values by 3 seasons and none by 4 seasons. Even with only 500 shots in a season, about 90% of the test statistics exceed the cutoff after 2 seasons, and essentially all do from 3 seasons on. There is little overlap with scenario 1 by 3 seasons, although complete separation is not reached by 5 seasons. In contrast to scenario 1, the magnitude of the test statistics tends to go up as we add seasons.


Scenario 3 – Unequal Goalies

1000 shots per season

Seasons   F cutoff   # > F   Minimum F   Maximum F
2         1.85       6825    0.4595      7.6507
3         1.66       9619    0.9553      8.3440
4         1.59       9980    1.1465      9.6530
5         1.56       9998    1.3629      8.5005


500 shots per season

Seasons   F cutoff   # > F   Minimum F   Maximum F
2         1.85       3266    0.3604      6.2699
3         1.66       6576    0.6075      6.8692
4         1.59       8635    0.7319      5.3931
5         1.56       9561    0.8393      6.7881


In the third scenario, with 1000 shots, it takes 3 seasons of analysis before nearly all the test statistics (about 96%) exceed the cutoff. With only 500 shots in a season, it takes 5 seasons to reach the same level. Complete separation from scenario 1 is not reached by 5 seasons. Compared to scenario 2, the trend in the magnitude of the test statistics as we add seasons is less apparent.


Conclusions

Obviously, more data is better. More shots per season and more seasons allow better discrimination of a goalie effect. The real-world data has 267 goalie seasons with at least 1000 even-strength shots, and 514 with at least 500 shots. Effects of the magnitude that I built into the models ought to be detectable in the real-world data, but it might take 3-5 years of data.
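One rough way to see why more shots and more seasons help is to compare the expected between-goalie mean square with the expected within-goalie mean square. For a balanced one-way layout with binomial save totals, that ratio is 1 + seasons × (spread of true save totals) / (average binomial variance). The sketch below computes it for the three scenarios. It assumes one observation per goalie per season, and the ratio is only a guide to the typical size of F, not its exact expectation; the function name is illustrative.

```python
# ISP values for the three scenarios, as described in the setup.
SCENARIO_1 = [0.920] * 30
SCENARIO_2 = [0.940] * 3 + [0.930] * 7 + [0.920] * 10 + [0.910] * 7 + [0.900] * 3
SCENARIO_3 = [0.930] * 5 + [0.925] * 5 + [0.920] * 10 + [0.915] * 5 + [0.910] * 5


def expected_ms_ratio(isps, shots, seasons):
    """E[mean square between] / E[mean square within] for a balanced design."""
    k = len(isps)
    true_saves = [shots * p for p in isps]        # expected saves per season
    grand = sum(true_saves) / k
    # Spread of true talent, in saves squared, with k - 1 in the denominator.
    spread = sum((m - grand) ** 2 for m in true_saves) / (k - 1)
    # Average binomial sampling variance of one season's save total.
    noise = sum(shots * p * (1 - p) for p in isps) / k
    return 1 + seasons * spread / noise


for name, isps in (("1", SCENARIO_1), ("2", SCENARIO_2), ("3", SCENARIO_3)):
    for shots in (1000, 500):
        ratios = [round(expected_ms_ratio(isps, shots, s), 2) for s in (2, 3, 4, 5)]
        print("scenario", name, "shots", shots, ratios)
```

The talent spread grows with the square of the shot count while the binomial noise grows only linearly, so doubling the shots roughly doubles the signal term, and each added season adds another multiple of it. With equal goalies the ratio stays at 1 no matter how much data is added.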