The All-Time Multi-League GVT Database

COLOGNE, GERMANY - MAY 23: Denis Grebeshkov (R) of Russia and Jaromir Jagr (L) of Czech Republic battle for the puck during the IIHF World Championship gold medal match between Russia and Czech Republic at Lanxess Arena on May 23, 2010 in Cologne, Germany. (Photo by Martin Rose/Bongarts/Getty Images)

 

I guess it was only a matter of time until I put this together.

On Google Documents (found here) I have placed an all-time GVT database that covers the NHL, WHA, AHL, IHL, the Swedish, Finnish, Russian, Czech and German elite leagues, as well as some of the Canada Cups and Olympic tournaments. More after the jump.

There are two files: a text file, multi_gvt.txt, and an Excel file, multi_gvt2.xls. The text file contains all the players; the Excel file contains only the first several hundred, as the site didn’t allow me to upload spreadsheets larger than ~10 MB (!). You can download the text file, then paste it into an Excel spreadsheet.

Calculating the GVT data itself was fairly simple, although for many of these leagues the data I was able to find was quite rudimentary: Games, Goals and Assists. For recent years, goaltender shots against and player plus/minus are available, which at least gives us a skeleton of defensive information to work with. For now, I have restricted myself to seasons for which I had enough data to work with and in which there is enough overlap between the various leagues that I can be reasonably confident of my normalization rates. I’ve started in 1983 for the AHL, 1988 for the Swedish Elitserien, 1989 for the Finnish SM-liiga, 1992 for the IHL, 1999 for the Czech Republic League, 2001 for the German Deutsche Eishockey Liga and 2003 for the Russian Elite League / KHL. I have also included all the Canada Cups, World Cups and the 2006 and 2010 Olympics. More seasons will get added as I find the time.

The process of normalization, of course, is the most interesting part. GVT naturally normalizes to 3 goals a game, and I normalized goals and assists in the same way. I also normalized for schedule length, although because of the huge disparity in schedule lengths between various leagues, and the different levels of variance inherent in each, I could not normalize every league to 82 games: what would I do with an 8-game Olympic tournament? My compromise was to treat any schedule shorter than 70 games as if it were 70 games long, so if a league had a 50-game season, I normalized the games played to 50 / 70 * 82 ≈ 59 games. It’s not perfect, but it’s close enough.
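To make the schedule arithmetic concrete, here is a minimal Python sketch of the rule as I read it from the paragraph above. The function name `normalize_games` and the constants are mine, purely for illustration, and the use of `max()` to apply the 70-game floor is my interpretation of the 50-game example rather than the exact code behind the database.

```python
# Hypothetical helper illustrating the schedule-length rule described above.
# Assumption: schedules of 70+ games are scaled to 82 games, and any
# schedule shorter than 70 games is treated as if it were 70 games long.

NHL_SCHEDULE = 82   # target schedule length
MIN_SCHEDULE = 70   # floor applied to short schedules


def normalize_games(games_played: float, schedule_length: int) -> float:
    """Scale games played to an 82-game basis with a 70-game floor."""
    if schedule_length <= 0:
        raise ValueError("schedule_length must be positive")
    effective_schedule = max(schedule_length, MIN_SCHEDULE)
    return games_played * NHL_SCHEDULE / effective_schedule


if __name__ == "__main__":
    # A full 50-game season: 50 / 70 * 82, which rounds to 59 games.
    print(round(normalize_games(50, 50)))
    # A full 8-game tournament stays small instead of ballooning to 82.
    print(round(normalize_games(8, 8), 1))
```

Under this reading, a player who appears in every game of an 80-game season is credited with 82 games, while an 8-game Olympic tournament counts for a little over 9 games rather than a full season.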

The second part was to normalize for league difficulty. Past approaches, especially the most well-known ones by Hawerchuk, normalize by games played. This is the correct approach when doing projections for the majority of players, as good players in lower leagues will often be marginal players in the NHL. However, I chose to use a translation system that was more accurate for elite players; to do this, I needed to normalize by ice time instead of by games played. Obviously, I don’t have ice time numbers for any league but the NHL, but I do have a simple algorithm to estimate ice time based on basic statistics. Again, it’s an approximation, but a workable one. The upshot is that my normalization factors are slightly higher than Hawerchuk’s, and they track good players better but weaker players worse.

I’ll be adding more information on this database in the coming weeks. For now, numbers junkies, enjoy!
