Fancystats Fundamentals and Why Hockey Desperately Needs a Better Competition Metric (Part 1 of 2)

Hockey needs a better competition metric – without it, the value of fancystats for the evaluation of individual players is significantly weakened.

Let me tell you why!

A Background Tutorial

I’m assuming most of the people reading an article with this title are probably familiar with fancystats. But I’m hoping there are a few readers who are a little nervous around fancystats – and I’m hoping I can capture your interest too.

On that note, I’m going to do a bit of a ground-up tutorial on fancystats to set the stage here in part 1, then get into the meat of the discussion on competition in part 2. I hope those of you with lots of knowledge in that arena already will bear with me – or feel free to skip ahead!

Fancystats – Counting the Good and the Bad

Personally, I’m always a bit baffled by the hatred and contempt for the most common shot metric fancystats (the ones with the odd names like Corsi and Fenwick).

Here’s the thing: it’s not like these metrics are measuring anything unrelated to hockey. In fact, they’re measuring something fundamental to hockey, which is shots. No shots, no goals. No goals, no wins. No wins … sucks.

I like to think of it this way. If my team has the puck and is shooting the puck at the other guys’ net, this is almost without exception a good thing. Sometimes it’s a tiny good thing, sometimes it’s a major good thing, but it’s pretty much always a good thing.

Conversely, if the bad guys have the puck and are shooting it at my net, it is almost without exception a bad thing. Sometimes it’s a tiny bad thing, sometimes it’s a major bad thing, but it’s pretty much always a bad thing.

In the end, these ‘fancy’ stats are not fancy, and they’re not really even stats! We’re just counting up good things and bad things and seeing whether our team had more or less of those things. The only wrinkle to note is that we focus these counts on even strength (5v5) time. Not that these shot attempt good and bad things don’t matter on the penalty kill (PK) or the power play (PP); it’s just that there are other, arguably better, ways to measure effectiveness on special teams.

We usually express our resulting good/bad count as a percent of all the things counted – 50% means the two were even, 45% means our team had only 45% of the good things (five points short of even), and 55% means our team had 55% of them (five points better than even). The percentages are always expressed from the viewpoint of a specific team.
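For anyone who likes to see the arithmetic spelled out, here is a minimal Python sketch of that percentage. The function name and the 55/45 counts are made up for illustration; they don’t come from any real game or stats site.

```python
def shot_attempt_pct(attempts_for: int, attempts_against: int) -> float:
    """Percent of all counted 5v5 shot attempts that belonged to our team."""
    total = attempts_for + attempts_against
    if total == 0:
        raise ValueError("no shot attempts were counted")
    return 100.0 * attempts_for / total


# Hypothetical game: 55 good things for us, 45 bad things against us.
print(shot_attempt_pct(55, 45))  # 55.0
```

Flip the two arguments and you get the same game from the other team’s viewpoint.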

How do we know this good thing/bad thing has value? Well, there has been a ton of work done by the math guys to show that these things we’re counting have a ton of value in terms of measuring repeatable skill, and in terms of predicting the rest of the season, or the playoffs, or next season. Better than almost anything else we can count, even goals! If you want to get into the details, there is a ton of information out there on Google.

But you can ignore that if you want – at the core, just remember that we’re counting good and bad hockey-related things, and unsurprisingly, good teams tend to have way better counts, because they’re simply better at that whole hockeying thing than are bad teams.

That’s it! You are now conversant with the big bad Corsi! Welcome to the dark side.

Breaking It Down to a Whole Nuvver Level

It’s a slightly more complicated picture, however, when we try to apply this concept to individuals.

We’re still counting good and bad things as they happen, but now we’re counting them in the context of the five players on the ice. At the end of the [game, series, season], we count up the good and bad things that happened while each player was on the ice, and Bob’s your uncle: player Corsi.
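To make the bookkeeping concrete, here is a rough sketch in Python. The event format (a "by_us" flag plus the list of our skaters on the ice) is my own assumption for illustration; real play-by-play data looks different and needs more cleanup.

```python
from collections import defaultdict


def player_corsi(events):
    """Return {player: on-ice shot attempt percent} from our team's viewpoint."""
    # Hypothetical event format:
    #   {"by_us": True or False, "on_ice": [our skaters on the ice for the attempt]}
    counts = defaultdict(lambda: [0, 0])  # player -> [attempts for, attempts against]
    for event in events:
        slot = 0 if event["by_us"] else 1
        for player in event["on_ice"]:
            counts[player][slot] += 1
    # A player only appears in counts after at least one event, so f + a >= 1.
    return {p: 100.0 * f / (f + a) for p, (f, a) in counts.items()}


# Made-up events, just to show the shape of the input.
sample = [
    {"by_us": True, "on_ice": ["A", "B"]},
    {"by_us": False, "on_ice": ["A", "C"]},
    {"by_us": True, "on_ice": ["A", "B"]},
    {"by_us": False, "on_ice": ["C"]},
]
print(player_corsi(sample))
```

Every attempt gets credited to (or blamed on) everyone on the ice at the time, which is exactly where the trickiness comes from.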

It’s one thing to calculate that number. But this ‘five players on the ice’ thing means it does get a little tricky when it comes to using the numbers to evaluate an individual player.

By tricky, I do mean tricky, not impossible. I emphasize this point because there are some folks out there, some even with a massive media platform, who dismiss these stats as exclusively team-level stats, not applicable to individual players.

Unfortunately, this is wrong, and all it demonstrates is a (sometimes profound) lack of knowledge about basic statistics. Modern statistical methods are all about looking at a large mixed dataset and teasing out individual effects.

As it happens, once you get a large volume of these good/bad counts for an individual player, that player will have played with such a large number of teammates rotating through that his number does indeed start to reflect his individual contribution.

It’s why this ‘team level’ stat is almost always different, in many cases radically so, between teammates.

And yet it’s still tricky. Why?

The Unbearable Difficulty of Context

What makes it tricky is that in order for a player’s Corsi number to make sense, to have it give us a believable gauge of how that individual is doing, we need to understand the context in which that number was generated.

At the individual level in hockey, the most important context is provided by:

teammates
zone starts
competition

They’re important because each of them drastically affects the count of good/bad things. Again, this is hockey, not stats. Who you play with, where you start, and who you face make a huge difference to your success.

WOWY, Lookit Those Teammates

Wait a second, sez you, didn’t you just tell me that the teammate issue sorts itself out when you have lots of data?

Well, mostly it does. Those teammates do rotate a lot, and do allow us to get a better picture of the individual. What confounds us is not the other four skaters out on the ice, though, it’s usually just one, maybe two.

Players tend to share most of their ice time with one other player in particular, far more heavily than with anyone else.

For defenders, this should be obvious – that other player is the D partner.

Less obvious is that when you look at time on ice breakdowns, forwards often show much the same pattern. With a few rare and exceptional lines (e.g. the Kane line in CHI), the third player on a line usually changes more often than the other two. Defense pairs rotate too. Maybe injury, maybe the coach’s blender, but forward pairs do tend to stand out.

Luckily, we have a tool that helps deal with this scenario. It’s called WOWY – with or without you.

The idea is simple – given two players, Frank and Peter (Pete for short), take a look at how things (our count of good and bad things) went with Frank and Pete on the ice together, then when Frank was on the ice without Pete, and then when Pete was on the ice without Frank.

Sometimes you have to dig a bit deeper, such as if Frank and Peter play with teammates of radically different skill levels when apart, like Frank gets Taylor Hall and Peter gets Taylor’s New Jersey roommate, Luke Gazdic.

But usually the two players tend to separate, and you see quality differences quite quickly.
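Here is a minimal sketch of the Frank and Pete comparison in Python. The event format and the function name are hypothetical, invented for this example, and a real WOWY table would usually show ice time alongside each bucket as well.

```python
def wowy(events, frank, pete):
    """Split on-ice shot attempt percent into together / Frank alone / Pete alone."""
    # Hypothetical event format:
    #   {"by_us": True or False, "on_ice": [our skaters on the ice for the attempt]}
    buckets = {"together": [0, 0], "frank_only": [0, 0], "pete_only": [0, 0]}
    for event in events:
        on_ice = set(event["on_ice"])
        if frank in on_ice and pete in on_ice:
            key = "together"
        elif frank in on_ice:
            key = "frank_only"
        elif pete in on_ice:
            key = "pete_only"
        else:
            continue  # neither player was out there, so the event tells us nothing
        buckets[key][0 if event["by_us"] else 1] += 1
    # None marks a bucket with no events at all (nothing to divide by).
    return {
        key: (100.0 * f / (f + a) if f + a else None)
        for key, (f, a) in buckets.items()
    }


# Made-up events, just to show the three buckets.
sample = [
    {"by_us": True, "on_ice": ["Frank", "Pete"]},
    {"by_us": False, "on_ice": ["Frank", "Pete"]},
    {"by_us": True, "on_ice": ["Frank"]},
    {"by_us": False, "on_ice": ["Pete"]},
]
print(wowy(sample, "Frank", "Pete"))
```

If Frank’s number holds up without Pete while Pete’s drops without Frank, that is your first hint about who is driving the pair. With only a handful of events in a bucket, though, it is just a hint.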

WOWY analysis is always useful when looking at players, and in my opinion, is mandatory when trying to assess defensemen.

Corsi, or the tale of good and bad hockey things. WOWY, the tale of Frank and Peter.

Remember those, and you are well on your way to being a fancystats expert. Well done!

I’m In Da Zone

OK, so we’ve got a handle on teammates.

The second aspect of context we talked about is how the coach uses a player – whether they’re starting in their own zone a lot, or gifted offensive zone time, or neither (or both).

Turns out, this doesn’t matter nearly as much as you’d think, for a few reasons:

Most players are not that buried – even “25% offensive zone starts”, which seems like a harsh number, often represents something on the order of 2 O and 6 D faceoffs during a game. Yes, it can add up, but in the end it’s still just four more zone starts in the d zone than in the o zone. Not that much in the context of 20 or more shifts per game.
Most shifts start on the fly, not with a faceoff. So a player’s ability drives defensive (or offensive) zone starts to a large extent, not the other way round. Put another way, good players tend to force faceoffs in the o zone, and bad players tend to get stuck in the d zone, and faceoffs (or goals against!) are part and parcel. So good or bad zone starts can be a symptom rather than a cause of good or poor numbers.
Faceoff wins are generally around 50% give or take a few points. Think of the four d zone starts from the first bullet point. Now remember that the two teams are going to split those somewhere between 45 and 55%. That’s basically two faceoffs that are a problem, in the context of 20 shifts. Zone start differences diminish rapidly when you start cutting them in half.
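The back-of-the-envelope math in those three points can be written out as a small Python sketch. The function and its names are mine, invented for illustration, and treating only the lost extra draws as the problem is the same simplification used above, not a formal zone start adjustment.

```python
def zone_start_burden(o_starts, d_starts, shifts, faceoff_win_rate=0.5):
    """Rough per-game picture of how much a zone start tilt really costs."""
    oz_pct = 100.0 * o_starts / (o_starts + d_starts)
    extra_d_starts = d_starts - o_starts  # surplus of d zone draws
    # Assumption: only the extra d zone draws that get lost are "a problem".
    problem_draws = extra_d_starts * (1.0 - faceoff_win_rate)
    return {
        "oz_start_pct": oz_pct,
        "extra_d_starts": extra_d_starts,
        "problem_draws": problem_draws,
        "pct_of_shifts": 100.0 * problem_draws / shifts,
    }


# The example from the text: 2 O and 6 D faceoffs, 20 shifts, a 50% split.
print(zone_start_burden(2, 6, 20))
```

Nudge faceoff_win_rate to 0.45 or 0.55 and the answer barely moves off two draws a game, which is the point.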

Rather than go farther on this topic, I’ll recommend you read this two-part article by Matt Cane:

In summary: there’s reason to believe that zone starts affect a player’s numbers less than you’d think; and when they do – we have an idea of how much, and can adjust for them.

Competition

So of our three critical contextual factors, we’ve talked about two of them: teammate effects (for which we have WOWY), and zone starts (which aren’t as strong as most think, and can be adjusted for in any case).

What about competition?

Well, now things get peachy … by which I mean juicy and somewhat hairy.

Watching games, you can see coaches scrapping to get the right players on the ice against the other team’s players. Checking line to shut down the big line? Or go power vs power? What about getting easy matchups for that second line? That’s the chess game in hockey, though some coaches are clearly playing checkers.

On-ice competition is a big deal, and a critical part of measuring players. A player with 50% good/bad things is doing great if he’s always facing Sidney Crosby, and incredibly poorly if he’s facing Lauri “Korpse” Korpikoski.

How do we get a handle on that?

We’ll talk in depth about competition and how we (fail to) measure it in Part 2 of this article.