All posts by G Money

Will Kris Russell help?

Hall’s gone. Yak’s gone. Devalued then shipped off.

But that’s water under the bridge.

There’s something else under the bridge that I’m much more worried about, even though he’s only 5′10″ and about 170 lbs.

You Complete Me

That something is Kris “The Holy Terrier” Russell. (That’s not his real nickname – I made it up. It was either that or “Blockhead”, and that one seemed unnecessarily mean.)

So … Kris Russell is the guy Chia has decided will ‘complete’ the Oilers defense.

Fan – freakin’ – tastic.

I already see people in my Twitter feed rationalizing the deal.

“Well, maybe he’s not so bad.”

“OK, I think he’s a decent third pairing guy.”

“He scores points though, right?”

“Blocks a lotta shots man!”

Is it true? How good or bad is Russell?

Russell! Somebody gonna get a seen real bad

I hadn’t previously done any detailed digging on the player, but my impressions have nonetheless not been positive.

A. I’ve watched him play for the Flames. I wasn’t paying particular attention to him, but I was never impressed. There – the seen ‘im bad folks should be satisfied!

B. The Flames (and it pains me to say this) have two outstanding defensemen in Giordano and Brodie. They have a third defenseman who is pretty good in Dougie Hamilton. Ignoring handedness, all three of those guys would play on the Oilers easily, and probably with a lot of Top 4 ice time.

Yet somehow defensively the Flames are one of the worst teams in the league, giving up the fourth worst shot attempts rate in the league at even strength. Yes, that’s worse than Edmonton.

How is it possible they have such good defensemen and still suck defensively? Well, some of that is coaching (as an Oiler fan, I’ll miss Bob Hartley terribly). But based on watching the team (I live in Calgary, and my company has – had – corporate tickets, so I’ve seen more Flames games than I care to admit), a lot of it is the bottom three: Russell, Engelland, and Wideman just aren’t good defensemen, and they drag the team down.

Russell in particular dragged Hamilton down – if you dig into the stats (sorry eye test only folks) they were 5v5 CF 45% together, while Russell was 45.6% apart from Hamilton, and Hamilton was at 52% away from Russell.
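(Quick aside for the stats folks: a WOWY split like that is simple arithmetic on 5v5 on-ice shot attempt counts, computed three ways – the pair together, and each player away from the other. A minimal sketch; the counts below are made up to land on the article’s percentages, the real ones come from on-ice event data.)

```python
def cf_pct(cf, ca):
    """Corsi For % = shot attempts for / total shot attempts while on ice."""
    return 100.0 * cf / (cf + ca)

# Hypothetical 5v5 on-ice shot attempt counts, split three ways:
together = cf_pct(450, 550)        # Russell with Hamilton: 45.0%
russell_apart = cf_pct(456, 544)   # Russell without Hamilton: 45.6%
hamilton_apart = cf_pct(520, 480)  # Hamilton without Russell: 52.0%
```

The telling comparison is the third number: when a player’s partner consistently improves away from him, the partner isn’t the problem.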

In other words, if you think Hamilton had a bad year in Calgary, you’re wrong – he had a bad partner in Calgary that made it look like he had a bad year. And that partner is now an Edmonton Oiler.

MERITORIOUS!

But you know what? I’ve had bad (or good) opinions about players before, and then I’ve started digging into their actual results and been forced to change my mind. Maybe that will happen here. Let’s take a look.

I mentioned Russell’s results with and without (WOWY) Dougie Hamilton. Instead of just looking at a single WOWY, let’s look at my favourite WOWY visualization, which is from Micah Blake McCurdy’s hockeyviz.com. This lets us look at the entire pattern of how a player impacts the other players on the ice with him. Here’s Kris Russell in Calgary (feel free to skip to my explanation below):

Ouch. This is ugly. See how the blue area is clustered well below the red line? That’s bad. It means Russell’s shot results overall are well below breakeven … but we knew that already.

What makes it so much more alarming is that the black squares are mostly to the left and below the blue squares. That means Russell makes most of his teammates worse when he’s on the ice with them. And the red squares are mostly above and to the right. That means most of his teammates are better – in some cases, much better – without Kris Russell.

That’s as ugly as it gets.

Quality of Competition – the Wood and the Money

Now let’s take a look at the metric that @Woodguy55 and I developed, the “WoodMoney” metric. It isolates performance against specific matchup tiers: how the player did when facing the 60 best players in the league, and also against the middle tier and the ‘gritensity’ players. (You can find the specific lists, and details on how WoodMoney is calculated, in this article.)
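Mechanically, a tiered competition metric like this bins every head-to-head slice of ice time by the opponent’s tier, then computes shot shares within each bin, rather than mashing everything into one number. A simplified sketch – the tier assignments and shift data here are hypothetical, and the real WoodMoney method is in the linked article:

```python
from collections import defaultdict

# Hypothetical tier lookup: opponent name -> competition tier
TIER = {"Crosby": "elite", "Kunitz": "middle", "Glass": "gritensity"}

def by_tier(shifts):
    """shifts: list of (opponent, toi_seconds, cf, ca) head-to-head slices.
    Returns {tier: (toi_seconds, CF%)} within each competition tier."""
    agg = defaultdict(lambda: [0, 0, 0])  # tier -> [toi, cf, ca]
    for opp, toi, cf, ca in shifts:
        t = agg[TIER[opp]]
        t[0] += toi
        t[1] += cf
        t[2] += ca
    return {tier: (toi, 100.0 * cf / (cf + ca))
            for tier, (toi, cf, ca) in agg.items()}

shifts = [("Crosby", 300, 4, 9), ("Kunitz", 200, 5, 6), ("Glass", 150, 6, 3)]
results = by_tier(shifts)  # e.g. caved vs elite, above water vs gritensity
```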

Here’s a visualization of the results, but you can ignore this and just go to my interpretation if you like:

So what is this telling us?

1 – The relatively even TOI splits tell us Russell was regularly used as a Top 4 defenseman.

2 – But his results were below breakeven against all levels of competition. This is a particularly terrible result, as actual Top 4 defensemen should have at least passable results against middle tier competition, and better than passable vs bottom tier competition.  His danger (shot distance and type) adjusted results were at least better than his raw shot results, that’s a small plus.

3 – His Rel results were below zero against all levels of competition. In other words, he did worse against all levels of comp than his teammates did against that same level of comp. This is an acceptable result for a bottom pair defenseman, but not for a second pair defenseman.

It’s possible to see bad results like this and have reasons for it – Adam Larsson for example doesn’t look great, but then you have to account for his zone start usage, which is brutal. This isn’t the case for Russell, who, if anything, got a pretty big push as far as offensive zone starts go.

So he’s generated these results while starting quite often in the offensive zone.

That’s bad.

These results say that Kris Russell is a third pair defenseman – but he’s not particularly good at it.

But … but … he scores!

Chia’s been looking for a powerplay quarterback. So, maybe KR is an ‘offensive’ defenseman that will help the powerplay!

Will he?

Russell’s career point scoring is 177 points in 573 games, or 0.31 pts per game. By comparison, Andrej Sekera is at 0.33 pts/gm in his career, and last year with the Oilers he was 0.37 pts/gm.

As for the powerplay, only four of KR’s 19 points last year came on the man advantage, though in fairness, he was not used on the powerplay much.

That may be understandable though – the Flames powerplay put up shots and dangerous shots (per corsica.hockey’s expected goals metric) faster with Russell not on it, with Hamilton and Giordano the undisputed champs of the powerplay shot rate.

Scoring upgrade? Not so much.

Penalty Kill

I have heard Russell is a decent penalty killer. I’ve also read a tidbit that suggested that the penalty kill is 3x to 5x as important as the powerplay, because you can lose the game with a poor penalty kill, but you can’t win it with a good powerplay. I’m not so sure I believe this, but I might buy the idea that the value is not entirely symmetric, that maybe the PK is a bit more important.

So can Russell help the PK?

Maybe.

His shot attempts against rate (92 per hour) is right in the middle of the pack of the main defenders, and his expected goals against rate (6.05) is also good. He’s certainly not bad at it … it’s hard to call him ‘good’ from just a couple of numbers like this, but … hey, sure, why not, chalk one up for the little guy!

Penalties

One topic I’ve added to my defenseman analysis the last few months is a look at penalty differentials (call it the Gryba Clause).

How’s Russell doing vis a vis the zebras?

Hallelujah! Our first big plus. Russell runs at a +0.38 penalties/60 rate. In other words, he draws more penalties than he takes. This is actually quite rare for a defenseman. So while his results are poor, at least his team gets a few extra powerplays as a result of his play.

That’s good.
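For reference, that penalty differential rate is just net penalties scaled to an hourly rate. A quick sketch, with hypothetical season totals chosen to land near Russell’s +0.38:

```python
def pen_diff_per60(drawn, taken, toi_minutes):
    """Net penalties (drawn minus taken) per 60 minutes of ice time.
    Positive means the player draws more than he takes."""
    return 60.0 * (drawn - taken) / toi_minutes

# Hypothetical totals: 18 drawn, 10 taken over 1250 minutes -> +0.384/60
rate = pen_diff_per60(18, 10, 1250)
```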

Conclusion

Here’s Kris Russell in a nutshell:

  • Shots wise, he gets snowed under by all levels of competition. Even the third pairing doesn’t look like it will be sheltered enough for him to be able to hold his own.
  • He drags most of his teammates down with him
  • Russell is not a true offensive defenseman. He scores on par with, or maybe a little less than, Andrej Sekera. But Sekera (by eye and by stat) is much better defensively
  • Despite the Oilers’ dire need for a powerplay QB, Russell probably won’t help the powerplay
  • Although he should help the penalty kill
  • He might be able to draw more penalties than he takes

Is he worth $3M?

Nope.  He’s not even the third best lefty at this point.  I’d take Klefbom, Sekera, and Davidson over him in a heartbeat, in almost any situation.

And since he adds to this crowd on the left side, if he forces one of our good LHD to the right side (where we know they were less effective last year), then his negative impact won’t just be on his own pairing – it will negatively affect other pairings too.

And if he’s the one moved to the right side, and he proves even less effective there than he has been on the left, well, that’s frankly terrifying.

Unless Chia and TMc plan to dress seven defensemen every game, and only have Russell out on the penalty kill … well, this isn’t good. I wish I had something positive to say, I really do. Contrary to popular opinion, I may regularly boil over with scathing criticism of Oiler management, but I want my team to win.

But digging into his results has not made me feel better about this signing … worse if anything.  I hear Russell is gritty, truculent, good in the room, and deliberately blocks shots with his balls.  I’ll cheer for Russell and the Oilers to succeed. But I do not believe this signing will help them succeed. This isn’t good at all.

An offhanded look at wingers and hands

The idea of defense handedness is now well established I think – we recognize that ‘off hand’ defensemen often pay a penalty in terms of effectiveness.


Since some wingers also play on their off hand (a right shot left wing, or a left shot right wing), a few weeks ago I was mulling the idea of winger handedness.

Conventional wisdom suggests that doing this allows for:

  • Greater effectiveness in the offensive zone (with the shot having a better angle to the net), but
  • Less effectiveness in the defensive zone, where having the stick away from the boards makes for tougher defending and zone exits.

Dallas Eakins even experimented with putting Nail Yakupov (left shot right wing) on the left wing to try to manage his defensive woes (though this just seemed to confuse the young lad more).

I began to wonder if anyone has looked more broadly to see whether off-hand wingers are demonstrably more or less effective than on-hand wingers.  I put the idea out on Twitter, asking if anyone knew of work that had been done in the area.

It sparked an interesting and widespread debate, but it appears that it’s a relatively understudied issue.  This work by Arik Parnass (just hired by COL) vis a vis the power play was interesting.

@behindthenet’s brief look at overall handedness found some interesting anomalies.

But other than that, I found little or nothing specific to wingers.

So … why not do some initial digging into the idea?  Start by comparing wingers on their on hands vs wingers on their off hands, and see if there is a meaningful result as far as differences in points or shots or offense/defense.

Data

To pull this data, I used as my starting point the NHL ‘statsapi’ JSON live feed data.

This is an unusual source of data in that most fanalysts scrape the NHL roster sheets for player data. However, I have found the NHL roster sheets problematic for identifying positions (for example, Jordan Eberle has been listed as a C since he entered the league), while the JSON data – at least by visual scan – appears to be more accurate. Plus it very conveniently embeds the required data on player handedness.

So I used that.

The rest of the data for the players (boxcars and shot metrics) are scraped from the more conventional NHL game sheet data sources. All data used is for the 2015-2016 season. Any errors therein are my own, unless in the NHL data.
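For anyone wanting to replicate the position/handedness pull, here’s a minimal sketch of classifying a winger from a statsapi-style player record. The field names (`primaryPosition.code`, `shootsCatches`) reflect my reading of the JSON feed – treat them as assumptions and check against your own pull:

```python
def classify_winger(player):
    """Classify a winger from statsapi-style fields (names assumed).
    Expects e.g. {"primaryPosition": {"code": "L"}, "shootsCatches": "R"}.
    Returns 'LW/LS', 'LW/RS', 'RW/LS', 'RW/RS', or None for non-wingers."""
    pos = player["primaryPosition"]["code"]  # 'L', 'R', 'C', 'D', 'G'
    if pos not in ("L", "R"):
        return None
    shot = player["shootsCatches"]           # 'L' or 'R'
    return f"{pos}W/{shot}S"

def is_off_hand(category):
    """Off hand = shot side differs from wing side (LW/RS or RW/LS)."""
    return category in ("LW/RS", "RW/LS")
```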

Process

As a starting point, I decided I would look at just a handful of key data points for comparison:

  • points per game at even strength (EVP/Gm)
  • goals per game at even strength (EVG/Gm)
  • even strength shot attempts, percentage (CF%) as well as for and against rates per 60 (CF/60, CA/60) so as to be able to separate defensive and offensive effectiveness
  • My own “Dangerous Fenwick” statistic, a distance and shot type weighted danger metric.  Again, percentage and for and against rates.

I did not filter the wingers for TOI or games played – if a player appeared on any game roster in 2015-2016 listed as right or left wing, that player was included in my data set. Since we’re measuring the effectiveness of the group as a whole, it makes sense to include the ‘cup of coffee’ and bottom of roster types.

The wingers were separated into four groups: Left Wing/Left Shot, Left Wing/Right Shot, Right Wing/Right Shot, and Right Wing/Left Shot.  I then pulled demographic data, specifically country of origin and primary team in 2015, for each group.

Country of origin became of interest when I noted that the two off-hand wingers on the Oilers (Nail Yakupov and Anton Slepyshev*) are both Russian. In the same way that there is a distinct American bias in right handed defensemen, I wondered if there is a geographic bias in the development of off hand wingers.

*”Slappy” is listed as LW by the NHL, but by recollection the Oilers used him as a right winger at times. Not sure which is correct. This is another reminder that no dataset is ever 100% accurate.  We rely on data volume to account for such natural variability.

Raw Counts

Left Wing / Left Shot 155
Right Wing / Right Shot 113
Left Wing / Right Shot 19
Right Wing / Left Shot 46

On hand wingers are clearly the most common situation. Left wingers outnumber right wingers, just as LHD outnumber RHD.

The most common role for off hand wingers is a left shot right wing.

Team by Team

Counts for the categories of wingers by team are as follows (maximums are highlighted):

Team LW/LS LW/RS RW/LS RW/RS On Hand Off Hand
ANA 5 0 0 6 11 0
ARI 7 1 2 3 10 3
BOS 2 0 1 4 6 1
BUF 7 1 0 3 10 1
CAR 8 0 0 1 9 0
CBJ 7 0 1 5 12 1
CGY 7 0 2 2 9 2
CHI 5 1 2 4 9 3
COL 5 1 2 3 8 3
DAL 4 1 1 3 7 2
DET 6 1 4 0 6 5
EDM 7 1 1 5 12 2
FLA 3 2 2 1 4 4
LAK 6 0 1 1 7 1
MIN 5 1 1 4 9 2
MTL 7 1 1 5 12 2
NJD 8 0 1 5 13 1
NSH 3 3 3 1 4 6
NYI 6 1 1 4 10 2
NYR 5 0 2 2 7 2
OTT 7 0 1 6 13 1
PHI 3 0 1 3 6 1
PIT 6 1 2 6 12 3
SJS 3 0 1 3 6 1
STL 3 0 5 3 6 5
TBL 4 0 2 4 8 2
TOR 4 2 3 6 10 5
VAN 6 0 2 7 13 2
WPG 3 0 1 8 11 1
WSH 3 1 0 5 8 1

Though the numbers vary across the league, few teams look overly unusual in usage.  St. Louis (unusually high off hand RW) and Carolina (just one natural RW) both stand out to me.

Country of Origin

The following chart shows country of origin for each of the four categories of wingers:

Any country that did not have at least 10% representation in any one category was lumped together into Team Europe. If it’s good enough for the World Cup, it’s good enough for me!

Canada produces disproportionately fewer left shot right wingers, while Russia and Sweden produce disproportionately more off hand wingers.

There is a clear geographic bias to off handedness in wingers.

Effectiveness

The performance of the different categories of wingers is summarized in the following table.

Category EVP/Gm EVG/Gm CF% CF/60 CA/60 DFF% DFF/60 DFA/60
LW/LS 0.33 0.14 49.1% 53.8 55.9 48.9% 38 39.8
RW/RS 0.32 0.14 49.5% 54.6 55.7 49.5% 38.2 39
LW/RS 0.37 0.18 50.6% 56.5 55.1 51.0% 39.1 37.5
RW/LS 0.38 0.16 50.6% 55.9 54.6 50.3% 39.4 38.9

* Technical note: The shot metrics are grouped, i.e. the raw for and against counts are summed for each group, then divided by the total (or by summed EVTOI) to produce the percentage and the rates.
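In code, the grouping in the technical note looks like this (a sketch; TOI in minutes): sum the raw counts across the group first, then compute the percentage and rates, rather than averaging each player’s individual numbers.

```python
def grouped_metrics(players):
    """players: list of (cf, ca, ev_toi_minutes) tuples, one per player.
    Sums raw counts across the group before computing CF% and per-60
    rates, so heavy-minute players carry more weight than a simple
    average of individual percentages would give them."""
    cf = sum(p[0] for p in players)
    ca = sum(p[1] for p in players)
    toi = sum(p[2] for p in players)
    return {"CF%": 100.0 * cf / (cf + ca),
            "CF/60": 60.0 * cf / toi,
            "CA/60": 60.0 * ca / toi}
```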

Notice something interesting: off hand wingers are producing slightly but distinctly better results than their on hand counterparts in every category. The off hand wingers score more points, more goals, and have better Corsi and Danger metrics.

This is the reverse of what is observed with defensemen, where being on the off hand typically carries a penalty.  With wingers, it appears to confer an advantage.

Furthermore, the improvement is not solely because of greater offensive impact (as you’d expect) – rather, both the for and against shot rates are slightly better.

This is not entirely expected.

Statistical Validity

Of course, the differences between each of these two groups, while distinct, may not be statistically valid given the inherent variance within the two groups.

To test this, I used a Welch’s t test (for independent samples with different variances) to compare the on hand and off hand groups, separately for left and right wingers. I used CF% and points/game as the comparison statistics for the test.

Note that the underlying data set for this test treats each player’s seasonal results for CF% and points/game as one data point (different from the summary table above).
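For anyone wanting to replicate the test: the Welch t statistic and its (Welch–Satterthwaite) degrees of freedom are easy to compute by hand, and the p-value then comes from the t distribution with that df (e.g. via scipy.stats). A stdlib-only sketch:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and degrees of freedom for two independent
    samples with (possibly) unequal variances."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)  # sample variances (n-1 denominator)
    se2 = va / na + vb / nb            # squared std error of the mean difference
    t = (mean(a) - mean(b)) / se2 ** 0.5
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df
```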

The results are somewhat counterintuitive:

Wing Metric t p Note
Right CF% 11.87 2.48E-16 Highly significant difference between on and off hand RW
Left CF% -1.94 0.07 Significant at 10% level, not at 5% level, for off hand LW
Right Pts/Gm 0.89 0.38 Not significant
Left Pts/Gm -1.12 0.28 Not significant

The only statistic that was significant at 95% confidence was CF% for right wingers. Off hand right wingers produce a higher CF%, and the result is statistically highly significant.

The CF% difference for off vs on hand left wingers was significant at the 90% confidence level, but not the 95% level.

That the RW difference is significant while the LW difference is not, despite the two differences being similar in magnitude, is likely due to sample size – there are only 19 off hand left wingers.

If I had applied a TOI filter, I suspect it would have reduced the variance observed in the data and may have affected significance as well. (Next time.)

The difference in points/game between on and off hand for both left and right wingers was not statistically significant.

Conclusion

  • At least one of the statistics shows a statistically significant difference between on hand and off hand wingers
  • In general, off hand wingers had overall numbers that showed them to be more effective than on hand wingers, even from a shot suppression point of view. This is surprising.
  • There is a distinct geographic bias in off hand winger country of origin, with Russia and Sweden being unusually common sources, while Canada is distinctly less proportionately likely to produce left shot right wingers.
  • It remains unclear, but it’s possible that a part of the reason for the better off-hand numbers may boil down to a handful of superstar players. Ovechkin is one. And one of the highest scoring lines in the league features two off-handers (Patrick Kane and geriatric Calder winner Panarin), which is itself a point of interest.

This is a 40,000 ft view of the topic, but after this initial look, I would conclude that a deeper study on the topic is definitely warranted.

Particularly in identifying whether the apparent effectiveness of off hand wingers is a broad effect, or a narrow one confined to a handful of top players. Or is it selection bias, where only the best off hand wingers get played on the ‘wrong’ side in the first place?  And if it is a broad effect, why does the difference manifest in both offensive and defensive zone shot metrics, and not just on the offensive side?

And from there, understanding ultimately whether the effect has any tactical or roster implications for NHL teams.

Addendum – List of Off Hand Wingers

Right Wing / Left Shot Left Wing / Right Shot
Kevin Hayes Anton Slepyshev
Nikita Soshnikov Austin Watson
Tom Kuhnhackl Josh Leivo
Tomas Jurco Taylor Beck
Emerson Etem Viktor Arvidsson
Nikolaj Ehlers Thomas Vanek
Mikko Rantanen Joffrey Lupul
Alexandre Burrows Blake Comeau
Sven Andrighetto Artemi Panarin
Dennis Everberg Craig Cunningham
Jaromir Jagr Evan Rodrigues
Loui Eriksson Shawn Thornton
Michael Frolik David Perron
Jiri Hudler Filip Forsberg
Josh Bailey Christian Thomas
Marian Gaborik Alex Ovechkin
Rene Bourque Teemu Pulkkinen
Tobias Lindberg John McFarland
Martin Havlat Patrick Sharp
Miikka Salomaki
Max McCormick
Michael Grabner
Vladimir Tarasenko
Gustav Nyquist
Joel Vermin
Brad Richardson
Barclay Goodrow
Pascal Dupuis
Valeri Nichushkin
Mats Zuccarello
Gabriel Bourque
Dmitrij Jaskin
Marian Hossa
Jordan Caron
Johan Franzen
Reilly Smith
Nail Yakupov
Jakub Voracek
Scottie Upshall
Tobias Rieder
Patrick Kane
Nikita Kucherov
Brian O’Neill
Anthony Mantha
James Neal
Nino Niederreiter

Is Antoine Vermette a fit for the Oilers?

Antoine Vermette

This is Antoine Vermette.

Vermette is a 34 year old centre who was just bought out by the Arizona Coyotes, which makes him an unrestricted free agent. Could he be a fit for the Oilers’ bottom 6?

To assess the player, let’s look at the results in four different areas of his game:

  • Boxcars – how much did he score?
  • Shot metrics WOWY – who did he play with, and how did he do?
  • Shot metrics vs competition – who (if anyone) did he win the shot battle against?
  • Faceoffs and zone starts – how did his coach use him, and how well did he do? Did his usage potentially impact his results?

Boxcars*

Vermette scored 17 goals with 21 assists in 76 games, playing between 16 and 17 minutes a night. His scoring rate of 0.5 pts/g is right in line with his career average of 0.512, which is damn impressive for a 34 year old! Lowetide has a more in depth look at Vermette’s scoring here.

Looking solely at scoring rate, Vermette would have been eighth among regulars on the Oilers – certainly on the surface a decent pickup for the bottom 6.

But let’s look below the surface.

*Boxcar data from hockey-reference.com

WOWY*

Vermette’s most common forward partner was Mikkel Boedker. Their numbers together (45.9%) are roughly in line with their numbers apart, with an edge to Vermette (46.9% vs 45.6%).

They were not good together – but they were not good apart either.

Vermette’s most common defensive partner was Oliver Ekman-Larsson. As OEL is the unquestioned #1D on the Coyotes, this surprised me somewhat.

What didn’t surprise me were the results. Together, they were a poor 46.5%. Vermette away from OEL remained a poor 46.2%, while OEL away from Vermette was a solid (especially for Arizona) 51.3%.

Vermette was a major anchor on OEL.

More broadly, if you look at this WOWY visualization (from hockeyviz.com):

I’d conclude that (shots wise) Vermette for the most part is at best neutral and more commonly a negative influence on his teammates.

*WOWY data from stats.hockeyanalysis.com

Shot Metrics and Competition

For this analysis, I’m using the WoodMoney metrics (read the background on this metric here). Here’s the WoodMoney visualization for Vermette, the “Vizmette” as it were:

(click to embiggen)

Bear in mind that at the moment, using WoodMoney for analyzing forwards should be done cautiously, as we’re only looking at F vs F matchups, while F vs D matchups (for forwards) likely matter for qualcomp at least as much, and almost certainly more.

That said, head to head forward matchups do have relevance, so there is valuable information buried in this group of charts. Here’s my read:

1 – Vermette’s TOI vs the competition bands is relatively balanced, suggesting a second/third line(ish) utility forward used up and down the lineup. His TOI and zone starts (see next section) would appear to confirm this, as he’s pretty much 50% in both total and true zone starts.

2 – By raw shots (Corsi, CF on the chart), he gets caved by all comp except the lowest tier labeled ‘Gritensity’ where he is slightly above breakeven.

3 – By Dangerous Fenwick (DFF, my own danger weighted shot metric statistic, details here) he gets caved by all comp except the lowest Gritensity tier, where he almost breaks even.

4 – His “rels” – that is to say, how he does relative to his teammates against the various competition bands, as shown on the last chart in the diagram – are below zero for all comp except vs Gritensity.

So … he’s playing on a pretty poor possession team (surprise), but at least you can say that possession wise he is still above average as a fourth liner on that team. He’s not being used as a fourth liner, but if he was, he’d be OK at it.

Faceoffs and Zone Starts*

Vermette’s faceoff % was 55.8%, a stellar number (and in line with his career). This is clearly a strength and probably helped him generate the numbers he did.

His offensive zone start is ~50%, slightly below his true zone start. So he was used very neutrally by the coach, confirming his role as something of a utility player.

*Faceoff percentage from hockey-reference.com. Zone start and true zone start data scraped from NHL play by play data.

Conclusion

Vermette’s poor possession results, especially against higher competition, clash with his decent boxcars.

Putting these together suggests that he scored by giving up more than he got. You can see that in the wide gap between his ‘for’ and ‘against’ lines in both the raw metrics (CF) and especially the danger weighted metrics (DFF).

I suspect that his good boxcars are at least partially attributable to his faceoff prowess, which remains stellar.

Now this analysis is purely numerical and results-based. Perhaps a detailed video-based scouting project would shed light that might show he’s better (or worse) than what the numbers imply.

All in all?

I’d say stay away, regardless of price. He’s probably going to score more than Letestu (at least next year), but in terms of overall play, he gives up a lot more than he gets, even though he spends a fair bit of time with OEL.

In case you haven’t noticed, the Oilers do not have an OEL.

So Vermette is not likely (much of) an upgrade on Letestu. And at age 34, he’s about to hit the steep part of a downhill slope. These results suggest that his buyout is not that surprising.

The Oilers need an upgrade in the bottom 6.

Vermette isn’t the droid we’re looking for.

Hockey Desperately Needs a Better Competition Metric (Part 2 of 2)


This article is part 2 of 2.

In part 1, I noted that using shot metrics for evaluating individual players is heavily influenced by teammates, coaches usage (zone starts), and competition*.

I believe we have decent tools for understanding the effect of teammates and zone starts – but I believe this is not at all true for competition metrics (dubbed QoC, or Quality of Competition).

And the reality is that understanding competition is critical to using shot metrics for player evaluation. If current QoC measures are not good, this means QoC is a huge weakness in the use of shot metrics for player evaluation.

I believe this is the case.

Let’s see if I can make a convincing case for you!

*Truthfully, there are quite a few other contextual factors, like team, and score state. These shot metrics have been around for a decade plus, and they’ve been studied (and are now often adjusted) heavily. Some of the effects that have been identified can be quite subtle and counterintuitive. From the point of view of assessing *a* player on *a* team, it doesn’t hurt us to focus on these three factors.

It Just Doesn’t Matter – You’re Kidding, Right?

If you bring up Quality of Competition with many fancystats people, they’ll often look at you and flat out tell you that “quality of competition doesn’t matter.”

This response will surprise many – and frankly, it should.

We know competition matters.

We know that a player is going to have a way harder time facing Sidney Crosby than facing Tanner Glass.

We know that coaches gameplan to face Taylor Hall, not his roommate Luke Gazdic (so long, lads). And they gameplan primarily with player matchups.

Are our eyes and the coaches that far out to lunch?

Yes, say the fancystats. Because, they say, when you calculate quality of competition, you just don’t see that much difference in the level of competition faced by different players. Therefore, so conventional wisdom dictates, it doesn’t matter.

The Numbers Suggest Matchups Matter

I don’t have to rely on just the eye test to contradict this line of thought – the numbers do the work too. For example, here are the head to head matchup numbers (I trot these out as a textbook example of coaching matchups) for the three Montreal defense pairs against Edmonton from the game on February 7th, 2016:

vs Hall McDavid
Subban-Markov ~3 mins ~10 mins
Petry-Emelin ~8 mins ~5 mins
Gilbert-Barberio ~40 seconds ~14 seconds

Does that look like “Quality of Competition” doesn’t matter? It sure mattered for both Hall and McDavid, not to mention all three Montreal defense pairs. Fifteen minutes vs 14 seconds is not a coincidence. That was gameplanned.

So how do we reconcile this?

Let’s dig in and see why maybe conventional wisdom is just plain wrong – maybe the problem is not with the quality of competition but the way in which we measure it.

It Would Hit You Like Peter Gabriel’s Sledgehammer

I’ll start by showing you an extremely valuable tool for assessing players in the context of zone starts and QoC, which is Rob Vollman’s Player Usage Charts, often called sledgehammer charts.

This chart is for Oiler defensemen in 2015-2016:

This shows three of the four things we’ve talked about previously:

  • The bubble colour (blue good) shows the shot metrics balance of good/bad for that individual
  • The farther to the right the bubble, the more faceoffs a player was on the ice for in the offensive zone – favourable zone starts or coaches usage in other words
  • The higher the bubble, the tougher the Quality of Competition

Notice something about the QoC though. See how it has such a narrow range? The weakest guy on there is Clendening at -0.6. The toughest is Klefbom at a shade over 1.0.

If you’re not familiar with “CorsiRel” (I’ll explain later), take my word for it: that’s not a very meaningful range. If you told me Player A has a CorsiRel of 1.0, and another has a CorsiRel of 0.0, I wouldn’t ascribe a lot of value to that difference. Yet that range easily encompasses 8 of the 11 defenders on the chart.

So no wonder the fancystatters say QoC doesn’t matter. The entire range we see, for a full season for an entire defensive corps from best to worst, is a very small difference. Clendening basically faced barely weaker competition than did Klefbom.

Or did he?  That doesn’t sound right, does it?  Yeah, the Oiler D was a tire fire and injuries played havoc – but Todd McLellan wasn’t sending Clendening out to face Joe Thornton if he could help it.

To figure out what might be wrong, let’s dig in to see how we come up with these numbers that show such a thin margin of difference.

Time Weighs On Me

The process for calculating a QoC metric starts by assigning every player in the league a value that reflects how tough they are as competition.

Then when we need the QoC level faced by a particular player:

  • we look at all the players he faced, multiply (weight) the amount of time spent against that player with the competition value of that player
  • we add it all up, and presto, you have a QoC measure for the given player
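Those two steps amount to a TOI-weighted average. A sketch – the competition values below are placeholders for whatever per-player measure you plug in, which is exactly where the trouble starts:

```python
def qoc(matchups):
    """matchups: list of (opponent_value, toi_seconds) pairs for one player.
    Returns the TOI-weighted average competition value faced."""
    total_toi = sum(toi for _, toi in matchups)
    return sum(value * toi for value, toi in matchups) / total_toi

# Hypothetical: heavy minutes vs a strong pair, a trickle vs a weak one
faced = [(2.0, 600), (0.0, 300), (-1.0, 100)]
qoc_value = qoc(faced)  # pulled toward the opponents he actually saw most
```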

Assuming that the time on ice calculations are reasonably fixed by, you know, time on ice, it should be clear that the validity of this QoC metric is almost entirely dependent on the validity of the ‘competition value’ assigned to each player.

If that competition value isn’t good, then you have a GIGO (garbage in garbage out) situation, and your QoC metric isn’t going to work either.

There are three different data values that are commonly used for calculating a QoC metric, so let’s take a look at each one and see if it meets the test of validity.

Using Corsi for QoC

Many fancystats people who feel that QoC doesn’t matter will point to this post by Eric Tulsky to justify their reasoning.

Tulsky (now employed by the Hurricanes) is very, very smart, and one of the pillars of the hockey fancystats movement. He’s as important and influential as Vic Ferrari (Tim Barnes), JLikens (Tore Purdy), Gabe Desjardins, and mc79hockey (Tyler Dellow). So when he speaks – we listen.

The money quote in his piece is this:

Everyone faces opponents with both good and bad shot differential, and the differences in time spent against various strength opponents by these metrics are minimal.

Yet all that said – I think Tulsky’s conclusions in that post on QoC are wrong. I would assert that the problem he encounters, and the reason he gets the poor results that he does, is that he uses a player’s raw Corsi (shot differential) as the sole ‘competition value’ measure.

All his metric tells you is how a player did against other players with varying good and bad shot differentials. The leap of faith is that shot differential reflects the quality of the players faced – and that leap is unjustified, because players of much, much different ability can have the same raw Corsi score.

To test that, we can rank all the players last season by raw Corsi, and here’s a few of the problems we immediately see:

  • Patrice Cormier (played two games for WPG) is the toughest competition in the league
  • He’s joined in the Top 10 by E Rodrigues, Sgarbossa, J Welsh, Dowd, Poirier, Brown, Tangradi, Witkowski, and Forbort.
  • Mark Arcobello is in the top 20, approximately 25 spots ahead of Joe Thornton
  • Anze Kopitar just signed for $10MM/yr while everyone nodded their head in agreement, while Cody Hodgson might have to look for work in Europe to the same nods of agreement. Yet using raw Corsi as the measure, they are the same level of competition (57.5%)
  • Chris Kunitz is about 55th on the list – approximately 40 spots ahead of Sidney Crosby
  • Don’t feel bad, Sid – at least you’re miles ahead of Kessel, Jamie Benn, and Nikita Nikitin – who is himself several spots above Brent Burns and Alex Ovechkin.

*Note: all data sourced from the outstanding site corsica.hockey. Pull up the league’s players, sort them using the factors above for the 2015-2016 season, and you should be able to recreate everything I’m describing above.

I could go on, but you get the picture, right? The busts I’ve listed are not rare. They’re all over the place.

Now, why might we be seeing these really strange results?

  • Sample size!  Poor players play little, and that means their shot metrics can jump all over the place.  Play two minutes, have your line get two shots and give up one shot, and raw Corsi will anoint you one of the toughest players in the league. We can account for this when looking at the data, but computationally it can wreak havoc if unaccounted for.
  • Even with large sample sizes, you can get very minimal difference in shot differential between very different players because of coaches matching lines and playing “like vs like”. The best players tend to play against the best players and their Corsi is limited due to playing against the best. Similarly, mediocre players tend to play against mediocre players and their Corsi is inflated accordingly. It’s part of the problem we’re trying to solve!
  • For that same reason, raw Corsi tends to overinflate the value of 3rd pairing Dmen, because they so often are playing against stick-optional players who are Corsi black holes.
  • The raw Corsi number is heavily influenced by the quality of the team around a player.
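To make the sample size point from the first bullet concrete, here’s a quick sketch with made-up numbers showing how a two-minute cameo produces an elite-looking Corsi:

```python
# Why tiny samples wreck raw Corsi: a single two-minute cameo can
# produce an elite-looking CF%. All numbers here are made up.

def cf_pct(cf, ca):
    """Corsi-for percentage: share of shot attempts taken by your side."""
    return 100.0 * cf / (cf + ca)

# Two minutes of ice time, two attempts for, one against:
print(cf_pct(2, 1))       # 66.7% - 'toughest competition in the league'!

# A genuine elite player over a full season:
print(cf_pct(1050, 950))  # 52.5% - far less flashy, far more meaningful
```

Sorting the league by that raw number treats both of those players identically, which is exactly how Patrice Cormier ends up atop the list.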

Corsi is a highly valuable statistic, particularly as a counterpoint to more traditional measures like boxcars. But as a standalone measure for gauging the value of a player, it is deeply flawed. Any statistic that uses raw Corsi as its only measure of quality is going to fail. GIGO, remember?

Knowing what we know – is it a surprise that Tulsky got the results he got?

So we should go ahead and rule out using raw Corsi as a useful basis for QoC.

Using Relative Corsi for QoC

If you aren’t familiar with RelCorsi, it’s pretty simple: instead of using a raw number, for each player we just take the number ‘relative’ to the team’s number.

For example, a player with a raw Corsi of 52 but on a team that is at 54 will get a -2, while a player with a raw Corsi of 48 will get a +2 if his team is at 46.

The idea here is good players on bad teams tend to get hammered on Corsi, while bad players on good teams tend to get a boost. So we cover that off by looking at how good a player is relative to their team.
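As a quick sketch, following the simple definition above (some sites compute it as on-ice minus off-ice instead, which is similar in spirit):

```python
# RelCorsi as described above: player's on-ice CF% minus the team's CF%.
# Some sites use on-ice minus off-ice instead; same basic idea.

def rel_corsi(player_cf_pct, team_cf_pct):
    return player_cf_pct - team_cf_pct

print(rel_corsi(52.0, 54.0))  # -2.0: below his (good) team
print(rel_corsi(48.0, 46.0))  # +2.0: above his (bad) team
```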

Using RelCor as the basis for a QoC metric does in general appear to produce better results. When you look at a list of players using RelCor to sort them, the cream seems to be more likely to rise to the top.

Still, if you pull up a table of players sorted by RelCor (the Vollman sledgehammer I posted earlier uses this metric as its base for QoC), again you very quickly start to see the issues:

  • Our top 10 is once again a murderers’ row of Vitale, Sgarbossa, Corey “Power Play” Potter, Rodrigues, Brown, Tangradi, Poirier, Cormier, Welsh, and Strachan.
  • Of all the players with regular ice time, officially your toughest competition is Nino Niederreiter.  Nino?  No no!
  • Top defenders Karlsson and Hedman are right up there, but they are followed closely by R Pulock and D Pouliot, well ahead of say OEL and Doughty.
  • Poor Sid, he can’t even crack the Top 100 this time.

Again, if we try and deconstruct why we get these wonky results, it suggests two significant flaws:

  • Coach’s deployment. Who a player plays and when they play is a major driver of RelCor. You can see this once again with 3rd pairing D men, whose RelCor, like their raw Corsi, is often inflated.
  • The depth of the team. Good players on deep teams tend to have weaker RelCors than those on bad teams (the opposite of the raw Corsi effect). This is why Nicklas Backstrom (+1.97) and Sam Gagner (+1.95) can have very similar RelCor numbers while being vastly different to play against.

RelCor is a very valuable metric in the right context, but suffers terribly as a standalone metric for gauging the value of a player.

Like raw Corsi, despite its widespread use we should rule out relative Corsi as a useful standalone basis for QoC.

Using 5v5 TOI for QoC

This is probably the most widely used (and arguably best) tool for delineating QoC. This was also pioneered by the venerable Eric Tulsky.

When we sort a list of players using the aggregated TOI per game of their “average” opponent, the cream tends to rise to the top even more so than with RelCor.

And analyzing the data under the hood used to generate this QoC, our top three “toughest competition” players are now Ryan Suter, Erik Karlsson, and Drew Doughty. Sounding good, right?

But like with the two Corsi measures, if you look at the ratings using this measure, you can still see problematic results all over, with clearly poor players ranked ahead of good players quite often. For example:

  • The top of the list is all defensemen.
  • Our best forward is Evander Kane, at #105. Next up are Patrick Kane (123rd), John Tavares (134th), and Taylor Hall (144th). All top notch players, but the ranking is problematic to say the least. Especially when you see Roman Polak at 124th.
  • Even among defensemen, is Subban really on par with Michael del Zotto? Is Jordan Oesterle the same as OEL? Is Kris Russell so much better than Giordano, Vlasic, and Muzzin?
  • Poor old Crosby is still not in the Top 100, although he finally is when you look at just forwards.
  • Nuge is finally living up to his potential, though, ahead of Duchene and Stamkos!

OK, I’ll stop there. You get my point. This isn’t the occasional cherry picked bust, you can see odd results like this all over.

Looking at the reasons for these busts, you see at least two clear reasons:

  • Poor defensemen generally get as much or more time on ice than do very good forwards. Putting all players regardless of position on the same TOI scale simply doesn’t work. (Just imagine if we included goaltenders in this list – even the worst goalies would of course skyrocket to the top of the list).
  • Depth of roster has a significant effect as well. Poor players on bad teams get lots of ice time – it’s a big part of what makes them bad teams after all. Coaches also have favourites, or assign minutes to players for reasons other than hockeying (e.g. Justin Schultz and the Oilers is arguably a good example of both weak roster depth and coach’s favoritism).

So once again, we find ourselves concluding that the underlying measure to this QoC, TOI, tells you a lot about a player, but there are very real concerns in using it as a standalone measure.

Another problem shows up when we actually try to use this measure in the context of QoC: competition blending.

As a player moves up and down the roster (due to injuries or coach’s preference), their QoC changes. At the end of the year we are left with one number to evaluate their QoC, but if this roster shuttling has happened, that single number is a poor representation of who they actually played.
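Here’s a toy illustration of the blending problem, with invented numbers – a defender promoted mid-season gets a season-long QoC that describes neither half of his year:

```python
# Toy illustration of competition blending. A defender spends half the
# season on the third pair and half on the top pair; the season-long
# QoC describes neither stretch. All numbers invented.

first_half  = [(400.0, -0.8)]   # (minutes, avg competition value): soft matchups
second_half = [(400.0,  1.2)]   # promoted: tough matchups

def blended_qoc(segments):
    total = sum(minutes for minutes, _ in segments)
    return sum(minutes * q for minutes, q in segments) / total

print(blended_qoc(first_half))                 # soft half on its own
print(blended_qoc(second_half))                # tough half on its own
print(blended_qoc(first_half + second_half))   # season-long: looks unremarkable
```

The season-long figure lands in the mushy middle, telling us nothing about either stretch of actual usage.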

A good example of the blending problem is Mark Fayne during this past year.  When you look at his overall TOIQoC, he is either 1 or 2 on the Oilers, denoting that he had the toughest matchups.

His overall CF% was also 49.4%, so a reasonable conclusion was that “he held his own against the best”.  Turns out – it wasn’t really true.  He got shredded like coleslaw against the tough matchups.

Down the road, Woodguy (@Woodguy55) and I will show you the data behind that, and why it exposes a failing of TOIQoC as a metric. It tells us how much TOI a player’s average opponent had, but it doesn’t tell us anything more.  We’re left to guess, with the information often pointing us in the wrong direction.

A Malfunction in the Metric

Let’s review what we’ve discussed and found so far:

  • QoC measures as currently used do not show a large differentiation in the competition faced by NHL players. This is often at odds with observed head to head matchups.
  • Even when they do show a difference, they give us no context on how to use that to adjust the varying shot metrics results that we see. Does an increase of 0.5 QoC make up for a 3% Corsi differential between players?  Remember from Part 1 that understanding the context of competition is critical to assessing the performance of the player.  Now we have a number – but it doesn’t really help.
  • The three metrics most commonly used as the basis for QoC are demonstrably poor when used as a standalone measure of ‘quality’ of player.
  • So it should be no surprise that assessments using these QoC measures produce results at odds with observation.
  • Do those odd results reflect reality on the ice, or a malfunction in the metric? Looking in depth at the underlying measures, the principle of GIGO suggests it may very well be the metric that is at fault.

Which leaves us … where?

We know competition is a critical contextual aspect of using shot metrics to evaluate players.

But our current QoC metrics appear to be built on a foundation of sand.

Hockey desperately needs a better competition metric.

Now lest this article seem like one long shrill complaint, or cry for help … it’s not. It’s setting the background for a QoC project that Woodguy and I have been working on for quite some time.

Hopefully we’ll convince you there is an answer to this problem, but it requires approaching QoC in an entirely different way.

Stay tuned!

P.S.

And the next time someone tells you “quality of competition doesn’t matter”, you tell them that “common QoC metrics are built on poor foundational metrics that cannot be used in isolation for measuring the quality of players. Ever hear of GIGO?”

Then drop the mic and walk.


Fancystats Fundamentals and Why Hockey Desperately Needs a Better Competition Metric (Part 1 of 2)

Hockey needs a better competition metric – without it, the value of fancystats for the evaluation of individual players is significantly weakened.

Let me tell you why!

A Background Tutorial

I’m assuming most of the people reading an article with this title are probably familiar with fancystats. But I’m hoping there are a few readers who are a little nervous around fancystats – and I’m hoping I can capture your interest too.

On that note, I’m going to do a bit of a grounds-up tutorial on fancystats to set the stage here in part 1, then get into the meat of discussion on competition in part 2. I hope those of you with lots of knowledge in that arena already will bear with me – or feel free to skip ahead!

Fancystats – Counting the Good and the Bad

Personally, I’m always a bit baffled by the hatred and contempt for the most common fancystats, the shot metrics (the ones with the odd names like Corsi and Fenwick).

Here’s the thing: it’s not like these metrics are measuring anything unrelated to hockey. In fact, they’re measuring something fundamental to hockey, which is shots. No shots, no goals. No goals, no wins. No wins … sucks.

I like to think of it this way. If my team has the puck and is shooting it at the other guy’s net, this is almost without exception a good thing. Sometimes it’s a tiny good thing, sometimes it’s a major good thing, but it’s pretty much always a good thing.

Conversely, if the bad guys have the puck and are shooting it at my net, it is almost without exception a bad thing. Sometimes it’s a tiny bad thing, sometimes it’s a major bad thing, but it’s pretty much always a bad thing.

In the end, these ‘fancy’ stats are not fancy, and they’re not really even stats!  We’re just counting up good things and bad things and seeing whether our team had more or less of those things. The only wrinkle to note is that we focus these counts on even strength (5v5) time. Not that these shot attempt good and bad things don’t matter on the PK or the PP, it’s just that there are other arguably better ways to measure effectiveness on special teams.

We usually express our resulting good/bad count as a percent – 50% means the two were even, 45% means our team is at a 5% deficit in good things, and 55% means our team has a 5% advantage in the good things we counted (the percentages are always expressed from the viewpoint of a specific team).
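In code form, that percentage is about as ‘fancy’ as it gets – here’s a sketch with illustrative numbers:

```python
# The good-thing/bad-thing count expressed as a percentage,
# always from the viewpoint of one specific team.

def corsi_for_pct(attempts_for, attempts_against):
    """5v5 shot attempts for vs against, as a share of the total."""
    return 100.0 * attempts_for / (attempts_for + attempts_against)

print(corsi_for_pct(50, 50))  # 50.0 - dead even
print(corsi_for_pct(45, 55))  # 45.0 - a 5% deficit in good things
print(corsi_for_pct(55, 45))  # 55.0 - a 5% advantage in good things
```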

How do we know this good thing/bad thing has value? Well, there has been a ton of work done by the math guys to show that these things we’re counting have a ton of value in terms of measuring repeatable skill, and in terms of predicting the rest of the season, or the playoffs, or next season. Better than almost anything else we can count, even goals! If you want to get into the details, there is a ton of information out there on Google.

But you can ignore that if you want – at the core, just remember that we’re counting good and bad hockey-related things, and unsurprisingly, good teams tend to have way better counts, because they’re simply better at that whole hockeying thing than are bad teams.

That’s it! You are now conversant with the big bad Corsi! Welcome to the dark side.

Breaking It Down to a Whole Nuvver Level

It’s a slightly more complicated picture when we try and apply this concept to individuals however.

We’re still counting good and bad things as they happen, but now we’re counting them in the context of the five players on the ice. At the end of the [game, series, season], we count up the good and bad things that happened while each player was on the ice, and Bob’s your uncle: player Corsi.
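A minimal sketch of that counting process (with invented events, and rosters trimmed well below the real five skaters a side to keep it readable):

```python
# Sketch of player-level Corsi: credit every shot attempt to all the
# skaters on the ice for each side. Event data here is invented, and
# rosters are abbreviated - real events list five skaters per side.

from collections import defaultdict

# (shooting_team, skaters on ice for, skaters on ice against)
events = [
    ("EDM", ["McDavid", "Draisaitl"], ["Monahan"]),
    ("CGY", ["Monahan"], ["McDavid", "Draisaitl"]),
    ("EDM", ["McDavid"], ["Monahan"]),
]

cf = defaultdict(int)  # attempts for, while on the ice
ca = defaultdict(int)  # attempts against, while on the ice

for team, for_skaters, against_skaters in events:
    for p in for_skaters:
        cf[p] += 1
    for p in against_skaters:
        ca[p] += 1

for p in sorted(cf.keys() | ca.keys()):
    pct = 100.0 * cf[p] / (cf[p] + ca[p])
    print(f"{p}: CF% {pct:.1f}")
```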

Calculating that number is one thing, though. Using it is another: this ‘five players on the ice’ business makes it a little tricky to evaluate an individual player.

By tricky, I do mean tricky, not impossible. I emphasize this point because there are some folks out there, some even with a massive media platform, who dismiss these stats as exclusively team-level stats, not applicable to individual players.

Unfortunately, this is wrong, and all it demonstrates is a (sometimes profound) lack of knowledge about basic statistics. Modern statistical methods are all about looking at a large mixed dataset and teasing out individual effects.

As it happens, once you get a large volume of these good/bad counts for an individual player, that player will have played with such a large number of teammates rotating through that his number does indeed start to reflect his individual contribution.

It’s why this ‘team level’ stat is almost always different, in many cases radically so, between teammates.

And yet it’s still tricky.  Why?

The Unbearable Difficulty of Context

What makes it tricky is that in order for a player’s Corsi number to make sense, to have it give us a believable gauge of how that individual is doing, we need to understand the context in which that number was generated.

At the individual level in hockey, the most important context is provided by:

  • teammates
  • zone starts
  • competition

They’re important, because each of them drastically affect the count of good/bad things. Again, this is hockey, not stats. Who you play with, where you start, and who you face makes a huge difference to your success.

WOWY, Lookit Those Teammates

Wait a second, sez you, didn’t you just tell me that the teammate issue sorts itself out when you have lots of data?

Well, mostly it does. Those teammates do rotate a lot, and do allow us to get a better picture of the individual. What confounds us is not the other four skaters out on the ice, though, it’s usually just one, maybe two.

Most players have their ice time occur in tandem with one other player, far more heavily than with anyone else.

For defenders, this should be obvious – that other player is the D partner.

Less obvious is that when you look at time on ice breakdowns, forwards often show much the same pattern. With a few rare exceptions (e.g. the Kane line in CHI), the third player on a line usually changes more often than the other two. Defense pairs rotate too – maybe injury, maybe the coach’s blender – but forward pairs do tend to stand out.

Luckily, we have a tool that helps deal with this scenario. It’s called WOWY – without you with you.

The idea is simple – given two players, Frank and Peter we’ll call them, take a look at how things (our count of good and bad things) went with Frank and Pete on the ice together, then when Frank was on the ice but without Pete, and then when Pete  was on the ice without Frank.

Sometimes you have to dig a bit deeper, such as if Frank and Peter play with radically different levels of skill when apart, like Frank gets Taylor Hall and Peter gets Taylor’s New Jersey roommate, Luke Gazdic.

But usually the two players tend to separate, and you see quality differences quite quickly.
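A bare-bones WOWY comparison looks something like this – all numbers invented:

```python
# WOWY sketch: split Frank's ice time into 'with Pete' and 'without
# Pete' buckets, and compare the good/bad counts for each. Invented data.

def cf_pct(cf, ca):
    return 100.0 * cf / (cf + ca)

# (corsi_for, corsi_against) in each bucket - hypothetical numbers
frank_with_pete    = (300, 300)
frank_without_pete = (220, 180)
pete_without_frank = (160, 240)

for label, (cf, ca) in [("Frank with Pete", frank_with_pete),
                        ("Frank w/o Pete", frank_without_pete),
                        ("Pete w/o Frank", pete_without_frank)]:
    print(f"{label}: {cf_pct(cf, ca):.1f}%")
# Frank holds up fine apart while Pete craters: a hint that
# Frank is the one carrying the pair.
```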

WOWY analysis is always useful when looking at players, and in my opinion, is mandatory when trying to assess defensemen.

Corsi, or the tale of good and bad hockey things. WOWY, the tale of Frank and Peter.

Remember those, and you are well on your way to being a fancystats expert. Well done!

I’m In Da Zone

OK, so we’ve got a handle on teammates.

The second aspect of context we talked about is how the coach uses a player – whether they’re starting in their own zone a lot, or gifted offensive zone time, or neither (or both).

Turns out, this doesn’t matter nearly as much as you’d think, for a few reasons:

  • Most players are not that buried – even “25% offensive zone starts”, which seems like a harsh number, often represents something on the order of 2 O and 6 D faceoffs during a game. Yes it can add up, but in the end it’s still just four more zone starts in the d zone. Not that much in the context of 20 or more shifts per game.
  • Most shifts start on the fly, not with a faceoff. So a players ability drives defensive (or offensive) zone starts to a large extent, not the other way round. Put another way, good players tend to force faceoffs in the o zone, and bad players tend to get stuck in the d zone, and faceoffs (or goals against!) are part and parcel. So good or bad zone starts can be a symptom rather than a cause of good or poor numbers.
  • Faceoff wins are generally around 50% give or take a few points. Think of the four d zone starts from the first bullet point.  Now remember that the two teams are going to split those somewhere between 45 and 55%.  That’s basically two faceoffs that are a problem, in the context of 20 shifts.  Zone start differences diminish rapidly when you start cutting them in half.
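The arithmetic in those bullets, spelled out (same illustrative numbers):

```python
# The zone-start arithmetic above, spelled out. A '25% O-zone start'
# player might see roughly 2 O-zone and 6 D-zone faceoff starts a game.

o_starts, d_starts = 2, 6
extra_d_starts = d_starts - o_starts    # 4 more starts in the d zone
faceoff_loss_rate = 0.5                 # draws split roughly 50/50
problem_faceoffs = extra_d_starts * faceoff_loss_rate

print(o_starts / (o_starts + d_starts))  # 0.25 - the scary-looking number
print(problem_faceoffs)                  # ~2 lost d-zone draws, vs 20+ shifts
```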

Rather than go farther on this topic, I’ll recommend you read the two-part article by Matt Cane on zone starts.

In summary: there’s reason to believe that zone starts affect a player’s numbers less than you’d think; and when they do – we have an idea of how much, and can adjust for them.

Competition

So of our three critical contextual factors, we’ve talked about two of them: teammate effects (for which we have WOWY), and zone starts (which aren’t as strong as most think, and can be adjusted for in any case).

What about competition?

Well, now things get peachy … by which I mean juicy and somewhat hairy.

Watching games, you can see coaches scrapping to get the right players on the ice against the other team’s players.  Checking line to shut down the big line?  Or go power vs power?  What about getting easy matchups for that second line?  That’s the chess game in hockey, though some coaches are clearly playing checkers.

On-ice competition is a big deal, and a critical part of measuring players. A player with 50% good/bad things is doing great if he’s always facing Sidney Crosby, and incredibly poorly if he’s facing Lauri “Korpse” Korpikoski.

How do we get a handle on that?

We’ll talk in depth about competition and how we (fail to) measure it in Part 2 of this article.
