John Hollinger’s PPR gave us the template and we improved it
Preface: Will Muckian and Joseph Nation played integral roles in the creation of this statistic. They also provided contributions to this article.
Will Muckian: The Introduction
Part of me has always been fascinated by John Hollinger, a classic example of a man trying too hard to create data points from nothing. Hollinger’s rise and fall as the NBA’s advanced statistics darling is an article in and of itself, but just because his work was flawed does not mean it’s unsalvageable.
Pure Point Rating is the perfect example. There’s a very good idea at the heart of this stat: it aims to create a more accurate indicator of playmaking prowess beyond your bread-and-butter AST/TO ratio. It puts together some more readily available stats (I know number nerds are tenting their pants at the usage of per-100 data) and spits out one nice shiny number that, for all intents and purposes, should be the final word on a player’s ability to efficiently create chances for his teammates. Of course, the problem starts as soon as you delve into Hollinger’s formula:
100 x (League Pace / Team Pace) x ([(Assists x 2/3) – Turnovers] / Minutes)
Most of the structure is fine — pace calculation, minute load consideration, and factoring in the negative effect of turnovers via reducing the value on an assist. But here is the issue: where does the two-thirds value come from? Is it some arbitrary value or is there some legitimacy to it?
Well, I may be a writer who deals with stats, but redrawing the boundaries on an advanced statistic stretches well beyond the limited calculus I stumbled through in high school. For that, we have Holyfield’s math guys, Joseph and Drew. So, I took the question to them: is there a way to determine what the proper coefficient should be for the formula? For all we knew at the time, Hollinger may have gotten the figure correct.
Drew and Joseph responded, “Of course, Will. We just need to run a regression using a dependent variable like RAPM and determine what variables we need to control for.” That’s exactly what we did, and Joseph has more on that.
Joseph Nation: The Methodology & Results
In order to develop the model, we began with Hollinger’s Pure Point Rating formula. In plain language, the formula (see above) takes the value of a player’s assists less the negative value of their turnovers and multiplies that by a series of factors to adjust for different minute volumes and different paces. Within this, the biggest thing we sought to fix was the outdated two-thirds coefficient on assists, which is supposedly based on Hollinger making stuff up from feel. Furthermore, other estimates of the relative value of an assist and a turnover, such as Kevin Ferrigan’s Daily RAPM Estimate (DRE), found significantly different coefficients and have much better mathematical grounding. For example, Ferrigan’s results translate to a coefficient of approximately 0.18 for assists.
We then took a similar method to DRE, using instead a single-year RAPM sample across 14 years as the dependent variable in a regression. This differs from the 14-year RAPM used for DRE. Although that figure is more stable, it doesn’t allow us to capture the changes players go through across their careers, hence single-year RAPM. Unlike the more intuitive choice of win percentage as the dependent variable, this allows us to individualize the data more successfully. Regressing on win percentage, as shown by Wins Produced, often attributes team factors to the individual and hurts the ability of statistics to have meaning going forward.
The results of our OLS robust regression can be seen above. We did retain some statistically insignificant terms in the model because, by and large, their insignificance can be assumed to stem from omitted variable bias. Since removing them also shifted the coefficients on assists and turnovers, we opted to keep them so that their effect on the two meaningful terms stays accounted for. Similarly, we included no non-linear terms because they appeared to overfit the model (controlling for too much when it isn’t needed) and didn’t have much of an impact on the two relevant numbers.
It’s also worth discussing the inclusion of minutes. This is a technique established in Daniel Myers’ Box Plus-Minus to control for quality of opponents. Starters often play against a higher average quality of opponent, and the only thing in the box score that really tells us that is how many minutes a player is playing. While the logic is a bit extended, it does make sense and does work effectively in a majority of cases.
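For readers who want to see the mechanics, here is a minimal sketch of the regression step in Python. Everything in it is an assumption made for illustration: the player rows are synthetic, the "planted" coefficients are arbitrary, and the function names are made up. It fits plain ordinary least squares through the normal equations and leaves out the robust standard errors and the extra control terms the real model carried.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares fit with an intercept: solves (X'X) beta = X'y."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic rows: (assists per 100, turnovers per 100, minutes per game).
# These are NOT real players and NOT the article's data.
rows = [
    (10.0, 4.0, 34.0), (3.0, 1.5, 18.0), (7.5, 2.0, 28.0), (5.0, 3.5, 30.0),
    (12.0, 5.0, 36.0), (2.0, 1.0, 12.0), (8.0, 2.5, 22.0),
]

# Arbitrary planted coefficients: intercept, assists, turnovers, minutes.
planted = [-1.0, 0.16, -0.26, 0.05]
fake_rapm = [
    planted[0] + planted[1] * a + planted[2] * t + planted[3] * mins
    for a, t, mins in rows
]

beta = ols(rows, fake_rapm)
# Turnover weight once the assist coefficient is scaled to one.
turnover_per_assist = beta[2] / beta[1]
```

Because the fake RAPM values contain no noise, the fit hands back the planted coefficients, which is only a check that the machinery works. None of these numbers come from the actual model; the real coefficients are the ones discussed next.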
From these results, we took the respective coefficients for assists and turnovers and normalized them to set the assists coefficient to one. This is done by dividing both coefficients by the original assists coefficient from the regression results (0.15984 / 0.15984 and -0.26318 / 0.15984). This is the second major change from Hollinger’s methodology, which normalized turnovers to one. We chose this because people tend to be much more aware of the approximate value of an assist than they are of the cost of a turnover. This gives us a coefficient of -1.65 on turnovers. Hollinger’s estimate, when normalized the way we did, gives you a coefficient of -1.5, which understates the cost of a turnover. That is a roughly 10 percent heavier penalty on every turnover, a big enough gap in a player’s rating to be worth correcting. This normalization does increase the scale slightly, since the coefficients are larger in magnitude, but it keeps comparisons within the stat clean while letting you translate the stat to the real world much more realistically, because it gives it units of assists per 100 possessions.
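The arithmetic in that normalization is easy to check. The sketch below uses only the two regression coefficients quoted above; the variable names are ours and purely illustrative.

```python
# Coefficients quoted in the text from the regression results.
assist_coef = 0.15984
turnover_coef = -0.26318

# Scale both so that one assist is worth exactly one unit.
hppr_turnover_weight = turnover_coef / assist_coef  # rounds to -1.65

# Hollinger's version is (2/3) * AST - TOV; dividing through by 2/3
# puts it on the same "one assist = one unit" scale.
hollinger_turnover_weight = -1 / (2 / 3)  # -1.5

# How much heavier the new turnover penalty is, as a fraction.
relative_gap = hppr_turnover_weight / hollinger_turnover_weight - 1

# The same regression result on Hollinger's scale (turnovers set to one).
assist_weight_on_turnover_scale = assist_coef / abs(turnover_coef)

print(round(hppr_turnover_weight, 2))
print(round(relative_gap, 3))
print(round(assist_weight_on_turnover_scale, 5))
```

The gap works out to a little under 10 percent, which is where the "10 percent" figure comes from once you round.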
Drew Steele: League Leaders
The new and improved formula for what we are calling hPPR (the ‘h’ is for Holyfield, obviously) is as follows:
100 x (League Pace / Team Pace) x ([Assists – (Turnovers x 1.65)] / Minutes)
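If you want to play with the formula yourself, it can be sketched in a few lines of Python next to Hollinger's original. The function names and the two stat lines are hypothetical, invented only to show how the heavier turnover weight moves a careful, low-usage guard and a turnover-prone, high-usage star; they are not real players or real results.

```python
def point_rating(assists, turnovers, minutes, team_pace, league_pace,
                 assist_weight, turnover_weight):
    """Pace-adjusted (weighted assists minus weighted turnovers) per minute, x100."""
    pace_factor = league_pace / team_pace
    net = assist_weight * assists - turnover_weight * turnovers
    return 100 * pace_factor * net / minutes


def hollinger_ppr(assists, turnovers, minutes, team_pace, league_pace):
    # Original weights: two-thirds of an assist, one full turnover.
    return point_rating(assists, turnovers, minutes, team_pace, league_pace,
                        assist_weight=2 / 3, turnover_weight=1.0)


def hppr(assists, turnovers, minutes, team_pace, league_pace):
    # Reweighted version: one full assist, 1.65 turnovers.
    return point_rating(assists, turnovers, minutes, team_pace, league_pace,
                        assist_weight=1.0, turnover_weight=1.65)


# Hypothetical season totals (assists, turnovers, minutes), league-average pace.
careful_guard = dict(assists=300, turnovers=80, minutes=1600,
                     team_pace=100.0, league_pace=100.0)
high_usage_star = dict(assists=700, turnovers=400, minutes=2800,
                       team_pace=100.0, league_pace=100.0)

for label, line in (("careful guard", careful_guard),
                    ("high-usage star", high_usage_star)):
    print(label, round(hollinger_ppr(**line), 2), round(hppr(**line), 2))
```

Keep in mind the two ratings sit on different scales (Hollinger's is in turnover units, hPPR in assist units), so compare players within one stat rather than reading one stat's number against the other.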
Though the change may seem subtle, in that Hollinger’s two-thirds figure (0.66667) isn’t that far off from the assist figure our regression gives when turnovers are normalized to one (0.60734), the difference is significant enough to affect the results, as Joseph outlined. Furthermore, striving for accuracy and testing the results of a study is, you know, a foundational element of scientific principles.
Yet, I’m going to avoid hopping onto that soapbox and give you guys what you really want: the results. Who are the “best playmakers in the NBA” according to hPPR? Below is a table with the results as of November 28, 2017, when the player and team data was collected to build the table. Along with the results, I created a basic percentile to provide additional context of what a “good” or “bad” hPPR figure is. It’s in increments of five, meaning the top percentile is the 95th, followed by the 90th, then the 85th, and so forth.
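One plausible way to build that kind of five-point percentile is sketched below. This is an assumption about the approach rather than the exact method behind the table: it ranks a player by the share of qualifying players with a lower hPPR and rounds that share down to the nearest multiple of five. The function name and the sample ratings are made up.

```python
def percentile_bucket(value, all_values, step=5):
    """Share of players strictly below `value`, rounded down to a multiple of `step`."""
    below = sum(1 for v in all_values if v < value)
    pct = 100 * below / len(all_values)
    # `below` is at most len(all_values) - 1, so pct stays under 100
    # and the top bucket is 100 - step (the 95th when step is 5).
    return int(pct // step) * step


# Twenty made-up hPPR values, just to exercise the function.
sample_ratings = list(range(1, 21))

print(percentile_bucket(20, sample_ratings))
print(percentile_bucket(19, sample_ratings))
print(percentile_bucket(10, sample_ratings))
print(percentile_bucket(1, sample_ratings))
```

With this definition the best player in the pool lands in the 95th percentile and the worst in the 0th, matching the increments described above.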
With the filter of at least 100 total minutes played (sorry Jack Cooley), the top 10 is a cast of players most of you did not expect. I know I certainly wasn’t expecting Shelvin Mack to be second on this list. Sometimes, you just have to love how early season results affect the rankings. As the season progresses, you will see the numbers begin to stabilize as well as see more recognizable names. You should have seen some of the names in the preliminary stages of this article. With that said, there is some sort of pattern in terms of who appears at the top of the list.
The table above details the league leaders in hPPR from last season, with a filter of at least 20 games played. You see the likes of Chris Paul, John Wall, and Ricky Rubio in the 95th percentile, but you also see players like TJ McConnell and Ish Smith atop both lists in either the 95th or 90th percentile. Where are the guys like LeBron James, Russell Westbrook, and James Harden? They are also elite playmakers, aren’t they?
Due to the larger coefficient on turnovers than before, these primary ball handlers, who not only have to facilitate the offense via their passing but also be the first scoring option, are penalized noticeably for their higher turnover figures. The more you have the ball in your hands, the greater chance there is for turnovers. And despite that, players like James Harden and Russell Westbrook are in the 85th percentile and LeBron James is in the 80th percentile. Gifted playmakers do shine through, but they aren’t going to be in the top 10 due to the nature of the formula. The McConnells and Smiths, by contrast, are more cautious with the ball, leading to fewer turnovers and a larger hPPR figure.
Will Muckian: Conclusion
More than anything else, the best way to use hPPR is to determine what players run their offense at minimal cost to offensive possessions; it does not determine the skill level of a passer as much as it indicates their ability to make a smart pass. No one in their right mind would argue that Jerian Grant is a more gifted playmaker than LeBron, but Grant consistently makes passes that avoid costly turnovers, whereas James, as the fulcrum of Cleveland’s offense, is almost forced to make risky passes in order to generate scoring for his teammates. Like any stat, hPPR isn’t a tell-all. Instead, it’s meant to be taken alongside and in the context of other statistics.