Tuesday, February 15, 2011

Nerdin' Out with Numbers Benton

"Numbers Benton" is my official nerd alter ego, in case you were curious.


He was born Alexander Benton in Provo, Utah, 1971. His father was a railroad engineer and his mother made dignified coats for sailors. By age 7, he was regularly defeating calculators in tests of computing skill and wit, and by age 10 he appeared on the Johnny Carson show to do his 'Ten Toes Twenty Fingers' trick- an optical illusion supported by math. He was soon touring the country with his partner "Phineas the Physicist," and in 1985 he legally changed his first name to "Numbers" for marketing purposes. He was married in 1989, divorced in 1991, and became a Kabbalist soon thereafter. He went into hiding for much of the mid-90s, emerging in 1998 only after he'd come up with a formula to prove that rich kids could naturally roller skate better than poor ones. The backlash was so severe that Benton was driven to Marxism.

He released a math-themed musical album in 2001 ("Comfortably Number"), re-married in 2004, then divorced again in 2005. For the last five years he's been obsessively trying to determine the exact score of every college basketball game using complex calculus (and sometimes simple algebra). He finally had success three years ago, when he successfully predicted San Diego State's 73-62 victory over Northern Colorado. Since then, it's been one hard luck story after another. Earlier this year, he entered rehab for a chemical dependency on the kind of glue that holds most calculators together. He had to re-train himself to use entirely organic calculators, a process that began with heavy reliance on an abacus.

But now he's back. And he's ready to take us on a guided tour through Statsheet.com, a sweet site that employs metrics to give us all a better idea of the factors that influence a basketball game.

To be serious for a moment: the site really is great, and it's worth learning more about. Ken Pomeroy, whose work I pimp here about one of every two days, uses these stats. And today's post is as much for me as anyone else; there are plenty of gray (and sometimes completely blank) spots in my knowledge. I'm hoping to learn as I write. And I'm hoping I make at least one mistake that NastyEmu or some other genius out there can correct.

For our purposes, we'll be using the box score from Duke's 81-71 win over Miami as our example. You can view it here, though I'll be posting the charts as we go along.

(mist, eerie music, distant laughter) NOW JOIN ME, WON'T YOU, AS WE TAKE...


A GUIDED TOUR THROUGH A STAT SHEET BOX SCORE


First things first. I'll honor your intelligence and skip the basic stats. We all know about points and rebounds and assists and "Plumlees," which is when a player falls over and cries on the court. We'll skip those and move to the good stuff. The very first stats you see on the page are the game stats. Above the table, you can choose basic or advanced. Go to advanced, and it looks like this:


Their table has game splits, so you see total stats, a break-down by halves, and game averages. But this is all I can embed. And it's not nearly as pretty, so make sure you look at the website too. Let's take a quick gander at what we see, and why it's valuable.

1. Efficiency: This is each team's offensive efficiency. It's kind of like points per game, but way better. The calculation is easy; this number is how many points the team scored for every 100 possessions. You figure it out by finding points per possession and multiplying by 100- basic. Here's the advantage of efficiency over points per game, and hopefully I can explain this without resorting to hieroglyphs: points per game does not take tempo into account.

Wisconsin is a perfect example. They average 69.6 points per game, which is 161st in Division 1. So they must be pretty damn mediocre on offense, right? WRONG, FUCKFACE.

Whoa, sorry, my tone got a little extreme. My bad. But Wisconsin is actually very, very good on offense. In fact, they're the best: their offensive efficiency is 118.9, tops in Division 1. Again, that means that for every 100 possessions, they score 118.9 points. The reason their 69.6 points per game is pretty low is that they play a very slow tempo. But what does that slow tempo mean? It means their opponent is also playing a slow tempo. Which means fewer possessions for everyone. When Wisconsin scores 1.189 points per possession, it puts them in very, very good shape to win a slow game. Obviously, points per game alone gives you a pretty poor idea of how good Wisconsin is on offense.

As we see here, Duke's efficiency against Miami was pretty high at 124.6. They only had 65 possessions, but if they'd had 100, that would have been their score. Except I'm not sure how they would have gotten six tenths of a point, but I know it would have happened; stats are never wrong.

2. Defensive Efficiency: In this chart, you can glean this number just by reading the other team's offensive efficiency. Since Miami's EFF was 109.2, that was Duke's defensive efficiency; they allowed 109.2 points per 100 possessions. But you can also find season stats on KenPom's website. The top five defensive efficiency leaders are as follows: Texas (82.3), Florida State (86.0), Maryland (86.3), North Carolina (86.5), Ohio State (86.9). Duke is 9th in the country at 87.8, so we can see that Miami had a pretty solid offensive showing against the Devils compared to Duke's other opponents.

Just for fun: Duke had about 38 possessions in the second half against North Carolina. They scored 50 points. Total efficiency? 131.6. Compare that to UNC's usual awesome rate of 86.5. That's quite a friggin' half.
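For anyone who wants to check the arithmetic, here's a minimal sketch of the efficiency calculation in Python. The function name is mine, and I'm assuming Miami also had 65 possessions (the post only states Duke's count, though 71 points on 65 possessions does reproduce the 109.2 on the chart).

```python
def efficiency(points: float, possessions: float) -> float:
    """Points scored per 100 possessions (points per possession times 100)."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * points / possessions


# Duke vs. Miami: final score 81-71, Duke had 65 possessions.
duke_offense = efficiency(81, 65)
# Assumption: Miami's possession count is also 65.
duke_defense = efficiency(71, 65)
# Second half vs. North Carolina: 50 points on roughly 38 possessions.
duke_second_half_unc = efficiency(50, 38)

print(f"Duke offensive efficiency vs. Miami: {duke_offense:.1f}")
print(f"Duke defensive efficiency vs. Miami: {duke_defense:.1f}")
print(f"Duke second half vs. UNC:            {duke_second_half_unc:.1f}")
```

A team's defensive efficiency is just the opponent's offensive efficiency, so the same function covers both.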

3. FR%: This one's easy - floor percentage. It's simply what percentage of possessions ended up with the team scoring at least one point. Duke scored on 60.0% of possessions, compared to a season average of 54.7%. They allowed Miami to score 54.5% of the time, which is one point higher than Miami's usual offensive total and 6 points higher than Duke's usual defensive allowance. This stat, I would say, is just useful for a quick look, and isn't too too valuable beyond that.

4. Free Throw Rate: How often a team scores from the line. This incorporates both how often they get to the line, and how efficient they are once they get there. The calculation here is: FTM/(FGA + TO + (0.44 x FTA)). Looks complex, I know, but here's what's basically happening: total made foul shots are being divided by total possessions (field goal attempts plus turnovers plus roughly half of the number of times a team gets to the line- it's .44 because sometimes there's an and-one situation, and apparently that happens 12% of the time) to get a sense of how good a team is at scoring from the line.

Duke was very average this game at 37.5 (compared to 38.6 normally), but they held Miami to an amazing 9.4 (their usual average is 40.6). By keeping Miami away from the line (and, to be fair, Miami helped by missing once they got there), Duke gave themselves a huge advantage. This stat is one of the "four factors," which means that stat experts think it's one of the four most important numbers in any given game.
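Here's a rough sketch of the free throw rate formula as I described it, alongside the simpler FTA/FGA version that comes up in the comments below. The function names and the sample box-score numbers are hypothetical, purely for illustration; they aren't from the Duke-Miami game, and I'm not certain which definition the site actually uses.

```python
def ft_rate_as_described(ftm: int, fga: int, turnovers: int, fta: int,
                         ft_weight: float = 0.44) -> float:
    """FTM / (FGA + TO + 0.44 * FTA), expressed per 100.

    This is the formula from the post; whether the site really uses it
    is an open question.
    """
    denominator = fga + turnovers + ft_weight * fta
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * ftm / denominator


def ft_rate_simple(fta: int, fga: int) -> float:
    """FTA / FGA, expressed per 100 (the alternative raised in the comments)."""
    if fga <= 0:
        raise ValueError("fga must be positive")
    return 100.0 * fta / fga


# Hypothetical inputs, not taken from any real box score.
sample = {"ftm": 15, "fga": 55, "turnovers": 12, "fta": 20}

print(f"As described: {ft_rate_as_described(**sample):.1f}")
print(f"FTA / FGA:    {ft_rate_simple(sample['fta'], sample['fga']):.1f}")
```

The two versions answer different questions: the first folds in how well a team shoots once it gets to the line, while the second only measures how often it gets there.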

5. FG, FT, and 3-point Distribution: Very simply, what percentage of total points each team scored from each part of the floor; 2 points, foul shots, 3 points. For each team, these will obviously add up to 100%. Against Miami, Duke was basically right on its season numbers. 49.4 from 2, 21.0 from the foul line, and 29.6 from three...never more than 2 percent from any season average. Miami, as you might guess from the last stat, was heavily imbalanced. 59.2 from 2, only 7.0 from the foul line (compared to a 22.4 average), and 33.8 from three.

That tells us that although we know Miami was pretty efficient on offense, Duke didn't foul very often, and it took away free points they usually convert at a higher clip. This stat can also show when a team becomes over-reliant on the three-pointer, or can't score from inside.
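The distribution numbers are easy to reproduce. In the sketch below, the made-shot counts are back-solved from the percentages above and the 81-71 final score, not read off the box score, so treat them as an inference; the function name is my own.

```python
def scoring_distribution(two_pt_made: int, three_pt_made: int,
                         ft_made: int) -> dict[str, float]:
    """Share of total points (in percent) from twos, free throws, and threes."""
    points = {
        "2pt": 2 * two_pt_made,
        "ft": ft_made,
        "3pt": 3 * three_pt_made,
    }
    total = sum(points.values())
    if total == 0:
        raise ValueError("team scored no points")
    return {source: 100.0 * pts / total for source, pts in points.items()}


# Inferred counts: Duke 20 twos, 8 threes, 17 free throws (81 points);
# Miami 21 twos, 8 threes, 5 free throws (71 points).
for team, counts in {"Duke": (20, 8, 17), "Miami": (21, 8, 5)}.items():
    shares = scoring_distribution(*counts)
    print(team, {source: round(pct, 1) for source, pct in shares.items()})
```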

6. Assist Percentage: How many field goals were assisted? In a lot of ways, this is a measure of the quality of a team's offensive play. Were they moving the ball around and creating open shots, or were they relying on one-on-one tactics for scoring? Duke's percentage against Miami was 60.7, compared to an average of 52.3. Miami's was 55.2, compared to 47.8 normally. Again, we see that both offenses were thriving in the halfcourt set.

7. Turnover Percentage, Steal Percentage, Block Percentage: Again, easy. On what percentage of possessions did you turn it over? Get a steal? Get a block? Very useful numbers for game and player analysis. The only numbers that really stick out here are Miami's steal percentage (much lower than normal) and Duke's block percentage (again, low).

Okay, now for the charts. First, the "Four Factors." I went over this yesterday, but it'll be good to have it all in one place.


The following is pasted from yesterday's post:

1) Effective FG% - This is like normal field goal percentage, except made three-pointers are weighted correctly at 1.5 times a two-pointer.

2) Turnover % - What percentage of possessions resulted in an offensive turnover.

3) Offensive Rebounding % - Same deal, what percent of missed shots produced an offensive board. Offensive Rebounds / (Offensive Rebounds + Opponent Defensive Rebounds).

4) FT Rate - How often did you get to the line, and how good were you once there?

The four factors are weighted in terms of importance as follows: Shooting (40%), Taking Care of the Ball (25%), Offensive Rebounding (20%), and Getting to the Line (15%).
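As a sketch, the first three factors can be written out in Python like this. The eFG% formula is the standard (FGM + 0.5 x 3PM) / FGA, which is what "weighted at 1.5 times a two-pointer" works out to. The function names are mine. The Duke and Miami shot counts are back-solved from the percentages quoted in this post rather than copied from the box score, and the turnover and rebounding inputs are made up for illustration.

```python
def effective_fg_pct(fgm: int, three_pm: int, fga: int) -> float:
    """(FGM + 0.5 * 3PM) / FGA in percent; a made three counts 1.5x."""
    if fga <= 0:
        raise ValueError("fga must be positive")
    return 100.0 * (fgm + 0.5 * three_pm) / fga


def turnover_pct(turnovers: int, possessions: float) -> float:
    """Percentage of possessions that ended in a turnover."""
    if possessions <= 0:
        raise ValueError("possessions must be positive")
    return 100.0 * turnovers / possessions


def offensive_rebound_pct(off_reb: int, opp_def_reb: int) -> float:
    """Offensive Rebounds / (Offensive Rebounds + Opponent Defensive Rebounds)."""
    chances = off_reb + opp_def_reb
    if chances <= 0:
        raise ValueError("no rebounding chances")
    return 100.0 * off_reb / chances


# Inferred from the post's percentages: Duke 28 FGM (8 threes) on 56 FGA,
# Miami 29 FGM (8 threes) on 64 FGA.
print(f"Duke eFG%:  {effective_fg_pct(28, 8, 56):.1f}")
print(f"Miami eFG%: {effective_fg_pct(29, 8, 64):.1f}")

# Hypothetical inputs, only to show the other two formulas in action.
print(f"TO%: {turnover_pct(12, 65):.1f}")
print(f"OR%: {offensive_rebound_pct(10, 25):.1f}")
```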

So. Against Miami, Duke's effective field goal percentage was 57.1%, a bit higher than their 54.7% average, which is already 14th-best in the country. They typically allow opponents a 45.0% rate, but Miami did well to shoot 51.6%. Turnovers and rebounding were almost identical, but Duke got to the line a lot more than the Canes. The first and last of the four factors made the difference, and it jibes with the scouting report before the game- Miami isn't great on D.

Okay, back to the present. Next chart:


This isn't telling us anything we didn't already know from the box score, but the presentation is great; it allows us to more easily see the disparities. Here, Duke's enormous advantage in free throws and steals is readily evident, while we can also see interesting paradoxes, like the fact that Miami scored one more field goal than Duke.


Here's the game flow chart, along with a line graph showing how mathematically safe Duke's lead was at every given point of the second half. Pretty sweet.

After the individual player stats, which follow the same format as the team stats above, there are two more charts- player impact numbers for both teams. Let's take a look at Duke's:


Again, we're not getting new information, necessarily, but the visual presentation is pretty helpful. It lets us see how each player filled the scoreboard in various categories. Looking at the Duke chart for the Miami win, a few things stick out. First, Tyler Thornton's foul shots; all in garbage time. Next, Seth Curry's threes. Next, the low number of Mason Plumlee's defensive boards- just 1. It'd be easy to miss that in a box score, but here it's highly evident. For my money, though, the best part is Curry's steals. I happened to notice this in the box score because I was looking for it, remembering how great he was on defense, but if I missed the game or wasn't paying attention, that would alert me to an interesting individual trend.

And this...well, this is fantastic:


Ref stats! Hey Kitts! Corbin! Greene! Get the whistle off your chest, assholes! EVERY SINGLE ONE OF YOU IS CALLING FOULS AT A LOWER RATE THAN YOUR AVERAGE SINCE 1996! CONSPIRACY! CONSPIRACY! I'VE GOT THE NUMBERS RIGHT HERE!

The other ref stats are unembeddable, but on the page you'll see they have a complete history from 1996, and each ref's stats for the season. In 56 games, Greene has called more technicals than Kitts has called in 74. Dude's got a quick trigger finger.

Okay, that's it for today. Hopefully this was at least somewhat useful and not totally boring. One way or another, you may never see Numbers Benton again.

17 comments:

  1. Nicely done.

    I believe William Avery is the ACC's all-time leader in Plumlees.

  2. My 13 yr-old and I did some geeking out last night thanks to your link to this site. Today's post was a nice review (didn't catch Ref stats - cool!). Sites like this will be a fan game-changer (have you compared '01 Duke to '09 UNC yet? Even better (but worse result), '99 Duke to '09 Heels)? Conclusion: '09 Heels were 'friggin awesum... '99 Duke was awesummer (even if they did choke the Final).

  3. Numbers Benton is a typical stat geek; he uses too much information and in the end says nothing. I prefer the straightforward prose of Dr. Numbers. Now that's a man I could relate to. WOLLA WOLLA WOLLA

    -Nick E

  4. I'm so ready for baseball season now

  5. Nasty, I think I might be wrong on #4, free throw rate. I think it actually might just be free throws attempted divided by field goals attempted. And ditto on baseball.

    Oh man Nick, I almost forgot about Dr. Numbers! Here he is, for newcomers, bottom of the post:

    http://sethcurrysavesduke.blogspot.com/2009/09/folks-get-ready.html

    -Shane

  6. To be honest, I started reading this post and it reminded me of the hilarious glossary at FJM so I went and read that for the hundredth time.

    http://www.firejoemorgan.com/2005/04/glossary-of-terms.html

  7. It looks like KenPom uses FTA/FGA to calculate FT Rate, while StatSheet uses the formula you mentioned in the post.

  8. Yeah, that's what confused me. I think both stats are useful, but statsheet has the overall better metric. KenPom's is good to see how often a team gets to the line in a vacuum, but the SS stat incorporates their performance there. On the other hand, I'm still not convinced that the FTRate thing I talked about in the post is what's actually used on the advanced stats by them. I'm going to e-mail and find out.

    -Shane

  9. What I don't understand is why they use that complex formula to calculate the # of possessions when that's already a known value.

  10. Because possessions doesn't work for them...what they need is potential foul shot "tries."

    For example, if a team gets 6 straight offensive boards on missed shots, that would still only count as one possession, but for the purposes of free throw rate, it's actually 6 potential chances for a team to get to the free throw line. So possessions doesn't cut the mustard, because it under-measures.

    The three things that add up to total "tries," or chances when you could have gone to the foul line are:

    1) shots attempted
    2) turnovers (you had a chance)
    3) successful free throw line visits, measured in attempts

    For the last one, you need to cut it about in half, since typically you're taking 2 foul shots at once. But not exactly in half, since sometimes a successful shot and a foul shot happen at once in the and-1 situation. Apparently they've done their studies and determined that the perfect number for multiplication is 0.44, which means that 88% of foul line visits are for 2 shots, and 12% are for just one.

    -Shane

  11. Apparently possessions still isn't an official stat so they have to estimate it with that formula.

    Also, as of the last update, KenPom uses .475 as the multiplier.

  12. No, read my post above yours Nasty, possessions wouldn't work.

    -Shane

  13. Maybe I'm missing something, but doesn't FGA + TO + Trips (not attempts) to the foul line = Possessions?

  14. No. You can have unlimited FGA on one possession with offensive rebounds. Six missed shots, plus six straight offensive boards, plus one turnover = one possession.

    -Shane

  15. I didn't realize that today was the SCSD nerd convention AND Phil Collins Day (yes, seriously).

    http://gothamist.com/2011/02/11/celebrate_phil_collins_day_with_a_p.php

    It is like a national holiday.

  16. Way to bring the contentious commenters of yesterday together today over data and Jackson Browne. You're a real peace broker, Benton.
