How Well Does Daytona Predict A Driver’s Season?

The Meaning of Daytona

Ah… the optimism of a new season.

Everyone is unbeaten.

Everyone is tied for first place.

Everyone has a shot at the championship.

Then Speedweeks starts.

NASCAR doesn’t ease into the season the way, say, college football does. You get two warm ups — three if you’re in the Clash — and then we’re into the Daytona 500. A Daytona win gets you a place in The Chase and a place in history. But does it mean this is your championship season?

Correlations: What Are We Looking For?

I was interested in whether a driver’s finish was in any way correlated to his or her ranking at the end of the year. In other words, could you predict whether a driver would have a good season based on his or her finish at Daytona?

If Daytona were perfectly predictive, then the driver who finished first at Daytona would win the championship, the driver who finished second would finish the season in second place, and so on. A plot of the driver’s season-ending rank vs. the race finish would look like this for a perfect correlation.


If we actually saw this in the real data, the only interpretation would be that NASCAR is fixed. (And we know it’s not, so we expect some scatter.) If there were a strong correlation, we’d more likely see something like this:


There’s a clear overall trend, even though there’s some scatter in the data. The more scatter, the less correlation and the less predictive ability.
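If you'd rather put a number on "how much scatter" than judge it by eye, one common choice is the Spearman rank correlation, which compares two rankings of the same drivers. I didn't compute it for the graphs in this post, so treat the following as a minimal sketch. The 40-driver field is made up, and the shortcut formula assumes there are no ties in either ranking.

```python
def spearman_rho(race_finish, season_rank):
    """Spearman rank correlation for two tie-free rankings of the same drivers."""
    n = len(race_finish)
    if n != len(season_rank) or n < 2:
        raise ValueError("need two rankings of equal length (at least 2 drivers)")
    # Sum of squared differences between each driver's two positions.
    d_squared = sum((f - r) ** 2 for f, r in zip(race_finish, season_rank))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical 40-driver field, not real results.
season_rank = list(range(1, 41))
perfect_finish = list(season_rank)      # race order matches season order exactly
reversed_finish = season_rank[::-1]     # champion finishes last, and so on

rho_perfect = spearman_rho(perfect_finish, season_rank)
rho_reversed = spearman_rho(reversed_finish, season_rank)
```

A value near 1 means the race order tracks the season order closely, a value near 0 means the "thrown at the chart" look, and a value near -1 would mean the order is flipped.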

One more practice run. Here’s the same data as above, but I’ve introduced some luck — good and bad. In this simulation, some of the top season finishers wrecked out of the race and a few of the backmarkers got lucky.

The overall strong trend is still there, but a couple of data points (the ones highlighted) are way off the line. The data points below and to the right of the line are drivers who performed worse than their season-ending rank would predict. For example, the season champion gets caught up in an accident early and finishes dead last. The data points to the left and above the line are drivers who are overperforming.
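The "way off the line" idea can be written down as a simple rule: compare each driver's race finish with his or her season-ending rank and flag the big gaps. This is only a sketch. The driver names, the results and the 15-position threshold are all invented for illustration, and the highlighted points in the simulation above were not picked this way.

```python
def flag_outliers(results, threshold=15):
    """Label each driver by how far the race finish sits from the season rank.

    results: list of (driver, race_finish, season_rank) tuples.
    threshold: hypothetical cutoff, in positions, for calling a point an outlier.
    """
    labels = {}
    for driver, finish, rank in results:
        gap = finish - rank  # positive means a worse race than the season rank suggests
        if gap >= threshold:
            labels[driver] = "underperformed"
        elif gap <= -threshold:
            labels[driver] = "overperformed"
        else:
            labels[driver] = "on trend"
    return labels


# Made-up examples matching the scenarios described above.
results = [
    ("Driver A", 40, 1),   # season champion wrecks early and finishes last
    ("Driver B", 3, 4),    # finishes about where the season rank predicts
    ("Driver C", 5, 35),   # backmarker who got lucky
]
labels = flag_outliers(results)
```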

Does the Daytona 500 Determine Your Season?

I split the data up to consider drivers that didn’t finish the race. The blue circles represent cars running at the end of the race and the orange squares are the DNFs.

As I described this to my husband, it looks like someone took 40 data points and threw them at the chart.

In other words, where you finish in the Daytona 500 has no correlation to your overall season.

I’ve written before about the inherent randomness of restrictor plate races.

I suggest that plate races are as much about luck as they are about driving skill. I decided to test that assertion by looking at all four restrictor-plate races on one graph.

The circles are the 500, the triangles are the July race and…

Well, it doesn’t really matter, does it? If you looked in the dictionary for the word ‘random’, you’d find this graph.

None of the plate races shows any correlation with a driver’s season-long ranking, so you’d be hard pressed to argue that how a driver finishes at the plate races has much of an impact on his or her season, aside from the obvious implications of getting him or her into The Chase.

But Do Any Races?

A driver’s season is 36 races, so you might wonder if we should expect any of the races to be correlated to the driver’s final rank. Every driver in the top 10 has finishes in the bottom 10 over the course of a season.

Does any single race correlate with a driver’s season?

1.5 Mile Tracks

You would expect that, since 1.5 mile tracks make up the largest number of races run, the 1.5 milers might be most likely to correlate.

You would be right. But some tracks correlate better than others.

Texas

I’d say Texas is average. (Texans, of course, will disagree.) I took the liberty of drawing a line to show you the trend. (No, I didn’t do a least-squares fit. I just eyeballed it.)
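For anyone who wants to replace the eyeballed line with a fitted one, an ordinary least-squares line takes only a few lines of code. This is a sketch, not what was done for the graphs here: the six data points are invented, and it treats race finish as x and season-ending rank as y to match the axes described earlier.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: six drivers whose race finish is close to their season rank.
race_finish = [1, 2, 3, 4, 5, 6]
season_rank = [2, 1, 4, 3, 6, 5]
slope, intercept = least_squares_line(race_finish, season_rank)
```

A slope close to 1 with a small intercept means the race order looks a lot like the season order. With real data, the DNFs would probably need to be fitted separately or left out, since they sit well off the trend.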

Some 1.5 milers show more correlation, some less.

Charlotte (Coca-Cola 600)


You can see the effect of the larger number of accidents here. The correlation only holds over the drivers who finish, but there’s a little less scatter here than in Texas.

Vegas, Baby

I didn’t do least-squares fits to test this assertion, because frankly, there aren’t enough hours in the day, but I’d argue that Las Vegas is probably the 1.5 mile track with the best predictive ability.  There’s not much scatter.

 

There are, however, some obvious outliers and it’s good to check and make sure we understand why those data points are outliers.  I labeled the four drivers who fall far from the line.

  • Kyle Busch finished on the lead lap, but he was the last car on the lap. He had a pit road speeding penalty on lap 125/267.
  • Austin Dillon finished a lap down after getting a flat RF tire on lap 132 and having another tire go down on the last lap.
  • Kurt Busch finished 4 laps down and had an electrical issue that made him turn off most of the fans in the car. He had to replace batteries during a green flag on lap 201.
  • Ricky Stenhouse Jr. finished 6 laps down and was in a backup car.

Everyone else fell pretty much on the line.

Non-1.5-Mile Tracks

But, you say, of course there’s a correlation. Most of the season is 1.5-mile tracks, right? It turns out many of the other tracks show similar correlation. Let’s go to the data.

Pocono (Spring)

This one’s sort of interesting. There’s some scatter around the top finishers (a couple mid-packers had really good finishes), but the correlation once you get past about 10th place is pretty good.

Interesting note about the DNF drivers: The four lower red dots represent: Kahne, Earnhardt Jr, McMurray and Johnson. The three upper ones that lie along the line are Whitt, DiBenedetto and Jeffrey Earnhardt. When lower-ranking drivers DNF, it still fits the trend because they end the season toward the lower end of the rankings.

Bristol

Bristol is the definition of chaos, so you’d expect the graph for Bristol to look more like Daytona, right? The graph below combines the spring and fall races. (Fall is squares and spring is circles.)

Yes, there is a bit of scatter, but it’s surprisingly correlated.

Other Tracks Showing Correlations

Other tracks that show decent correlation: Fontana, Phoenix, and Michigan.

What Tracks Aren’t Correlated?

Nothing is as random as the plate tracks, but some of the other tracks show varying degrees of non-correlatedness.

Road Courses

This shouldn’t surprise you. Even though NASCAR drivers are (as a group) much better road racers than they were in years past, they don’t get a lot of practice given we only run two road-course races per year.  If you squint, you might claim there’s a trend, but you have to squint pretty hard.

The road course results look horrendous compared to Las Vegas, but they’re nothing compared to the plate tracks.

Martinsville

Darlington, Dover and Indy are also rather random.

So Daytona Doesn’t Predict the Rest of the Season?

Not so fast.

The giant “but” here is that drivers are human. (Although if you catch a crew chief in a bad mood, he or she may argue the point.)

Some drivers are good at moving on from disaster. Others stew about and internalize every mistake. They can’t put the last negative behind them and it takes their mind off what they’re doing now, so they make more errors. Think about the driver who has a horrible pit stop. What are the first words he says over the radio? “It’s okay guys, we can make this up” or “Y’all are morons”?

This is the great thing about sports. If we could predict the outcomes based on math, why watch? It’s the unpredictable that makes sports interesting and most of that unpredictability comes from the human element: the driver who gets up on the wheel and takes a chance; the crew member who forgot to make sure the ballast was secured; the crew chief who talks his driver down from the ledge after that crappy pit stop.

So while statistically, you shouldn’t worry if your driver has a bad Daytona, you might take into consideration how resilient your team really is. I’ll take a resilient team that has the occasional screw up over the one that’s perfect until there’s a mistake and then everything falls apart.

Does Winning the Clash Give You An Advantage in the Daytona 500?

While thinking about predictive ability, I started to wonder whether there was any correlation between winning the Clash (or Equivalent) and winning the Daytona 500. I analyzed data from 2002-2017. This histogram shows how Clash winners placed in the Daytona 500.

The winner of the Clash won the Daytona 500 once in 16 years (6.25%), but came in 2, 3, 4 or 5 four times (25%).

Note, however, that you have an equal chance (25%) of coming in 2, 3, 4 or 5 as you do of coming in 36th or worse.

What About Winning the Duels?

Same thing here. I compare the 32 Duel winners from 2002-2017 with how they finished in the Daytona 500.

Only two drivers won their duels and went on to win the Daytona 500. As with winning the Clash, a Duel winner has the same chance (21.9%) of coming in 2, 3, 4 or 5 as of coming in 36th or worse.
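The percentages in these two sections are just counts divided by the number of winners, which makes them easy to double-check. In the sketch below, the Clash counts (1 and 4 of 16) and the Duel win count (2 of 32) are quoted directly in the text. The count of 7 for Duel winners finishing second through fifth is my inference from the 21.9% figure, not a number given outright.

```python
def share(count, total):
    """Percentage of `total` winners who landed in a given finishing bin."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * count / total


# Clash winners, 2002-2017: 16 winners.
clash_won_500 = share(1, 16)
clash_second_to_fifth = share(4, 16)

# Duel winners, 2002-2017: 32 winners (two Duels per year).
duel_won_500 = share(2, 32)
duel_second_to_fifth = share(7, 32)   # 7 is inferred from the quoted 21.9%

summary = (
    f"Clash: {clash_won_500:.2f}% won, {clash_second_to_fifth:.1f}% finished 2nd-5th; "
    f"Duels: {duel_won_500:.2f}% won, {duel_second_to_fifth:.1f}% finished 2nd-5th"
)
```

With samples this small, each extra result moves the Clash numbers by more than six percentage points, so small differences between bins shouldn't be read as meaningful.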

TL;DR

  • How a driver finishes at Daytona — or any plate race — has no correlation with his or her final ranking
  • Some tracks do show correlations (of varying degrees) with final position.
    • All of the 1.5 milers show a correlation, with Vegas being the strongest in my 2017 data set
    • Pocono, Bristol, Fontana, Phoenix and Michigan also show good correlations
  • Some tracks show much less of a correlation
    • Road Courses
    • Martinsville
    • Indy
    • Dover
    • Darlington
  • Winning the Clash or the Duels doesn’t indicate you’ve got a better shot at winning the Daytona 500

Also published on Medium.

4 thoughts on “How Well Does Daytona Predict A Driver’s Season?”

    1. I try to pick the things that surprise me. I hadn’t expected there to be any correlations at all. Once I saw the correlations for the 1.5 mile tracks, I thought “Well, of course. So many tracks are 1.5 miles.” Then Pocono was correlated, too. Thanks so much for taking the time to comment. DLP

  1. I knew those RP races were basically crapshoots. Thanks for proving it graphically.
    I’d bet if the wave around entitlement program wasn’t so prevalent, that the correlation between finishing order and final standings would be much more in line for all tracks except the RP and road course tracks. Those free laps back allow drivers that normally finish at the bottom to get better finishes than they deserve (based on how good their car was).

    1. That’s a really interesting observation about the wave around. I wonder if going back and looking at the data before the wavearound was implemented would yield different results. (I’ll have to put that on the to-do list. So much data and so little time…) Thank you for reading!
