The winner of the Kentucky Derby should be predictable. As with any competition that has standardized practices, certain metrics can be linked to a higher probability of winning. In addition, horses competing at the Kentucky Derby are the “best of the best,” so those metrics vary less from horse to horse than they do in most other races. Good news: There have been 143 Kentucky Derbies from which to collect data.
Bad news: I’m only using data from the past 5 so that I can utilize the point system from the Road to the Kentucky Derby that started in 2013.
These points feed a model that predicts each horse’s finish, alongside the number of prep races the horse ran, its post number, expert picks, and the ratings given to it by various outlets. Training the model on the races that came before a specific derby and testing it on that derby returns a success rate of 100% for selecting the 1st place winner.
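The train-on-earlier-races, test-on-the-next-derby procedure can be sketched roughly as below. This is a minimal illustration and not my actual model: `walk_forward_splits`, `winner_hit_rate`, and the `fit` / `predict_winner` callables are hypothetical names, and the layout of `races` (a dict keyed by year, each with `"entries"` and `"winner"`) is an assumption made only for the example.

```python
from typing import Any, Callable, Dict, Iterator, List, Sequence, Tuple


def walk_forward_splits(
    years: Sequence[int], min_train: int = 1
) -> Iterator[Tuple[List[int], int]]:
    """Yield (training_years, test_year) pairs in chronological order.

    Each test year is evaluated using only the years that came before it.
    """
    ordered = sorted(years)
    for i in range(min_train, len(ordered)):
        yield ordered[:i], ordered[i]


def winner_hit_rate(
    races: Dict[int, Dict[str, Any]],
    fit: Callable[[List[Dict[str, Any]]], Any],
    predict_winner: Callable[[Any, Any], str],
    min_train: int = 1,
) -> float:
    """Fraction of test derbies where the predicted winner actually won.

    `fit` and `predict_winner` are placeholders for whatever model is used.
    """
    hits = 0
    total = 0
    for train_years, test_year in walk_forward_splits(list(races), min_train):
        # Fit only on derbies that were run before the one being tested.
        model = fit([races[y] for y in train_years])
        picked = predict_winner(model, races[test_year]["entries"])
        hits += int(picked == races[test_year]["winner"])
        total += 1
    return hits / total if total else float("nan")


if __name__ == "__main__":
    # Assumed span: the five derbies run under the points system so far.
    for train, test in walk_forward_splits(range(2013, 2018)):
        print(f"train on {train}, test on {test}")
```

With `min_train=1`, the first test derby is predicted from a single prior year of data, which corresponds to the smallest training set described here.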
Calling this model foolproof would be naïve, since it is borderline terrible at predicting anything but 1st place finishes. It also leans on expert picks, which can shift depending on how successful the experts believe they have been. Horses also have the potential to be busts despite being favored by several metrics (I’m looking at you, 2017).
Additionally, because I already knew the outcome of each Kentucky Derby, I could have inadvertently biased the model. However, correctly picking the winner 100% of the time, even when the training set is reduced to only 1 year of data, speaks to how effective this methodology is at predicting the winner. That is substantial considering how much variability there is in horse racing, especially with a sample of only 5 years.
I’ll be placing a win bet on Good Magic.