The impact of hole variation on scoring
When Professor Mark Broadie first conceptualised the strokes gained statistic, now a ubiquitous metric in the sport, distance to the pin was one of the two independent variables from which the expected number of strokes was calculated. As theorised by Prof. Broadie, the greater the distance to the pin (all else being equal, namely lie quality), the higher the expected number of strokes the player will need to complete the hole. However, the relationship between the remaining distance to the pin and the expected number of strokes is not directly proportional. Let’s look at how the wide variation in hole length on the European Tour influences the scoring statistics of professional golfers.
For this analysis it was crucial to separate the holes into their corresponding par groups (i.e. par 3’s, par 4’s and par 5’s) before calculating scoring probabilities based on changes in hole length. Below we have plotted the scoring average of every round on the European Tour as a function of the hole distance (in yards).
The plots above represent the scoring average of every round and hole played on the European Tour since the beginning of 2017, showing the general trends in scoring for par 3’s, 4’s and 5’s. As mentioned previously, distance plays a key role in scoring, with a very strong correlation between hole length and average score to par.
Although the majority of observations cluster around these trendlines, a number of them clearly defy the general patterns and stand as outliers. These holes recorded a scoring average significantly different from the expected scoring average because external factors influenced the difficulty of the hole enough to outweigh the expected effect of distance.
By extracting a few examples of the outliers, we can examine the exact cause of the discrepancies in scoring average. An example of this is the par 3 5th hole at Al Mouj Golf Course played during the Oman Open 2019, which recorded a scoring average of 3.26 strokes during round 2.
On paper, this short 150-yard hole should play fairly easy (expected scoring average = 2.90 strokes), but a look at the hole design and the weather conditions makes it clear why it played so much harder. The hole is an island-green par 3 surrounded by a water hazard, which explains why it plays more difficult than average. In addition, the second round was hit by a sandstorm with winds so strong (gusts up to 63 km/h) that all golf was eventually suspended.
Instances like this highlight the importance of calculating round and hole-level adjustment values for raw strokes gained baselines. This helps to account for any abnormal variations in scoring due to hole layout and/or weather conditions. When making these adjustments to the static strokes gained baseline, it is important that the baseline itself is modelled accurately.
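As a minimal sketch of what such a hole-level adjustment could look like, the snippet below takes the difference between the observed and expected scoring averages for the Al Mouj example. The function name and the simple subtraction are our own assumptions for illustration; the adjustment used in practice may be more involved.

```python
def hole_adjustment(observed_avg: float, expected_avg: float) -> float:
    """Strokes by which a hole played harder (+) or easier (-) than its baseline.

    Assumption: the adjustment is a plain difference between the observed
    scoring average and the distance-based expected scoring average.
    """
    return observed_avg - expected_avg


# Al Mouj 5th hole, Oman Open 2019, round 2 (figures quoted in the text).
adjustment = hole_adjustment(observed_avg=3.26, expected_avg=2.90)
print(f"{adjustment:+.2f} strokes")  # +0.36 strokes
```

In other words, the hole played roughly a third of a stroke harder than its length alone would suggest.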
The plots above show how different regression models (linear vs. lowess) can produce different baselines, and how a poorly chosen model may smooth over subtle changes in scoring average.
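To make the difference concrete, here is a small, self-contained sketch that fits both a straight line and a simplified lowess-style smoother (local linear fits with tricube weights, no robustness iterations) to made-up data containing a step. The data, the function names and the `frac` setting are all illustrative assumptions, not the tour data or the model settings used for the plots.

```python
def linear_fit(xs, ys):
    """Ordinary least squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def lowess_at(x0, xs, ys, frac=0.4):
    """Locally weighted linear estimate at x0 using tricube weights."""
    k = max(2, round(frac * len(xs)))               # neighbours in the window
    h = sorted(abs(x - x0) for x in xs)[k - 1]      # bandwidth: k-th nearest distance
    ws = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    if sxx == 0:
        return my
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    return my + (sxy / sxx) * (x0 - mx)


# Hypothetical par 5 data: flat, then a jump between 550 and 580 yards.
lengths = list(range(500, 660, 10))
scores = [-0.60 if x <= 550 else -0.30 if x >= 580
          else -0.60 + 0.30 * (x - 550) / 30 for x in lengths]

slope, intercept = linear_fit(lengths, scores)
linear_pred = [intercept + slope * x for x in lengths]
lowess_pred = [lowess_at(x, lengths, scores) for x in lengths]

sse_linear = sum((p - y) ** 2 for p, y in zip(linear_pred, scores))
sse_lowess = sum((p - y) ** 2 for p, y in zip(lowess_pred, scores))
print(f"SSE linear: {sse_linear:.4f}, SSE lowess: {sse_lowess:.4f}")
```

On this toy data the straight line averages across the step, while the local fit tracks the flat sections on either side of it, which is the kind of detail a static baseline should not lose.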
A good example of this can be seen when modelling the par 5 scoring averages, where there is a sudden but obvious increase in scoring average between 550 and 575 yards. The average driving distance on the European Tour in 2019 was 297 yards, so after an average drive a 550-yard hole leaves 253 yards to the pin and a 575-yard hole leaves 278 yards. At these distances, the majority of European Tour professionals will have difficulty reaching the green with their second shots, leading to a significant reduction in the percentage of players that hit the green under regulation. This causes a significant drop in the number of players that make birdie or better, increasing the scoring average as a result.
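The arithmetic behind those second-shot distances is simple enough to write down. The sketch below assumes a dead-straight hole and a drive of exactly the tour average, both simplifications of ours, and the function name is hypothetical.

```python
AVG_DRIVE_YARDS = 297  # European Tour average driving distance in 2019 (from the text)


def remaining_after_drive(hole_length_yards: int, drive_yards: int = AVG_DRIVE_YARDS) -> int:
    """Yards left to the pin after one drive, assuming a straight hole."""
    return hole_length_yards - drive_yards


for hole in (550, 575):
    print(hole, "yard hole ->", remaining_after_drive(hole), "yards to the pin")
```

The 25 yards between the two holes carries straight through to the second shot, which is where reaching the green in two starts to become unrealistic for most of the field.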
When calculating scoring average statistics it is important to take into account the differences in player ability between event fields. To do this, we use a strength of field variable (based on the 15th Club performance index player values prior to the event) to make hole-level adjustments to the scoring average. This accounts for event-to-event variations in player ability and allows hole scoring averages to be compared across events.
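The exact form of the strength of field adjustment is not spelled out here, so the following is only one possible sketch. It assumes, purely for illustration, that field strength is expressed in strokes per round relative to an average field (positive means a stronger field) and that it is spread evenly over 18 holes. Neither the function name nor that convention comes from the 15th Club performance index itself.

```python
HOLES_PER_ROUND = 18


def field_adjusted_average(raw_hole_avg: float, field_strength_per_round: float) -> float:
    """Hole scoring average re-expressed for an average-strength field.

    Assumption: a field that is `field_strength_per_round` strokes better than
    average over 18 holes is better by an equal share of that on every hole,
    so the share is added back to the raw average.
    """
    return raw_hole_avg + field_strength_per_round / HOLES_PER_ROUND


# Hypothetical example: a par 4 averaging 4.10 against a field 0.9 strokes
# per round stronger than average.
adjusted = field_adjusted_average(raw_hole_avg=4.10, field_strength_per_round=0.9)
print(f"{adjusted:.2f}")
```

Under this convention the same hole would be expected to play slightly harder for an average field, so its adjusted scoring average moves up.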
The plot above shows how the probability of making a birdie or better and a bogey or worse changes as hole length increases. Par 3 and par 4 holes seem to follow a very similar trend in terms of birdie and bogey probabilities, with both seeing a fairly linear increase in bogey probability with distance and a more exponential decay in birdie probability. Par 5’s never experience an overlap between birdie and bogey probabilities, although the two grow ever closer as distance increases.
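For readers who want to reproduce this kind of curve, a rough sketch of the calculation is shown below: group hole scores into length bins and count the share of birdies or better and bogeys or worse in each. The record layout, the 25-yard bin width and the sample scores are all assumptions made up for the example.

```python
from collections import defaultdict


def birdie_bogey_probs(records, bin_width=25):
    """Return {bin_start: (P(birdie or better), P(bogey or worse))}.

    `records` is an iterable of (hole_length_yards, score_to_par) pairs.
    """
    counts = defaultdict(lambda: [0, 0, 0])  # [birdies or better, bogeys or worse, total]
    for length, to_par in records:
        bin_start = (length // bin_width) * bin_width
        bucket = counts[bin_start]
        bucket[0] += to_par < 0
        bucket[1] += to_par > 0
        bucket[2] += 1
    return {b: (c[0] / c[2], c[1] / c[2]) for b, c in sorted(counts.items())}


# Hypothetical par 3 scores: (hole length in yards, score relative to par).
sample = [(152, -1), (160, 0), (171, 0), (168, 1),
          (230, 1), (236, 0), (228, 1), (241, -1)]

for bin_start, (p_birdie, p_bogey) in birdie_bogey_probs(sample).items():
    print(f"{bin_start}-{bin_start + 24} yds: birdie {p_birdie:.2f}, bogey {p_bogey:.2f}")
```

The crossover point discussed in the text is simply the bin where the two probabilities meet.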
As expected, the average scoring variation of holes increases as the par of the holes increases. Interestingly, both par 3 and par 4 holes seem to have a u-shaped trend, with scoring variation dropping initially as hole length increases, only to rise again once it has passed its low point. The point of lowest scoring variation seems to coincide almost perfectly with the point at which the average scoring on the holes reaches equilibrium (average score to par = 0 and birdie and bogey probabilities are equal).
Conversely, par 5 scoring variation follows the opposite, n-shaped trend: it increases initially, peaks at a hole length of ~550 yards, and then falls.
It would be expected that the scoring variation for par 5’s would regress to a natural low, coinciding with the point at which scoring reaches equilibrium (average score to par = 0 and birdie and bogey probabilities are equal), as seen with par 3’s and par 4’s. Assuming the linear trend continues (which is fairly unlikely), the hole length required for par 5’s to have an average score of level par would be in the region of 700 yards.
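The extrapolation itself is just a matter of finding where a straight line crosses zero. The sketch below shows the step; the slope and intercept are hypothetical values chosen only so the arithmetic lands near the figure quoted above, and are not the fitted coefficients from our model.

```python
def level_par_length(slope: float, intercept: float) -> float:
    """Hole length at which a linear score-to-par trend crosses zero.

    Solves intercept + slope * length = 0 for length.
    """
    if slope == 0:
        raise ValueError("a flat trend never reaches level par")
    return -intercept / slope


# Hypothetical coefficients: score to par rises 0.003 strokes per yard
# from an intercept of -2.10 strokes.
print(round(level_par_length(slope=0.0030, intercept=-2.10)))  # 700
```

Any curvature in the real trend would move this crossing point, which is why the linear assumption should be treated with caution.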
So what does this mean for overall scoring? The short of it is that hole distance has a different influence on holes of differing pars. Looking at par 3’s and 4’s, we see the largest variation in scoring average for holes that are either unusually long or unusually short. The reason is that, by and large, either everyone can reach and the challenge is accuracy (think the Postage Stamp at Troon) or it’s a monster for everyone (think No. 2 at Shinnecock Hills during the 2018 U.S. Open, which stretched to 252 yards). Excluding the extremities (the longest 1% of players), a hole of that length toys with the finest of margins, even for tour professionals. Only by flirting with those extremities do we see the largest variance in scoring.
However, with par 5’s the opposite can be said. Shorter and longer holes tend to have the most consistent scoring. It’s the middle ground, the 550-575 yarders, that causes the biggest variation in scoring. Again, the key factor here is risk. On a shorter hole, either getting home in two or setting up an easy up and down is fairly simple for the average tour pro. Similarly, the monster par 5’s (think the 9th hole at Green Eagle GC, Hamburg, Germany, a sweet 647 yards) are, excluding the freakishly long, unreachable in two (only 13 of 1,134 attempts, about 1.1%, have been successful in hitting the green under regulation). The risk of going for it in two is therefore removed, and we fall back to a three-shotter and a more consistent scoring pattern.
Only with smart course design do we find the largest variance in scoring and the most rounded golfer, be that physically and/or mentally, rising to the top. Only by drawing the golfer into risk are we rewarded with the best player over a 72-hole tournament.