A recent tweet from Twitter user “Vansh” highlighted an often ignored problem in the tennis stats industry. Check out the discrepancies from the Monte Carlo final between Holger Rune and Andrey Rublev below. The first screenshot is from the TennisTV broadcast that Robbie Koenig backed as very reliable:
The second is from Infosys, which partners with the ATP organisation:
Such a wide margin of difference in the unforced error variable between the two companies highlights the problem tennis has with a term like “unforced error”. What is an unforced error? The short answer is “it depends” but it is a problem when analysts (including yours truly) rely on such data points when breaking down matches and crafting a narrative around why such and such won or lost. In the past I have statted my own matches using metrics that accounted for whether the player was moving or not when they made an error, but this still doesn’t capture a lot of the nuance. Craig O’Shannessy argues that there really aren’t any unforced errors:
“But next to nothing happens in a match without some form of pressure. A player marginally leans to the left in the middle of a point, and the opponent catches it in his or her peripheral vision and changes a shot at the last second seeking a strategic advantage. That is labeled an unforced error. It’s not. There is always something forcing or influencing shot selection. Or say a player struggles with slow, high balls in the middle of the court. They look easy, but they are missed a lot because players make contact high, out of their strike zone, and there is no power to work with. Is that an unforced error, or just smart strategy by the opponent?”
But if we then marked every miss simply as an error it still wouldn’t tell us a lot; it could be very misleading. I am thinking of doing my own stats again for match analyses and then comparing with the TennisTV and Infosys numbers to see what differences we get. Below are some scenarios that explain how I would mark such a point. First of all, winners.
A winner is simply a shot that wins the point without the opponent touching the ball, but consider this scenario: a player hits a great serve and gets a very easy, very short forehand on top of the net that they promptly dispatch for a winner. What won the point? Marking that as a forehand winner doesn’t really convey an accurate picture of how the point was won. It pads the forehand numbers and tells us nothing about how good the serve was. Now what if the opponent guesses right and barely gets a play on the ball? Then it isn’t counted as anything; it gets lost in the uncharted “forced error” category. The grey area then becomes: how short and easy must the ball be to not warrant counting the forehand as a winner? What if it’s off a low slice? My best answer is that if the player is clearly inside the service box and the ball is above the net, that’s the cutoff for me. Such an instance usually means some prior shot has done the bulk of the work to earn such an easy ball. There is still plenty of subjectivity in that, but stats are there to help us tell the story rather than to tell the whole story anyway. The following examples will give you an idea of how I am planning to stat matches and judge shots moving forward. Of course, I might change this after getting some feedback as it is still a rough draft.
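As a rough illustration of that cutoff, here is a minimal Python sketch. The `Contact` type and its two fields are hypothetical names invented for this example; in practice both would be judgment calls made by whoever is charting the match.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """Where the point-ending shot was struck (hypothetical fields)."""
    inside_service_box: bool  # hitter is clearly inside the service box
    ball_above_net: bool      # ball is above net height at contact


def counts_as_groundstroke_winner(contact: Contact) -> bool:
    """Return False for easy put-aways that a prior shot has set up."""
    is_easy_putaway = contact.inside_service_box and contact.ball_above_net
    return not is_easy_putaway


# A sitter on top of the net after a big serve does not pad the forehand numbers.
print(counts_as_groundstroke_winner(Contact(True, True)))
# A low slice dug out from inside the box still counts.
print(counts_as_groundstroke_winner(Contact(True, False)))
```

Both conditions have to hold before the winner is discounted, so a short but low ball is still credited to the shot that finished the point.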
Winner under pressure
A winner under pressure usually involves the player having to move and hit the ball either on the run, from a deep position, or off a tough ball. Depth, spin, and pace of the incoming ball also factor in. I also lump forcing shots into this category (these are usually counted as forced errors for the opponent; here I just count them as winners for the player). Here are some examples I am considering labelling as winners under pressure.
Now the second instance, where Popyrin hits a forehand winner off the backhand return, is something that I don’t think any other statistic captures, but I believe that marking it that way is a fairer representation of the point; it was a great backhand return that won him the point and should get reflected as a winner.
Winner no pressure
These are the bread-and-butter plays that top professionals are expected to make. The term “no pressure” is obviously just used to juxtapose the “under pressure” category—I agree with Craig that every shot has some element of pressure, but for the sake of clarity I’ll use this term. It usually involves the player having a shot with more time, a little shorter in the court, with the opponent pulled off to one side, etc. Below are some examples.
Note: I also consider balls that the opponent doesn’t get a decent play on, but they touch the ball, as winners for the player, like this second serve return from Alcaraz below:
Error under pressure
These are basically forced errors. What is hard to do is draw the line between calling it an error under pressure for one player, rather than a winner for the other player (as I am counting winners for a player even if the opponent touches the ball). I draw the line (approximately) when the player attempts a topspin shot, like Auger Aliassime below on his backhand, but is being rushed or pulled wide. In this case, they might have made the shot had they chosen a more defensive option, such as a slice or lob. So when the player is trying to play pure defense (usually slice, usually very stretched, as with Bautista Agut in the above example), that’s when it becomes more of a winner for the other player, rather than an error under pressure.
Error no pressure
These are pretty straightforward, and are probably what traditional tennis statistics label as unforced errors a lot of the time. The player is usually not moving, not rushed, and not in a terrible position to strike the ball.
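To make the four categories concrete, here is a minimal Python sketch of one way the decision could be written down. The `Shot` fields and function names are hypothetical, and labelling a credited forcing shot by the pressure its hitter was under is an assumption of this sketch rather than a settled rule.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    WINNER_UNDER_PRESSURE = "winner under pressure"
    WINNER_NO_PRESSURE = "winner no pressure"
    ERROR_UNDER_PRESSURE = "error under pressure"
    ERROR_NO_PRESSURE = "error no pressure"


@dataclass
class Shot:
    """A charted shot (hypothetical fields, judged by eye)."""
    under_pressure: bool   # on the run, rushed, deep, or handling a tough ball
    topspin_attempt: bool  # False for pure defense such as a stretched slice or lob


def label_winner(shot: Shot) -> Label:
    """Label a shot that won the point for its hitter."""
    if shot.under_pressure:
        return Label.WINNER_UNDER_PRESSURE
    return Label.WINNER_NO_PRESSURE


def label_miss(miss: Shot, previous: Shot) -> tuple[str, Label]:
    """Label a missed shot; `previous` is the opponent's shot that drew the miss.

    Returns who receives the entry ("misser" or "opponent") and the label.
    """
    if not miss.under_pressure:
        return ("misser", Label.ERROR_NO_PRESSURE)
    if miss.topspin_attempt:
        # Rushed or pulled wide, but still went for a topspin shot.
        return ("misser", Label.ERROR_UNDER_PRESSURE)
    # Pure defense: credit the opponent's forcing shot as a winner instead.
    return ("opponent", label_winner(previous))
```

For example, `label_miss(Shot(True, False), Shot(False, True))` credits the opponent with a winner, while the same miss with a topspin attempt goes down as an error under pressure for the player who missed.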
Other Notes
First-serve returns aren’t counted. TennisTV usually accounts for this with their unreturned serve (URS) metric, which is mostly a reflection of first-serve returns missed by the returner. However, if the player hits a winner on a first-serve return I will count that. It happens pretty rarely, but I think it reflects great returns more than bad serves.
Second-serve returns are counted unless the serve is particularly hard, heavy, or out of reach, to the point where it clearly caught the returner off guard. I’m trying to capture just stock second-serve returns here, such as the Auger Aliassime miss above.
Drop shots, rally ball slices, volleys, smashes, very short swing volleys etc. are not counted. A deeply struck swinging volley on a forehand, for example, I would count as a forehand winner no pressure if the player hit it around or behind the service line.
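The counting rules in these notes can be sketched as a single filter. This is only an illustration: the `ShotRecord` fields and the stroke names are hypothetical, and real charting would involve more judgment than a few flags.

```python
from dataclasses import dataclass
from typing import Optional

# Strokes left out of the tally (hypothetical stroke names).
EXCLUDED_STROKES = {"drop shot", "slice", "volley", "smash"}


@dataclass
class ShotRecord:
    stroke: str                        # e.g. "forehand", "backhand", "swing volley"
    is_winner: bool = False
    return_of: Optional[str] = None    # "first", "second", or None for a rally shot
    serve_was_exceptional: bool = False     # serve clearly caught the returner off guard
    at_or_behind_service_line: bool = True  # only consulted for swing volleys


def is_counted(shot: ShotRecord) -> bool:
    """Decide whether a shot enters the winner/error tallies."""
    if shot.return_of == "first":
        # First-serve returns only count when they are winners.
        return shot.is_winner
    if shot.return_of == "second" and shot.serve_was_exceptional:
        return False
    if shot.stroke in EXCLUDED_STROKES:
        return False
    if shot.stroke == "swing volley":
        # Very short swing volleys are skipped; deeper ones are kept.
        return shot.at_or_behind_service_line
    return True
```

For instance, `is_counted(ShotRecord("backhand", return_of="first"))` is `False`, while the same return struck for a winner is kept.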
As much of the game is ultimately founded on topspin baseline consistency, I am trying to capture how well players are hitting their topspin shots from the baseline. Starting with the Barcelona analysis this weekend, I’ll see how different the stats are between the various organizations. Usually, my numbers capture a lot more errors and a lot fewer winners, but I think this is a better way to assess matches going forward.
I’ve thrown this together pretty quickly, so let me know in the comments if there seems to be some glaring weakness/missing item here.
Nice article, as always. This can be solved with AI. An unforced error should be based on a prediction model producing a probability of the average ATP player making the shot. If the probability exceeds a threshold (say 80%), then it's an unforced error. One could then adjust the threshold however one likes.
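The threshold rule in this comment could be sketched as below. The prediction model itself is not specified anywhere here, so `make_probability` is assumed to come from some hypothetical model, and only the thresholding step is shown.

```python
def is_unforced(make_probability: float, threshold: float = 0.8) -> bool:
    """Flag a missed shot as unforced when the make probability exceeds the threshold.

    `make_probability` is assumed to be a model's estimate that an average
    ATP player makes the shot; the 0.8 default is the "say 80%" from the comment.
    """
    if not 0.0 <= make_probability <= 1.0:
        raise ValueError("make_probability must be between 0 and 1")
    return make_probability > threshold


print(is_unforced(0.9))                   # an easy ball, missed
print(is_unforced(0.5))                   # a coin-flip shot
print(is_unforced(0.75, threshold=0.7))   # a stricter definition of "unforced"
```

Lowering the threshold labels more misses as unforced, which is the adjustment the comment describes.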
Serve + 1 seems a legitimate stat to capture effectiveness. It would be interesting if AI could track shot selection and come up with, essentially, a tennis IQ factor.