A Little Clutch Hitting Study

Last week’s post on clutch ability got me thinking about another way to identify clutch hitting. Instead of comparing performance in aggregate data, I wanted to look at the probability that a hitter would perform in an individual plate appearance using past performance metrics as predictors. The degree to which past clutch performance predicted actual performance would tell us something about clutch ability, while controlling for other factors.

So, as I watched Nick Punto surpass Lonnie Smith for the most-memorable baserunning error in Metrodome history, I pulled up an old data file (via Retrosheet) that Doug Drinen and I had used to study protection. I had a four-year sample of individual plate appearances from 1989–1992. I estimated each player’s performance with runners in scoring position (RISP) from 1989–1991 to see how it predicted 1992 performance in RISP plate appearances. The idea is that if players have any ability to perform with higher stakes, then past performance in this area should affect the probability of success during individual plate appearances. The nice thing about such granular data is that it is possible to control for factors such as pitcher quality and the platoon advantage, effects that are difficult to tease out of aggregate data.

I used probit models to estimate the likelihood that a player would get a hit (1 = hit; 0 = otherwise), or get on base (1 = hit, walk, or hbp; 0 = otherwise) controlling for the player’s seasonal performance in that area (AVG or OBP), RISP 1989–91 performance in that area, whether the platoon advantage was in effect (1 = platoon; 0 = otherwise), and the pitcher’s ability in that area. To test hitting power, I used a negative binomial count regression to estimate the expected number of total bases during the plate appearance, with the player’s RISP SLG 1989–1991 as a proxy for clutch skill in this area.
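To make the variable definitions concrete, here is a minimal sketch in Python of how a single plate appearance could be coded into the three dependent variables and the platoon indicator. The `PlateAppearance` record and its event labels are hypothetical, not the layout of the actual Retrosheet file, and the platoon rule assumed here (batter and pitcher of opposite handedness) ignores switch hitters.

```python
from dataclasses import dataclass

# Total bases credited for each hit type; every other event counts as zero.
TOTAL_BASES = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


@dataclass
class PlateAppearance:
    # Hypothetical record layout, for illustration only.
    event: str         # e.g. "single", "home_run", "walk", "hbp", "out"
    batter_hand: str   # "L" or "R"
    pitcher_hand: str  # "L" or "R"


def is_hit(pa: PlateAppearance) -> int:
    """1 = hit; 0 = otherwise."""
    return int(pa.event in TOTAL_BASES)


def is_on_base(pa: PlateAppearance) -> int:
    """1 = hit, walk, or hbp; 0 = otherwise."""
    return int(pa.event in TOTAL_BASES or pa.event in {"walk", "hbp"})


def total_bases(pa: PlateAppearance) -> int:
    """Count outcome for the negative binomial model."""
    return TOTAL_BASES.get(pa.event, 0)


def has_platoon_advantage(pa: PlateAppearance) -> int:
    """Assumed rule: 1 when batter and pitcher have opposite handedness."""
    return int(pa.batter_hand != pa.pitcher_hand)


example = PlateAppearance(event="double", batter_hand="L", pitcher_hand="R")
print(is_hit(example), is_on_base(example), total_bases(example),
      has_platoon_advantage(example))
```

Each row of the estimation sample would pair these outcomes with the batter's seasonal rate, his 1989–91 RISP rate, and the pitcher's rate in the same statistic.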

The table below lists the marginal effect (X) of a change in the explanatory variable on the dependent variable. For example, a one-unit change in the explanatory variable is associated with an X-unit change in the dependent variable. For the probit estimates, this represents a change in probability. For the negative binomial estimates, this represents the expected change in total bases.
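For reference, these are the textbook marginal-effect formulas for the two model types. The notation is an assumption (x is the vector of explanatory variables, β the coefficient vector, Φ and φ the standard normal CDF and density), and the post does not say whether the effects were evaluated at the sample means or averaged over observations.

```latex
% Probit: probability of success in a plate appearance
P(y = 1 \mid x) = \Phi(x'\beta), \qquad
\frac{\partial P(y = 1 \mid x)}{\partial x_k} = \phi(x'\beta)\,\beta_k

% Negative binomial with a log link: expected total bases
E[y \mid x] = \exp(x'\beta), \qquad
\frac{\partial E[y \mid x]}{\partial x_k} = \beta_k \exp(x'\beta)
```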

Variable        Hit         On Base     Total Bases
Overall         1.04        0.98        0.93
                [9.58]      [11.84]     [10.8]
RISP            -0.06162    0.00018     0.00012
                [1.02]      [3.65]      [1.32]
Pitcher         1.152       1.031       0.983
                [12.94]     [12.51]     [12.83]
Platoon         0.014       0.040       0.039
                [2.41]      [6.74]      [3.82]
Observations    23,197      26,820      23,197
Method          Probit      Probit      Neg. Binomial

Absolute robust z-statistics in brackets.

The bracketed numbers below each estimate are z-statistics, where a value of 2 or above generally indicates a statistically meaningful relationship. In samples of this size, statistical significance isn’t difficult to achieve; therefore, it isn’t surprising that in all but two instances the variables are significant. The two that are insignificant are past RISP performance in batting average and in slugging average. Thus, clutch ability doesn’t appear to be strong here.
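Since a z-statistic is the estimate divided by its standard error, the standard errors behind the RISP row can be backed out roughly. The sketch below assumes the reported z-statistics apply to the reported marginal effects as printed; the table values are rounded, so the results are only approximate.

```python
# RISP row of the table: (marginal effect, absolute z-statistic).
risp_row = {
    "Hit": (-0.06162, 1.02),
    "On Base": (0.00018, 3.65),
    "Total Bases": (0.00012, 1.32),
}


def implied_se(estimate: float, abs_z: float) -> float:
    """Back out the standard error from z = estimate / standard error."""
    return abs(estimate) / abs_z


for outcome, (estimate, abs_z) in risp_row.items():
    # Rule of thumb from the text: |z| of 2 or above counts as significant.
    flag = "significant" if abs_z >= 2 else "not significant"
    print(f"{outcome}: implied SE ~ {implied_se(estimate, abs_z):.7f} ({flag})")
```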

However, the estimate of a clutch effect is statistically significant for getting on base. Is this evidence for clutch ability? Well, let’s interpret the coefficient. Every one-unit increase in RISP OBP is associated with a 0.00018 increase in the likelihood of getting on base; thus, a player increasing his RISP OBP by 0.010 (10 OBP points) increases his on-base probability by 0.0000018. For practical purposes, there is no effect.
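The arithmetic can be checked in a few lines. This sketch treats the marginal effect as a simple linear approximation and, for scale, applies the same 10-point change to the Overall OBP estimate from the On Base column.

```python
# Marginal effects on the probability of reaching base (On Base column).
RISP_OBP_EFFECT = 0.00018
OVERALL_OBP_EFFECT = 0.98


def probability_change(marginal_effect: float, change_in_predictor: float) -> float:
    """Linear approximation: marginal effect times the change in the predictor."""
    return marginal_effect * change_in_predictor


ten_points = 0.010  # 10 points of OBP

risp_change = probability_change(RISP_OBP_EFFECT, ten_points)
overall_change = probability_change(OVERALL_OBP_EFFECT, ten_points)

print(f"RISP OBP +10 points:    {risp_change:.7f}")
print(f"Overall OBP +10 points: {overall_change:.4f}")
```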

This study is by no means perfect, but the striking difference in magnitude between the effects of overall ability and clutch ability (just compare the Overall and RISP coefficients) in such a large sample shows why it’s best to remain skeptical regarding clutch ability. If players did have clutch skill, I believe it would show up in this test.

7 Responses “A Little Clutch Hitting Study”

  1. Stan says:

    I think you might have an error in your OBP model. The coefficient is so small that the correlation is virtually zero, so it’s hard to see how it could be significant. The coefficient in your BA model, for example, is more than 300 times larger yet still not significant.

    Also, if you want to see if prior clutch hitting predicts future clutch hitting, 1989-91 RISP is not the variable you want. You need a measure of the hitters’ clutch performance relative to their other performance, such as RISP(89-91) minus Overall OBP(89-91). In your model, a hitter’s 1992 and prior clutch performance (defined as RISP minus Overall) could be identical, and yet your RISP(89-91) variable still won’t have much power because it mostly tells you the hitter’s overall ability, not his clutch performance relative to his overall performance. (Of course, even if you do this correctly you probably still won’t find that past clutch differential has much predictive power.)

  2. JC says:

    The coefficient is small because it has very little practical impact. However, the effect is still more than twice the size of the standard error, hence statistically significant. OBP is far more stable than AVG so it’s not surprising that the latter flops around more across hitters, raising the standard error.

    The second problem (if I understand it) is unlikely to occur. In any event, the indicator I used is imperfect for many reasons, but if “clutch” was there it should show up.

  3. P. W. Hjort says:

    Great work.

    This piece reminded me of a chapter in Baseball Between the Numbers by the Prospectus crew. The title was “Is David Ortiz really Mr. Clutch?”. Their initial principle was much the same as yours–if clutch hitting really exists, past performance should have predictive value. To test it, they used one of their situational win probability metrics (dubbed “clutch score”) and divided them into two groups: odd years and even years, encompassing a hitter’s entire career. They then generated a scatter plot with the “clutch score” in a hitter’s even years on the X axis and the “clutch score” in a hitter’s odd years on the Y axis. Their premise being if clutch hitting is a repeatable skill, a “clutch hitter” will have positive values in both their even years and odd years, leading to a positive correlation. This is that scatter plot. As you may have guessed, their conclusion–a conclusion shared by many who have studied the subject–was that if clutch hitting exists, it’s such a small piece of the puzzle that “it’s probably folly for a club to go looking for clutch hitters–the ability just isn’t important enough in the bigger scheme of things”.

  4. Stan says:

    OBP is only slightly more stable, due to larger sample size (about 10% more PAs than ABs). Hard to imagine the standard error for batting average should be more than 1000x larger than standard error for OBP.

    The problem with using the 3 year RISP variable is certain to occur, not unlikely. The problem is that 3 year RISP isn’t a pure measure of a player’s clutch ability (if such ability exists). Suppose Pujols is .420 overall and .400 with RISP, while Adam Everett is .300 and .320. Your regression doesn’t “know” that Pujols’s .400 is poor clutch performance, while Everett’s .320 is good clutch hitting. The difference between RISP and overall OBP, or something similar, is what you need.
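The differential described in this comment can be written out as a short sketch. The OBP figures are the illustrative ones from the comment, not real statistics, and `clutch_differential` is a hypothetical helper name.

```python
# Illustrative OBP figures from the comment above (not real statistics).
hitters = {
    "Pujols": {"overall": 0.420, "risp": 0.400},
    "Everett": {"overall": 0.300, "risp": 0.320},
}


def clutch_differential(overall: float, risp: float) -> float:
    """RISP performance relative to the hitter's own overall performance."""
    return risp - overall


differentials = {
    name: round(clutch_differential(stats["overall"], stats["risp"]), 3)
    for name, stats in hitters.items()
}

# Raw RISP OBP ranks Pujols first; the differential ranks Everett first.
for name, diff in differentials.items():
    print(f"{name}: RISP OBP {hitters[name]['risp']:.3f}, differential {diff:+.3f}")
```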

  5. JC says:


    OBP is inherently more stable than AVG. It has nothing to do with sample size.

    The regression model controls for batter quality, so the relevant information is there.

  6. RJ says:

    Interesting article, but where can I find RISP for individual players? I have found teams but nothing for players. Thanks.


  1. […] relevance of sample size. It is with great pleasure that I discovered J.C. Bradbury’s recent post on clutch hitting. Here is an overview: I used probit models to estimate the likelihood that a […]