

The Times-Through-the-Order Penalty Transformed the MLB Playoffs. Is the Reliever Familiarity Effect Next?

As teams go to their bullpens earlier and earlier to avoid one kind of penalty, they may be running right into another. What do we know about the repeat-reliever effect, and how might it impact the World Series?

Getty Images/Ringer illustration

In the top of the third inning of NLCS Game 7, the Arizona Diamondbacks’ Ketel Marte led off against Philadelphia Phillies starter Ranger Suárez, whom Marte was facing for the second time that day. TBS broadcaster Brian Anderson remarked that the game felt like a race to the sixth inning, implying that the sixth was when the two teams could count on reaching the relatively safe harbors of the back of their bullpens. “We met with both managers, asked them about their starters,” recounted his colleague Ron Darling, a former starter himself. “They couldn’t wait to talk about their bullpen. They had no interest in talking about the starters.”

Anderson recapped how the conversation with Phillies manager Rob Thomson had gone. “Tell us about Ranger Suárez, Rob,” he reenacted, then supplied Thomson’s answer himself: “‘Well, our bullpen’s fully loaded.’” Darling chimed in with a “Well rested!” Everybody in the booth chuckled. Modern managers do the darndest things.

Granted, this was a win-or-go-home game, with two off days on the docket for the victors (and many more than 100 ahead for the losers). And ultimately, the teams’ respective third starters stayed in for 4 innings (the Diamondbacks’ Brandon Pfaadt) and 4 2/3 (Suárez, who was charged with two runs in the fifth). That’s essentially par for this postseason, in which pitchers have averaged 4.4 innings per start, not excluding openers. Which only hammers home how quickly and dramatically managers’ mindsets have flipped. In the playoffs, especially, games are no longer about how long starters can stay in, but how quickly they can clear out so relievers can come in.

You may have seen a graph like the following one before. It shows the yearly percentage of postseason innings thrown by relievers. Lately, the rate of reliever-thrown frames has hovered around 50 percent.

Although the rates fluctuated more from year to year in the early seasons in the sample, when the postseason consisted of the World Series alone, the general trend over time has been toward a higher percentage of innings thrown by relievers. And that increase has accelerated in the past several years.

The primary reason for that recent spike is a growing awareness of and deference to the times through the order penalty, or the tendency for starting pitchers to perform worse with each successive matchup against a given hitter in the same game. As teams have internalized the magnitude of that effect, they’ve sent starters to the showers earlier and earlier, particularly in the relatively off-day-rich playoffs, resulting in fewer and fewer repeat exposures.

In the eight postseasons from 2016 to this one, only 8.8 percent of all postseason plate appearances came against a starter the hitter was facing for the third time or more in that game. In the eight preceding postseasons, 2008 to 2015, that rate was almost twice as high (16.2 percent). And in the eight postseasons from, say, 1959 through 1966, that rate was almost three times as high as the current one (24.1 percent). From teams’ perspectives, suppressing this figure seems to make sense: Even when a starter has been sailing through the early innings, a fresh reliever is still usually a better bet. Very rarely these days does a skipper decide to place a worse bet: If Diamondbacks manager Torey Lovullo says he plans to pull Pfaadt after 18 batters, plus or minus four, that’s what will happen, often after exactly 18.
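The "almost twice" and "almost three times" comparisons are easy to check from the three percentages quoted above. A quick sketch, using only those figures (the era labels are shorthand, with "this one" taken to be 2023):

```python
# Share of postseason plate appearances that came against a starter the
# hitter was facing for at least the third time in that game, as quoted
# in the text. The era labels are shorthand for the eight-year windows.
third_time_share = {
    "1959-1966": 24.1,
    "2008-2015": 16.2,
    "2016-2023": 8.8,
}

current = third_time_share["2016-2023"]

# Ratio of each era's rate to the current era's rate.
ratios = {era: round(pct / current, 2) for era, pct in third_time_share.items()}

for era, ratio in ratios.items():
    print(f"{era}: {third_time_share[era]}% ({ratio}x the current rate)")
```

The earlier windows come out at roughly 1.8 and 2.7 times the current rate.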

OK, so why am I telling you all this? These developments are nothing new; they were all in evidence when I lamented the loss of the sport’s traditional starting-pitcher protagonist five Octobers ago. Since then, the starter has further receded in prominence: No starter has gotten more than 21 outs in a game this postseason, which will be a playoff first unless some successful World Series starter puts 2023 on the board. Starters’ lighter workloads make themselves felt on a cumulative level too: This season was the first in major league history, save for 1871-72 (the first two years of what Baseball Reference considers major league play) and 1942-43 (the first two baseball seasons during the U.S.’s official involvement in World War II), when there wasn’t a single active starting pitcher under 30 years old who’d amassed more than 20 pitching WAR. (Max Fried and Sandy Alcantara came close to qualifying, but both were hurt at times—another frequent cause of starter anonymity.)

Well, what if I told you the times-through-the-order effect might be missing something? What if the same sabermetric movement that established and reinforced the rigid, “third time through and so are you” model of managing could have the power to complicate and push back against this new October orthodoxy? And what if the World Series, which starts on Friday, might hinge on the Diamondbacks’ and Rangers’ recognition of this heretofore hidden effect? Is that something you might be interested in?

It’s time to consider a new hidden pitfall of pitcher usage: the repeat-reliever effect.


The discovery of the times-through-the-order penalty is often traced back to 2006 sabermetric manual The Book, or to Retrosheet founder David Smith’s 1996 study, “Do Batters Learn During a Game?” But the effect existed long before that—possibly since pitchers first started trying in earnest to get batters out—and its existence was long suspected, too. By “long,” I don’t just mean that Ted Williams basically described it in his 1970 book The Science of Hitting. I’m talking about the 19th century. To be specific: June 12, 1889, which may have been the first time a pitching move was made for essentially the same reason Kevin Cash pulled Blake Snell more than 130 years later.

John T. Brush, the owner of the National League’s Indianapolis Hoosiers, was an outsider who bought into baseball, and he had some strange ideas—one of which was to pull his starters before they could blow the game. As one newspaper account explained, “As the team has been losing games in the seventh inning, generally through the inability of the pitcher to hold up, Manager [Frank] Bancroft yesterday concluded to make a change at the end of the fifth, and for that reason, and not because [Bill] Burdick was being hit hard, [Pretzels] Getzien went in at the opening of the sixth.”

It worked, that time. As soon as it stopped working, though, the pile-on began. Does this sound familiar?

Straight from 21st-century sports radio! The experiment, more than a century ahead of its time, soon ended. “Brush wasn’t a back-down kind of guy, but the idea of data-driven decision making was so far outside the general consciousness that even he wasn’t willing to stick to his guns,” says historian Richard Hershberger, author of Strike Four: The Evolution of Baseball.

Of course, that proto-proactive pulling of starters was based on a hunch that didn’t carry the potential downside of doing the same thing in a postseason series: showing your opponent your bullpen pitchers’ hands one too many times. Going into Game 6 of the 2021 ALCS, The Boston Globe’s Alex Speier talked to multiple Red Sox relievers about the perils of facing the same hitters repeatedly in a short span of time. “Late in a series,” Speier wrote, “few secrets remain. Opposing hitters know which relievers they’re likely to face. Moreover, some hitters have had multiple looks at those relievers. They not only have first-hand familiarity with the movement and shape of the pitches, they’re also able to identify pitch sequences that pitchers like to employ.”

Then-Sox reliever Adam Ottavino told Speier he was aware of the threat. “I do think that’s a thing that hurt our team in ’19 in the playoffs,” Ottavino said, referring to the Yankees’ ALCS loss to Houston when Ottavino was in New York. “We went so much matchup-based we kind of overexposed ourselves against individual hitters. So I am very cognizant of that going forward. So far this series, I’ve faced Carlos [Correa] twice, I’ve faced [Yuli] Gurriel twice. So moving forward, I have to decide what to do each and every time, how much they can adjust and how much I want to adjust. You could definitely get overexposed. There’s a reason why we’re relievers.” (Among starters, pitchers with fewer pitch types tend to have more pronounced penalties, presumably because hitters can easily study their stuff. And relievers tend to have more limited repertoires than starters.)

In Game 6 in 2021, Ottavino got Correa to ground out, but he then gave up a backbreaking, put-the-pennant-on-ice homer to Kyle Tucker, whom Ottavino had faced once earlier in the series. Tucker’s homer might have happened regardless, but perhaps that previous plate appearance made the difference for Tucker, or Ottavino was slightly distracted by the challenge of yet again facing Correa and Gurriel (who wound up singling). If hitters can learn how to exploit a pitcher’s weakness within one game, why not across multiple games played in quick succession?

In early January 2022, I mentioned my interest in a potential repeat-reliever effect to analyst Cameron Grove, creator of PitchingBot. Grove took a preliminary look and found evidence that relievers do worse with each time they face the same hitter in a postseason series, even though the quality of their pitches typically doesn’t drop.

He later followed up on and bolstered his findings on his blog, where he concluded, “It seems that relievers get worse results the more often the same hitter sees them.” Although he noted that repeat matchups shouldn’t always be avoided, because “in many cases, your closer with a familiarity penalty will still be a better option than a mop-up guy who the hitter hasn’t seen before,” he cautioned that “in the playoffs, with up to seven games per series, teams may need to be careful about over-exposing their best reliever to opposition hitters in situations which are not high leverage.”

The Guardians hired Grove early this year, precluding further public research from him, but another researcher who was unaware of Grove’s work, Dr. David J. Gordon, independently took another approach to the topic and arrived at a similar conclusion in the forthcoming fall 2023 edition of the SABR Baseball Research Journal. Gordon’s study, “Balancing Starter and Bullpen Workloads in a Seven-Game Postseason Series,” reports large, statistically significant penalties incurred by relievers the more they face the same team in one series. “I would say that this effect, on the average, is as big or bigger than the times-through-the-order effect,” Gordon told me this week.

Gordon’s paper concludes that “the conventional wisdom of limiting most starting pitchers to two turns through the order is generally a sound strategy during the regular season, when there are no seven-game series and fresh relievers can be easily shuttled in from Triple-A. … But this calculus does not hold for postseason series, in which rosters are fixed and the opponent does not change. In the postseason, limiting starting pitchers in this manner inevitably opens the door to overwork and overexposure of relief pitchers.”

In other words, trying to avoid one kind of penalty in the playoffs may have caused teams to run right into another. Indeed, it does appear that relievers whom hitters have already faced once in the same best-of-five or best-of-seven series make up a higher percentage of all plate appearances and “close and late” plate appearances (those in the seventh inning or later, with a score differential of no more than three runs in either direction) than they used to.
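For anyone who wants to tally these exposures from play-by-play data, here is a minimal sketch of how the counting might work. The `PlateAppearance` record and its field names are hypothetical, invented for illustration rather than taken from any real data source, and this is not necessarily how the figures in this piece were produced. The "close and late" rule follows the definition above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlateAppearance:
    # Hypothetical record layout, invented for this sketch.
    series_id: str
    batter: str
    pitcher: str
    pitcher_is_reliever: bool
    inning: int
    score_diff: int  # batting team's runs minus fielding team's runs


def is_close_and_late(pa):
    """Seventh inning or later, with the score within three runs either way."""
    return pa.inning >= 7 and abs(pa.score_diff) <= 3


def prior_meetings(pas):
    """For each PA (in chronological order), count how many times the batter
    had already faced that pitcher earlier in the same series."""
    seen = defaultdict(int)
    counts = []
    for pa in pas:
        key = (pa.series_id, pa.batter, pa.pitcher)
        counts.append(seen[key])
        seen[key] += 1
    return counts


def repeat_reliever_share(pas, prior_faced=1, close_and_late_only=False):
    """Share of PAs that came against a reliever the batter had already faced
    exactly `prior_faced` times in the series."""
    counts = prior_meetings(pas)
    pool = [
        (pa, n)
        for pa, n in zip(pas, counts)
        if not close_and_late_only or is_close_and_late(pa)
    ]
    if not pool:
        return 0.0
    repeats = sum(
        1 for pa, n in pool if pa.pitcher_is_reliever and n == prior_faced
    )
    return repeats / len(pool)


# Toy data: one hitter sees the same reliever three times in a series.
sample = [
    PlateAppearance("S1", "Hitter A", "Reliever X", True, 7, 1),
    PlateAppearance("S1", "Hitter A", "Reliever X", True, 8, 0),
    PlateAppearance("S1", "Hitter B", "Starter Y", False, 1, 0),
    PlateAppearance("S1", "Hitter A", "Reliever X", True, 9, 5),
]
overall = repeat_reliever_share(sample)
late = repeat_reliever_share(sample, close_and_late_only=True)
```

Changing `prior_faced` to 2 or 3 would give the deeper-exposure rates; a fuller version would also need a rule for multiple meetings within the same game.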

Somewhat surprisingly, the rates of plate appearances against relievers that hitters have previously faced two times or three or more times in the same series haven’t increased, probably because bullpens are much bigger than they used to be and teams are growing more reluctant to use relievers in many consecutive games. (Hence, the bullpen innings that are replacing starter innings are distributed among more arms.)

Gordon acknowledges that his research can’t conclusively answer the always-thorny question of whether the penalty stems from familiarity, fatigue, or both; in a postseason series, it’s tough to untangle those competing (though not mutually exclusive) explanations, because a reliever who faces one opponent many times will have to have worked without a lot of rest. In a 2016 study, I found no evidence of a familiarity effect for starting pitchers who make multiple full-rest starts against the same playoff opponent in the same series; starters do pitch worse when they make a second start on short rest within the same series, but they don’t do any worse when they make a second start on regular rest.

Whatever the explanation, let’s say that this repeat-reliever penalty is real and spectacular. In theory, its widespread acceptance could lead to longer outings for postseason starting pitchers, and much more interesting strategic and tactical decisions. In place of push-button decisions to pull starters as soon as their third time through the order rolls around, we would have skippers taking the long view and sometimes rolling the dice with a starter so as to avoid putting relievers in positions to fail later in the series. In addition, fans and media members would have more interesting arguments to make than the rote, “third time through” argument-enders we often seem to be stuck with now.

But even if fans would often applaud leaving a starter in longer, would managers be bold enough to weather the crowd’s slings and arrows when they inserted someone other than their best available bullpen arm for a certain situation, with an eye toward keeping that pitcher fresh for an even higher-leverage moment? Would they be nimble enough to adjust how much they prioritize the times-through-the-order effect when the calendar flips from September to October? Are any teams talking about these issues?

To find out, I asked front-office folks from three of this year’s playoff teams. “It’s talked about, but I’m not sure that it is a factor often enough for anyone to consider it strongly in roster construction during the [regular] season,” one answered. “The postseason is obviously a different animal, though. I would imagine some teams score pitcher availability for a game in a way that considers that penalty.”

Another was skeptical that a repeat-reliever penalty could overcome the inertia of the forces that have caused teams to take the ball from starters earlier and earlier. “I could buy that as a trend,” they said, “although I’m not sure it would ultimately really influence decision-making. … You won’t use your worst playoff reliever in a big spot just because he hasn’t appeared yet. And I also doubt this’ll be enough to reverse what we’ve seen with starting-pitcher pitch counts and times-through-the-order sensitivities.”

When I asked the third if this potential penalty is something teams are or will be taking into account, they estimated that 20 percent already are, 70 percent will be, and 10 percent are too set in their ways or analytics-uncurious to care. It’s possible that just as the times-through-the-order effect morphed from a mostly unknown factor to one of the most discussed aspects of the playoffs, the repeat-reliever effect could catch on quickly too. Instead of longer-lasting starters, would we see the return of multi-inning relief aces? Are lefty hitters extra valuable in October because teams sic the same southpaw relievers on them over and over, thereby penalizing those pitchers? If the effect does come from familiarity, does a hitter have to stand in the box to derive the advantage, or does spectating from the sideline suffice? Is the only lasting solution to the problems of fungible starters, max-effort injuries, and rising strikeout rates something bold, like moving the mound back or capping the number of pitchers on a team’s active roster at 11 or 12?

We may have to wait awhile for answers to those questions, but we won’t have to wait long to find out who’ll win the long-shot showdown between the Rangers and Diamondbacks. If ever a World Series were set up to illustrate the perils of repeat reliever use, it’s this one. Texas and Arizona were good or great at some things during the regular season, but bullpenning wasn’t one of them. The Rangers and Diamondbacks ranked 23rd and 24th in reliever FanGraphs WAR during the regular season. (They were even worse—26th and 28th—after the All-Star break.) Only one previous World Series matchup featured two teams with a worse combined full-season bullpen WAR rank, and that one also featured Texas: The Rangers ranked 25th in 2011, and the Cardinals came in at 27th (though they leaned heavily on their pen en route to their title). On paper, at least, this is a formula for the comebacks and lead changes we were lacking in the playoffs’ early rounds—and, perhaps, the first zombie-runner-free extra inning of October 2023.

Admittedly, the Diamondbacks have rebuilt the back of their bullpen, which has been much better of late in a small-ish sample. (Snakes relievers recorded the third-best ERA of September/October in the regular season, albeit with middling peripherals.) But both of these bullpens have small circles of trust: The Rangers ride or die with José Leclerc, Josh Sborz, and (gulp) Aroldis Chapman, while the Diamondbacks depend on Paul Sewald, Kevin Ginkel, and Ryan Thompson. Both teams got away with using each of those arms in most of their seven championship series games, but the bubble could pop at some point. Neither team’s starting staff is stacked either, so there aren’t a lot of awesome options here. But if Bruce Bochy and Torey Lovullo try to reuse relievers too many times, they could pay a steep penalty whose causes and consequences baseball analysts are just starting to understand.

Thanks to Lucas Apostoleris, Sean Dolinar, Richard Hershberger, Kenny Jackelen, and Cecilia Tan for research assistance.