Extrapolated Runs

Extrapolated Runs (XR) is a baseball statistic invented by sabermetrician Jim Furtado to estimate the number of runs a hitter contributes to his team. XR measures essentially the same thing as Bill James' Runs Created, but it is a linear weights formula that assigns a run value to each event, rather than a multiplicative formula like James' creation.

Purpose and formulae

According to Furtado, Extrapolated Runs was inspired by Paul Johnson's Estimated Runs Produced (ERP) formula, which was published in James' 1985 Baseball Abstract. Furtado found that Johnson's method, when written a different way, was essentially a linear weights formula (something James apparently did not recognize at the time, given his very public disdain for linear run estimators). ERP was almost as accurate as Runs Created (RC) at estimating team runs, it did not succumb to RC's well-known problems at the individual level, and its values compared well with those of Pete Palmer's linear weights formula, even though the two methods were developed in entirely different ways. For these reasons, Furtado believed that linear estimators had more promise than was originally thought, and he set out to develop his own.

After much trial and error (some of which involved borrowing concepts and weights from other linear formulas), Furtado eventually found a set of weights that best fit his sample (every Major League Baseball season from 1955 to 1997). He unveiled the formula in the 1999 Big Bad Baseball Annual:

"Extrapolated Runs was developed for use with seasons from 1955 to the present. I came up with three versions of the formula. The three formulas are:

"As you can see, calculating XR requires only addition and multiplication. Its simplicity of design is one of its greatest attributes. Unlike a lot of the other methods, you don't need to know team totals, actual runs, league figures or anything else. You just plug the stats into the formula and you are all set.

"Another of XR['s] attributes is that the formula is pretty much context neutral. Other than park effects, the only remaining residue of context is due to the inclusion of IBB, GIDP and SF. Although I could have removed them from the full version, I felt that the inclusion of these terms was important since my research showed there was a strong correlation between the IBB, SF and GIDP opportunities that players face. I also felt, like Bill James, that these statistics do tell us something valuable about players. Of course, I knew some people might not agree with me. For them, I created two other versions.

"XR also accounts for just about every out. James correctly understands that the more outs an individual player consumes the less valuable his positive contributions are. Since XR will be used as the base of the Extrapolated Win method, I thought it was important to include as many outs as possible in the formula.

"Another nice thing about XR is that if you add up all the players' Extrapolated Runs, you'll have the team totals. That's a benefit of using a linear equation."
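The two properties Furtado describes, that XR needs only addition and multiplication and that player totals sum to the team total, can be illustrated with a short Python sketch. The weights below are those of the full XR version as it is commonly reproduced; they are an assumption here and should be checked against the original publication. The function names, the dictionary keys, and both stat lines are hypothetical and exist only for illustration.

```python
import math

# Counting stats the sketch expects; any missing key is treated as 0.
COUNTING_STATS = ("AB", "H", "2B", "3B", "HR", "TBB", "IBB", "HBP",
                  "K", "SB", "CS", "GIDP", "SF", "SH")


def extrapolated_runs(line):
    """Full-version XR for one stat line (weights as commonly reproduced)."""
    g = lambda key: line.get(key, 0)
    singles = g("H") - g("2B") - g("3B") - g("HR")
    return (
        0.50 * singles
        + 0.72 * g("2B")
        + 1.04 * g("3B")
        + 1.44 * g("HR")
        + 0.34 * (g("HBP") + g("TBB") - g("IBB"))  # unintentional walks + HBP
        + 0.25 * g("IBB")
        + 0.18 * g("SB")
        - 0.32 * g("CS")
        - 0.090 * (g("AB") - g("H") - g("K"))      # outs on balls in play
        - 0.098 * g("K")
        - 0.37 * g("GIDP")
        + 0.37 * g("SF")
        + 0.04 * g("SH")
    )


def combine(*lines):
    """Add stat lines together, category by category."""
    return {key: sum(l.get(key, 0) for l in lines) for key in COUNTING_STATS}


# Two invented stat lines, not real players.
player_a = {"AB": 550, "H": 170, "2B": 35, "3B": 3, "HR": 30, "TBB": 80,
            "IBB": 8, "HBP": 5, "K": 100, "SB": 10, "CS": 4,
            "GIDP": 12, "SF": 6, "SH": 0}
player_b = {"AB": 480, "H": 120, "2B": 20, "3B": 1, "HR": 8, "TBB": 35,
            "IBB": 1, "HBP": 2, "K": 90, "SB": 3, "CS": 2,
            "GIDP": 15, "SF": 3, "SH": 5}

sum_of_players = extrapolated_runs(player_a) + extrapolated_runs(player_b)
xr_of_combined = extrapolated_runs(combine(player_a, player_b))

# Because every term is a weight times a count, the two figures agree
# (up to floating-point rounding).
print(round(sum_of_players, 2), round(xr_of_combined, 2))
```

No team totals, league figures, or actual runs appear anywhere in the calculation, which is the simplicity the quotation emphasizes.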

Pros and cons of Extrapolated Runs

XR and Palmer's Linear Weights are the most accurate of the linear run estimators at predicting team runs scored. Unlike James' RC, XR does not artificially inflate the runs produced by individual players who combine a high OBP with a high SLG. It is also much easier to calculate than Base Runs.
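The contrast with a multiplicative estimator can be seen in a small Python sketch. It assumes the basic version of Runs Created, (H + BB) × TB / (AB + BB), as the RC variant to illustrate; the function name and both stat lines are invented for the example.

```python
def basic_runs_created(h, bb, tb, ab):
    """Basic Runs Created: an on-base factor times an advancement factor,
    divided by opportunities. Assumed here as the RC variant to illustrate."""
    return (h + bb) * tb / (ab + bb)


# Invented stat lines: a hitter with high OBP and SLG, and a weak hitter.
star = {"h": 175, "bb": 100, "tb": 350, "ab": 500}
weak = {"h": 120, "bb": 30, "tb": 170, "ab": 500}

rc_individual_sum = basic_runs_created(**star) + basic_runs_created(**weak)

# Pool the two lines and run the same formula on the combined totals.
pooled = {key: star[key] + weak[key] for key in star}
rc_pooled = basic_runs_created(**pooled)

# The product form lets the star's on-base and power numbers multiply each
# other, so the individual figures do not add up to the pooled figure.
print(round(rc_individual_sum, 1), round(rc_pooled, 1))
```

A linear formula gives the same answer either way, which is why it avoids this effect at the individual level.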

However, like any linear formula, there is no guarantee that it will work outside of the context in which it was developed (in this case, seasons from 1955 to 1997).
