Sabermetrics

Bill James, who coined the term "sabermetrics"

Sabermetrics (originally SABRmetrics) is the empirical analysis of baseball, especially the development of advanced metrics based on baseball statistics that measure in-game activity; it is also used as the original or blanket term for sports analytics. The term is derived from the movement's progenitors, members of the Society for American Baseball Research (SABR), founded in 1971, and was coined by Bill James, who is one of its pioneers and is considered its most prominent advocate and public face. [1]

The term moneyball is used for the practice of using metrics to identify "undervalued players" and sign them to what ideally will become "below market value" contracts, which debuted in the efforts of small market teams to compete with the much greater resources of big market ones.

Early history

English-American sportswriter Henry Chadwick, the "father" of baseball statistics

English-American sportswriter Henry Chadwick developed the box score in New York City in 1858. This was the first way statisticians were able to describe the sport of baseball by numerically tracking various aspects of game play. [2] The creation of the box score has given baseball statisticians a summary of the individual and team performances for a given game. [3]

The work that grew into the sabermetrics research of the 1970s and 1980s began in the middle of the 20th century with the writings of Earnshaw Cook, one of the earliest baseball analysts. Cook's 1964 book Percentage Baseball was one of the first of its kind. [4] At first, most organized baseball teams and professionals dismissed Cook's work as meaningless. The idea of a science of baseball statistics began to achieve legitimacy in 1977 when Bill James began releasing Baseball Abstracts, his annual compendium of baseball data. [5] [6] However, James's ideas were slow to find widespread acceptance. [1]

Bill James believed there was a widespread misunderstanding about how the game of baseball was played, claiming the sport was not defined by its rules but actually, as summarized by engineering professor Richard J. Puerzer, "defined by the conditions under which the game is played--specifically, the ballparks but also the players, the ethics, the strategies, the equipment, and the expectations of the public." [2] Early sabermetricians, sometimes considered baseball statisticians, began trying to enhance such fundamental baseball statistics as batting average (simply hits divided by at-bats) with advanced mathematical formulations. [7] [8] The correlation between team batting average and runs scored was also examined, [7] as runs, not hits, win ballgames. Thus, a good measure of a player's worth would be his ability to help his team score runs, which was observed to be highly correlated with his number of times on base, leading to the development of a new stat, "on-base percentage".

MLB advanced metrics pioneer Davey Johnson (in 1986)

Before Bill James popularized sabermetrics, Davey Johnson, then a second baseman playing for the early 1970s Baltimore Orioles of Major League Baseball (MLB), used an IBM System/360 at team owner Jerold Hoffberger's brewery to write a FORTRAN-based baseball computer simulation. In spite of his results, he was unable to persuade his manager Earl Weaver that he should bat second in the lineup. He wrote IBM BASIC programs to help him manage the Tidewater Tides, and after becoming manager of the New York Mets in 1984, he arranged for a team employee to write a dBASE II application to compile and store advanced metrics on team statistics. [9] Craig R. Wright was another employee in MLB, working with the Texas Rangers in the early 1980s. During his time with the Rangers, he became known as the first front office employee in MLB history to work under the title "sabermetrician". [10] [11]

David Smith founded Retrosheet in 1989, with the objective of computerizing the box score of every major league baseball game ever played, in order to more accurately collect and compare the statistics of the game.

Billy Beane as a player in 1989

The Oakland Athletics began to use a more quantitative approach to baseball by focusing on sabermetric principles in the 1990s. This initially began with Sandy Alderson as the general manager of the team, when he used the principles toward obtaining relatively undervalued players. [1] His ideas were continued when Billy Beane took over as general manager in 1997, a job he held until 2015, and hired his assistant Paul DePodesta. [8] During the 2002 season, a noted "moneyball" Oakland A's team went on to win 20 games in a row. [12] The term, and the approach to the game it describes, soon gained national recognition when Michael Lewis published Moneyball: The Art of Winning an Unfair Game in 2003 to detail Beane's use of advanced metrics; the "unfair" of the title reflected the disparity in resources available to the big market teams versus the small. In 2011, a film based on Lewis' book, also called Moneyball, was released and gave broad exposure to the techniques used in the Oakland Athletics' front office.

Traditional measurements

Sabermetrics reflected a desire by a handful of baseball enthusiasts to expand their understanding of the game by revealing new insights that may have been hidden in its traditional statistics. Their early efforts ultimately evolved into evaluating players in every aspect of the game, including batting, pitching, baserunning, and fielding.

Batting measurements

Ted Williams, the last MLB player to bat .400 for a season (in 1941)

A ballplayer's batting average (BA) (simply hits divided by at bats) was the historic measure of a player's offensive performance, enhanced by slugging percentage (SA), [note 1] which incorporated their ability to hit for power.
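As a minimal sketch of these two traditional measures, the functions below compute batting average as hits divided by at bats and slugging percentage as total bases divided by at bats. The function and parameter names are illustrative assumptions, and the sample stat line is hypothetical rather than taken from any real player.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at bats; returns 0.0 when there are no at bats."""
    return hits / at_bats if at_bats else 0.0


def slugging_percentage(singles: int, doubles: int, triples: int,
                        home_runs: int, at_bats: int) -> float:
    """Total bases divided by at bats; returns 0.0 when there are no at bats."""
    # Each hit is weighted by the number of bases it is worth.
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats if at_bats else 0.0


# Hypothetical season: 150 hits (100 singles, 30 doubles, 5 triples,
# 15 home runs) in 500 at bats.
ba = batting_average(150, 500)
sa = slugging_percentage(100, 30, 5, 15, 500)
print(f"BA {ba:.3f}, SA {sa:.3f}")
```

Two batters with the same batting average can differ widely in slugging percentage, which is why the second measure was treated as an enhancement of the first.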

Bill James, along with other early sabermetricians, was concerned that batting average did not incorporate other ways a batter can reach base besides a hit - as a batter on base can score runs, and runs, not hits, win ballgames. [13]

Slugging percentage and an early form of on-base percentage (OBP), which takes into account bases on balls ("walks") and hit-by-pitches, date to at least 1941, [14] pre-dating both Bill James (born 1949) and SABR (formed 1971). [13] Even so, early SABR-era baseball statistical pioneers put enhanced focus on the relationship between times on base and run scoring.

SA and OBP were combined to create the modern statistic on-base plus slugging (OPS). OPS is the sum of the on-base percentage and the slugging percentage. This modern statistic has become useful in comparing players and is a powerful method of predicting runs scored by any given player. [15] An enhanced version of OPS, "OPS+", incorporates OPS, historic statistics, ballpark considerations, and defensive position weightings to attempt to allow player performance from different eras to be compared.
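The sum described above can be sketched in a few lines. The on-base percentage function below assumes the commonly published form, in which hits, walks, and hit-by-pitches are divided by at bats plus walks, hit-by-pitches, and sacrifice flies; the function names and the sample numbers are hypothetical.

```python
def on_base_percentage(hits: int, walks: int, hit_by_pitch: int,
                       at_bats: int, sacrifice_flies: int = 0) -> float:
    """Times on base divided by the plate appearances that count toward OBP."""
    times_on_base = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sacrifice_flies
    return times_on_base / chances if chances else 0.0


def on_base_plus_slugging(obp: float, slg: float) -> float:
    """OPS is simply the sum of on-base percentage and slugging percentage."""
    return obp + slg


# Hypothetical season: 150 hits, 60 walks, 5 hit-by-pitches,
# 500 at bats, 5 sacrifice flies, and a .470 slugging percentage.
obp = on_base_percentage(150, 60, 5, 500, 5)
ops = on_base_plus_slugging(obp, 0.470)
print(f"OBP {obp:.3f}, OPS {ops:.3f}")
```

Because the two components have different denominators, OPS is a convenience sum rather than a true rate, which is part of why adjusted versions such as OPS+ exist.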

Some other advanced metrics used to evaluate batting performance are weighted on-base average, secondary average, runs created, and equivalent average.

Pitching measurements

Ed Walsh, whose career 1.82 ERA is the lowest in MLB history

The traditional measure of pitching performance is the earned run average (ERA). It is calculated as earned runs allowed per nine innings. Earned run average does not separate the ability of the pitcher from the abilities of the fielders that he plays with. [16] Another classic measure for pitching is a pitcher's winning percentage. Winning percentage is calculated by dividing wins by the total number of decisions (wins plus losses). Winning percentage is also heavily dependent on the pitcher's team, particularly on the number of runs it scores.
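A short sketch of both traditional pitching measures follows. It assumes innings pitched are supplied as a true fractional number (so one third of an inning is 1/3, not the ".1" box-score shorthand); the function names and the sample line are hypothetical.

```python
def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return 9 * earned_runs / innings_pitched


def winning_percentage(wins: int, losses: int) -> float:
    """Wins divided by total decisions (wins plus losses)."""
    decisions = wins + losses
    return wins / decisions if decisions else 0.0


# Hypothetical season: 70 earned runs in 210 innings, 15 wins and 10 losses.
print(f"ERA {earned_run_average(70, 210):.2f}")
print(f"W%  {winning_percentage(15, 10):.3f}")
```

Neither function has any input describing the fielders behind the pitcher or the runs his own team scores, which illustrates why both measures depend on factors outside the pitcher's control.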

Sabermetricians have attempted to find different measures of pitching performance that exclude the performances of the fielders involved. One of the earliest developed, and one of the most popular in use, is walks plus hits per inning pitched (WHIP), which, while not completely defense-independent, tends to indicate how many times a pitcher is likely to put a player on base (via either a walk or a base hit) and thus how effective batters are against a particular pitcher in reaching base.
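As its name states, WHIP counts only walks and hits in the numerator. The sketch below assumes fractional innings as input; the function name and sample figures are hypothetical.

```python
def whip(walks: int, hits: int, innings_pitched: float) -> float:
    """Walks plus hits allowed, divided by innings pitched."""
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (walks + hits) / innings_pitched


# Hypothetical season: 50 walks and 180 hits allowed over 200 innings.
print(f"WHIP {whip(50, 180, 200):.2f}")
```

Hits allowed still depend partly on the fielders, which is why the measure is not completely defense-independent.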

A later development was the creation of the defense independent pitching statistics (DIPS) system. Voros McCracken has been credited with the development of this system in 1999. [17] Through his research, McCracken was able to show that there is little to no difference between pitchers in the number of hits they allow on balls put into play, regardless of their skill level. [18] Some examples of these statistics are defense-independent ERA, fielding independent pitching, and defense-independent component ERA. Other sabermetricians have furthered the work in DIPS, such as Tom Tango, who runs the Tango on Baseball sabermetrics website.
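To illustrate the idea, the sketch below implements fielding independent pitching (FIP) in its commonly published form, which uses only home runs, walks, hit-by-pitches, and strikeouts, the outcomes that do not involve fielders. The additive constant is recalculated for each league season to put FIP on the ERA scale; the default of 3.10 here is an assumed placeholder, and the sample line is hypothetical.

```python
def fielding_independent_pitching(home_runs: int, walks: int, hit_by_pitch: int,
                                  strikeouts: int, innings_pitched: float,
                                  league_constant: float = 3.10) -> float:
    """FIP from defense-independent outcomes only.

    league_constant is an assumed placeholder; in practice it is derived
    from league-wide totals for the season being evaluated.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    weighted = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return weighted / innings_pitched + league_constant


# Hypothetical season: 20 HR, 50 BB, 5 HBP, 200 K over 200 innings.
print(f"FIP {fielding_independent_pitching(20, 50, 5, 200, 200):.2f}")
```

Hits on balls in play never enter the calculation, which reflects McCracken's finding that pitchers differ little in that respect.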

Baseball Prospectus created another statistic called the peripheral ERA. This measure of a pitcher's performance takes into account hits, walks, home runs allowed, and strikeouts while adjusting for ballpark factors. [16] Each ballpark has different dimensions when it comes to the outfield wall, so a pitcher should not be measured the same for each of these parks. [19]

Batting average on balls in play (BABIP) is another useful measurement for determining pitchers’ performance. [18] When a pitcher has a high BABIP, they will often show improvements in the following season, while a pitcher with low BABIP will often show a decline in the following season. [18] This is based on the statistical concept of regression to the mean. Others have created various means of attempting to quantify individual pitches based on characteristics of the pitch, as opposed to runs earned or balls hit.
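A minimal sketch of the calculation, assuming the commonly published formula: home runs are removed from hits because they are not fielded, strikeouts are removed from at bats because the ball is never put in play, and sacrifice flies are added back because they are balls in play that do not count as at bats. The function name and the sample totals are hypothetical.

```python
def babip(hits: int, home_runs: int, at_bats: int,
          strikeouts: int, sacrifice_flies: int = 0) -> float:
    """Batting average on balls in play (for a pitcher, totals allowed)."""
    balls_in_play = at_bats - strikeouts - home_runs + sacrifice_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical pitcher season: 180 hits and 20 home runs allowed in
# 760 opponent at bats, with 200 strikeouts and 5 sacrifice flies.
print(f"BABIP {babip(180, 20, 760, 200, 5):.3f}")
```

A value well above or below the pitcher's own history or the league norm is the kind of result that regression to the mean suggests will not persist.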

Advanced methods

Value over replacement player (VORP) was once considered a popular sabermetric statistic. This statistic attempts to demonstrate how much a player contributes to his team in comparison to a hypothetical player performing at the minimum level needed to hold a roster position on a major league team. It was invented by Keith Woolner, a former writer for the sabermetric group/website Baseball Prospectus.

Wins above replacement (WAR) is another popular sabermetric statistic for evaluating a player's contributions to his team. [20] Similar to VORP, WAR compares a given player to a replacement-level player at his position in order to determine the number of additional wins the player provides to his team. [21] Like VORP, WAR is a cumulative statistic and heavily reflects the amount of a player's playing time. [21]

"Static" statistics based on simple ratios of already accumulated data (like batting average) and accumulative tallies (such as pitching wins) do not fully reveal all aspects of the game represented in their numeric totals. [22] :189–198 Advanced metrics are increasingly developed and targeted to addressing in-game activities (such as when a team should attempt to steal a base, [23] and when to bring closers in).

Applications

Sabermetrics are commonly used for everything from sportswriting to baseball Hall of Fame consideration, selecting player match-ups and evaluating in-game strategic options. Advanced statistical measures may be utilized in determining in-season and end-of-the-season awards (such as Player of the Week and MVP). Those which are most useful in evaluating past performance and predicting future outcomes are valuable in determining a player's contributions to his team, [15] potential trades, contract negotiations, and arbitration.

Sabermetrics has also been extended to examining ballplayers' minor league performance in AA and AAA ball in a manner similar to evaluating it at the Major League level, an approach known as Minor-League Equivalency. [15]

AI in sabermetrics

Machine learning and other forms of artificial intelligence (AI) can be usefully applied to predicting future outcomes in baseball modeling, in in-game strategy, personnel handling, and roster-building and contract negotiations.

Advancements from 1985 - present

Bill James' two books, The Bill James Historical Baseball Abstract (1985) and Win Shares (2002), have continued to advance the field of sabermetrics. [24] The work of his former assistant Rob Neyer, who later became a senior writer at ESPN.com and national baseball editor of SBNation, has also contributed to popularizing sabermetrics since the mid-1980s. [25]

Nate Silver, a former writer and managing partner of Baseball Prospectus, invented PECOTA (Player Empirical Comparison and Optimization Test Algorithm [26] ) in 2002–2003, introducing it to the public in the book Baseball Prospectus in 2003. [27] It assumes that the careers of similar players will follow a similar trajectory. [28]

Beginning in the 2007 baseball season, MLB started looking at technology to record detailed information regarding each pitch that is thrown in a game. This became known as the PITCHf/x system, which uses video cameras to record pitch speed at its release point and crossing the plate, location, and angle (if any) of a break. [13]

FanGraphs is a website that utilizes this information and other play-by-play data to publish advanced baseball statistics and graphics. [29]

Notes

  1. Calculated by dividing total bases (the non-situational cumulative tally of all hits) by the total number of times at bat.

References

  1. Lewis, Michael M. (2003). Moneyball: The Art of Winning an Unfair Game. New York: W. W. Norton. ISBN 0-393-05765-8.
  2. Puerzer, Richard J. (Fall 2002). "From Scientific Baseball to Sabermetrics: Professional Baseball as a Reflection of Engineering and Management in Society". NINE: A Journal of Baseball History and Culture. 11: 34–48. doi:10.1353/nin.2002.0042. S2CID 154849268.
  3. "The Hall of Famers - Henry Chadwick". Archived from the original on 2008-04-12.
  4. Albert, James; Jay M. Bennett (2001). Curve Ball: Baseball, Statistics, and the Role of Chance in the Game. Springer. pp. 170–171. ISBN 0-387-98816-5.
  5. "Bill James, Beyond Baseball". Think Tank with Ben Wattenberg. PBS. June 28, 2005. Retrieved November 2, 2007.
  6. Ackman, D. (May 20, 2007). "Sultan of Stats". The Wall Street Journal. Retrieved November 2, 2007.
  7. Jarvis, J. (2003-09-29). "A Survey of Baseball Player Performance Evaluation Measures". Retrieved 2007-11-02.
  8. Kipen, D. (June 1, 2003). "Billy Beane's brand-new ballgame". San Francisco Chronicle. Retrieved November 2, 2007.
  9. Porter, Martin (1984-05-29). "The PC Goes to Bat". PC Magazine. p. 209. Retrieved 24 October 2013.
  10. RotoJunkie – Roto 101 – Sabermetric Glossary (powered by evoArticles). Archived 2007-09-10 at the Wayback Machine.
  11. BaseballsPast.com
  12. "Franchise Timeline".
  13. Albert, Jim (2010). "Sabermetrics: The Past, the Present, and the Future" (PDF). In Joseph A. Gallian (ed.). Mathematics and Sports. Vol. 43. Contributor: Mathematical Association of America. MAA. pp. 3–14. ISBN 9780883853498. JSTOR 10.4169/j.ctt6wpwsw.4.
  14. Powers, Jimmy (June 3, 1941). "The PowerHouse (column)". Daily News. New York City. p. 45. Retrieved January 30, 2023 via newspapers.com.
  15. Grabiner, David J. "The Sabermetric Manifesto". The Baseball Archive.
  16. McCracken, Voros (January 23, 2001). "Pitching and Defense: How Much Control Do Hurlers Have?". Baseball Prospectus.
  17. Basco, Dan; Davies, Michael (Fall 2010). "The Many Flavors of DIPS: A History and an Overview". Baseball Research Journal. 32 (2).
  18. Ball, Andrew (January 17, 2014). "How has sabermetrics changed baseball?". Beyond the Box Score.
  19. Baumer, Benjamin; Zimbalist, Andrew (2014). The Sabermetric Revolution: Assessing the Growth of Analytics in Baseball. University of Pennsylvania Press.
  20. Fangraphs: WAR
  21. Schoenfield, David (July 19, 2012). "What we talk about when we talk about WAR". ESPN.com.
  22. John T. Saccoman; Gabriel R. Costa; Michael R. Huber (2009). Practicing Sabermetrics: Putting the Science of Baseball Statistics to Work. United States of America: McFarland & Company. ISBN 978-0-7864-4177-8.
  23. "The Changing Caught-Stealing Calculus | FanGraphs Baseball". FanGraphs Baseball. Retrieved 2016-12-06.
  24. Neyer, Rob (November 5, 2002). "Red Sox hire James in advisory capacity". ESPN.com. Retrieved March 7, 2009.
  25. Jaffe, C. (October 22, 2007). "Rob Neyer Interview". The Hardball Times. Retrieved November 2, 2007.
  26. "Baseball Prospectus: Glossary". www.baseballprospectus.com. Retrieved 2016-05-05.
  27. Nate Silver, "Introducing PECOTA," in Gary Huckabay, Chris Kahrl, Dave Pease et al., Eds., Baseball Prospectus 2003 (Dulles, VA: Brassey's Publishers, 2003): 507–514.
  28. "Baseball Prospectus". Retrieved 2012-03-04.
  29. "FanGraphs Baseball | Baseball Statistics and Analysis". FanGraphs Baseball. Retrieved 2024-05-26.