Win probability

Win probability is a statistical tool which suggests a sports team's chances of winning at any given point in a game, based on the performance of historical teams in the same situation. [1] The art of estimating win probability involves choosing which pieces of context matter. Baseball win probability estimates often include whether a team is home or away, inning, number of outs, which bases are occupied, and the score difference. Because baseball proceeds batter by batter, each new batter introduces a discrete state. There are a limited number of possible states, and so baseball win probability tools usually have enough data to make an informed estimate.
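
The state-based approach described above can be sketched as a lookup table of historical outcomes. This is a minimal illustration, not any particular tool's implementation: the state tuple layout, the function names `build_win_table` and `win_probability`, and the sample records are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical game state: (is_home, inning, outs, bases_occupied, score_diff)

def build_win_table(history):
    """history: iterable of (state, won) pairs, where won is True or False."""
    counts = defaultdict(lambda: [0, 0])  # state -> [wins, games]
    for state, won in history:
        counts[state][0] += int(won)
        counts[state][1] += 1
    return dict(counts)

def win_probability(table, state):
    """Return the historical win rate for a state, or None if it was never seen."""
    if state not in table:
        return None
    wins, games = table[state]
    return wins / games

# Made-up records purely for illustration, not real game data.
history = [
    ((True, 9, 2, "1__", -1), False),
    ((True, 9, 2, "1__", -1), False),
    ((True, 9, 2, "1__", -1), True),
    ((True, 9, 2, "1__", -1), False),
]
table = build_win_table(history)
```

Returning `None` for an unseen state makes the data-sparsity problem explicit: the more context a state includes, the more often the table has little or nothing to say about it.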

American football win probability estimates often include whether a team is home or away, the down and distance, score difference, time remaining, and field position. American football has many more possible states than baseball with far fewer games, so football estimates have a greater margin of error. The first win probability analysis was done in 1971 by Robert E. Machol and former NFL quarterback Virgil Carter.

As a brief example, guessing that each team playing at home will win is based on home advantage. This guess uses a single contextual factor and involves a very large number of games. But with only one factor, the accuracy of this guess is limited to home advantage itself (about 55–70% across sports) and does not change within the game based on in-game factors.

Win probability added (WPA) is the change in win probability between two game states, often used to measure how a play or team member affected the probable outcome of the game. [2] As a statistic, WPA attempts to measure a player's contribution to a win by adding up the change in win probability produced by each play that player made. It is used for baseball and American football.
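
As a rough sketch of the arithmetic, each play is credited with the win probability after the play minus the win probability before it, and a player's total is the sum over that player's plays. The function name `win_probability_added`, the player labels, and the probabilities below are hypothetical values chosen for illustration.

```python
def win_probability_added(plays):
    """plays: list of (player, wp_before, wp_after) tuples.

    Returns a dict mapping each player to the summed change in win probability.
    """
    totals = {}
    for player, wp_before, wp_after in plays:
        totals[player] = totals.get(player, 0.0) + (wp_after - wp_before)
    return totals

# Made-up sequence of plays for one team.
plays = [
    ("batter_a", 0.50, 0.62),  # play raises the team's chances
    ("batter_b", 0.62, 0.55),  # play lowers the team's chances
    ("batter_a", 0.55, 0.80),
]
wpa = win_probability_added(plays)
```

In this example `batter_a` accumulates about +0.37 and `batter_b` about -0.07, so the totals sum to the overall change from 0.50 to 0.80.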

Current research

Current research involves measuring the accuracy of win probability estimates, as well as quantifying the uncertainty in individual estimates. [3] [4] That is, if a tool estimates a 24% win probability because 24% of previous teams in that situation won their games, do future teams win at the same 24% rate? Answering this question means testing the estimates against held-out data that was not used to produce them, using tools such as cross-validation.
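
One way to sketch such a check is k-fold cross-validation followed by a calibration table: each record's prediction comes from a table built without that record's fold, and predictions are then grouped into bins to compare the average predicted probability with the observed win rate. The function names, the fold assignment by index, and the bin count are assumptions for illustration; a real study would typically shuffle or split by season.

```python
from collections import defaultdict

def held_out_predictions(records, k=5):
    """records: list of (state, won). Returns (predicted_wp, won) pairs in which
    every prediction is made from data outside the record's own fold."""
    pairs = []
    for fold in range(k):
        train = [r for i, r in enumerate(records) if i % k != fold]
        test = [r for i, r in enumerate(records) if i % k == fold]
        counts = defaultdict(lambda: [0, 0])  # state -> [wins, games]
        for state, won in train:
            counts[state][0] += int(won)
            counts[state][1] += 1
        for state, won in test:
            if state in counts:  # states unseen in training are skipped
                wins, games = counts[state]
                pairs.append((wins / games, bool(won)))
    return pairs

def calibration(pairs, n_bins=10):
    """Group predictions into equal-width bins.

    Returns bin index -> (mean predicted probability, observed win rate, count).
    """
    bins = defaultdict(list)
    for p, won in pairs:
        bins[min(int(p * n_bins), n_bins - 1)].append((p, won))
    return {
        b: (sum(p for p, _ in v) / len(v), sum(w for _, w in v) / len(v), len(v))
        for b, v in sorted(bins.items())
    }
```

A well-calibrated tool shows observed win rates close to the mean predicted probability in every bin; large gaps suggest the historical frequencies do not generalize.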

While many models involve frequency analysis of past events, other models use Bayesian processes. [5]
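
The text does not say which Bayesian models are used, so the following is only one simple illustration of the idea, not the model described in [5]: a Beta-Binomial update, in which a prior belief about the win rate in a state is combined with the observed wins and losses. The function name `beta_posterior` and the uniform default prior are assumptions.

```python
def beta_posterior(wins, games, prior_wins=1.0, prior_losses=1.0):
    """Posterior mean and standard deviation of a win rate under a Beta prior.

    prior_wins and prior_losses are the Beta parameters; 1 and 1 give a uniform prior.
    """
    a = prior_wins + wins
    b = prior_losses + (games - wins)
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance ** 0.5

# Same raw win rate (24%), very different amounts of evidence.
small_sample = beta_posterior(6, 25)
large_sample = beta_posterior(240, 1000)
```

Unlike a raw frequency, the posterior carries its own measure of uncertainty: the standard deviation shrinks as more games are observed in a state, and a state with no data falls back to the prior rather than being undefined.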

Some models include a measure of teams' strength coming into the game, while others assume every team is average. Including strength estimates increases the number of possible states, which leaves fewer historical games behind each estimate and therefore decreases the estimate's statistical power, while possibly increasing its accuracy. [6]

References

  1. FanGraphs: Win Expectancy at the Wayback Machine (archived November 9, 2014)
  2. Win Probability and Win Probability Added Explained at the Wayback Machine (archived December 15, 2014)
  3. Tango, Tom (October 2, 2006). "Misunderstanding Win Expectancy".
  4. Tango, Tom; Lichtman, Mitchel; Dolphin, Andrew (2007). The Book: Playing the Percentages in Baseball. Potomac Books, Inc. ISBN 978-1-59797-129-4.
  5. Football Commentary: Description of the Dynamic Programming Model at the Wayback Machine (archived November 21, 2014)
  6. Sabermetrics 101: The Game State, Run Expectancy, and Win Expectancy at the Wayback Machine (archived April 11, 2014)