Expected goals

Expected goals (xG) is the sum of the probabilities of goalscoring chances. It is calculated from the parameters of potential goalscoring chances, which include shots on goal and dangerous chances that end without a shot. [1] It is also used in ice hockey. [2] [3] In association football, expected goals is a performance metric used to evaluate team and player performance. [4]

Metric

Association football

Characteristics of a goalscoring chance that determine xG: coordinates, quality, body part, interference from the opponent (G. Kravtsov, "Applied Statistics").

There is some debate about the origin of the term expected goals. Vic Barnett and his colleague Sarah Hilditch referred to "expected goals" in their 1993 paper that investigated the effects of artificial pitch (AP) surfaces on home team performance in association football in England. [5] Their paper included this observation:

Quantitatively we find for the AP group about 0.15 more goals per home match than expected and, allowing for the lower than expected goals against in home matches, an excess goal difference (for home matches) of about 0.31 goals per home match. Over a season this yields about 3 more goals for, an improved goal difference of about 6 goals. [6]

Jake Ensum, Richard Pollard and Samuel Taylor (2004) reported their study of data from 37 matches in the 2002 World Cup in which 930 shots and 93 goals were recorded. [7] Their research sought "to investigate and quantify 12 factors that might affect the success of a shot". Their logistic regression identified five factors that had a significant effect on determining the success of a kicked shot: distance from the goal; angle from the goal; whether or not the player taking the shot was at least 1 m away from the nearest defender; whether or not the shot was immediately preceded by a cross; and the number of outfield players between the shot-taker and goal. [7] They concluded "the calculation of shot probabilities allows a greater depth of analysis of shooting opportunities in comparison to recording only the number of shots". [7] In a subsequent paper (2004), Ensum, Pollard and Taylor combined data from the 1986 and 2002 World Cup competitions to identify three significant factors that determined the success of a kicked shot: distance from the goal; angle from the goal; and whether or not the player taking the shot was at least 1 m away from the nearest defender. [8] More recent studies have identified similar factors as relevant for xG metrics. [9]
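
As a rough illustration of how a logistic regression of this kind turns shot characteristics into a probability, the sketch below uses the three factors from the combined 1986 and 2002 analysis: distance, angle, and whether the shooter has at least 1 m of space. The coefficients, the function name and the convention that the angle is measured in degrees away from straight in front of goal are all assumptions made for illustration; they are not the estimates reported by Ensum, Pollard and Taylor.

```python
import math

# Hypothetical coefficients for illustration only; not taken from the cited papers.
INTERCEPT = 0.3
COEF_DISTANCE_M = -0.15   # per metre from goal
COEF_ANGLE_DEG = -0.02    # per degree away from straight in front of goal (assumed convention)
COEF_SPACE = 0.7          # added when the nearest defender is at least 1 m away


def shot_probability(distance_m: float, angle_deg: float, has_space: bool) -> float:
    """Return an illustrative goal probability for a kicked shot via the logistic function."""
    z = (INTERCEPT
         + COEF_DISTANCE_M * distance_m
         + COEF_ANGLE_DEG * angle_deg
         + COEF_SPACE * (1.0 if has_space else 0.0))
    return 1.0 / (1.0 + math.exp(-z))


close_open = shot_probability(distance_m=8, angle_deg=10, has_space=True)
far_pressed = shot_probability(distance_m=25, angle_deg=30, has_space=False)
print(f"close, unpressured shot: {close_open:.2f}")
print(f"distant, pressured shot: {far_pressed:.2f}")
```

With any coefficients of these signs, a closer, more central and less pressured shot receives a higher probability, which is the qualitative pattern the studies describe.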

Howard Hamilton (2009) proposed "a useful statistic in soccer" that "will ultimately contribute to what I call an 'expected goal value' — for any action on the field in the course of a game, the probability that said action will create a goal". [10]

Sander IJtsma (2011) discussed "a method to assign different value to different chances created during a football match" and in doing so concluded: [11]

we now have a system in place in order to estimate the overall value of the chances created by either team during the match. Knowing how many goals a team is expected to score from its chances is of much more value than just knowing how many attempts to score a goal were made. Other applications of this method of evaluation would be to distinguish a lack of quality attempts created from a finishing problem or to evaluate defensive and goalkeeping performances. And a third option would be to plot the balance of play during the match in terms of the quality of chances created in order to graphically represent how the balance of play evolved during the match. [11]

Sarah Rudd (2011) discussed probable goal scoring patterns (P(Goal)) in her use of Markov chains for tactical analysis (including the proximity of defenders) from 123 games in the 2010-2011 English Premier League season. [12] In a video presentation of her paper at the 2011 New England Symposium of Statistics in Sport, Rudd reported her use of analysis methods to compare "expected goals" with actual goals and her process of applying weightings to incremental actions for P(goal) outcomes. [13]
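
The general idea of reading a goal probability off a Markov chain can be sketched as follows. The states, transition probabilities and function name are hypothetical and much simpler than anything in Rudd's paper, whose actual state space (including how defender proximity was encoded) is not described here. Each possession moves between transient states until it is absorbed in either "goal" or "turnover", and P(goal) for a state is the probability of eventually reaching "goal" from it.

```python
# Hypothetical transition probabilities between possession states.
# "goal" and "turnover" are absorbing: the possession ends there.
TRANSITIONS = {
    "midfield": {"midfield": 0.5, "final_third": 0.25, "turnover": 0.25},
    "final_third": {"midfield": 0.25, "goal": 0.25, "turnover": 0.5},
}
ABSORBING_VALUE = {"goal": 1.0, "turnover": 0.0}


def goal_probabilities(transitions, iterations=200):
    """Estimate P(goal) for each transient state by repeated substitution."""
    p = {state: 0.0 for state in transitions}
    for _ in range(iterations):
        p = {
            state: sum(
                prob * (ABSORBING_VALUE[nxt] if nxt in ABSORBING_VALUE else p[nxt])
                for nxt, prob in row.items()
            )
            for state, row in transitions.items()
        }
    return p


p_goal = goal_probabilities(TRANSITIONS)
for state, value in p_goal.items():
    print(f"P(goal | {state}) = {value:.3f}")
```

Under these made-up numbers, the value of an action could be expressed as the change in P(goal) between the state before it and the state after it, which is one way to read the "weightings to incremental actions" mentioned above.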

In April 2012, Sam Green wrote about 'expected goals' in his assessment of Premier League goalscorers. [14] He asked: "So how do we quantify which areas of the pitch are the most likely to result in a goal and therefore, which shots have the highest probability of resulting in a goal?" He added:

If we can establish this metric, we can then accurately and effectively increase our chances of scoring and therefore winning matches. Similarly, we can use this data from a defensive perspective to limit the better chances by defending key areas of the pitch. [14]

Green proposed a model to determine "a shot's probability of being on target and/or scored". With this model "we can look at each player's shots and tally up the probability of each of them being a goal to give an expected goal (xG) value". [14]
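
The tallying step Green describes amounts to summing per-shot probabilities for each player. The sketch below assumes the per-shot probabilities have already been produced by some model; the player names, values and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical shot records: (player, estimated probability that the shot is scored).
shots = [
    ("Player A", 0.25),
    ("Player A", 0.5),
    ("Player B", 0.125),
    ("Player A", 0.125),
    ("Player B", 0.25),
]


def expected_goals_by_player(shot_records):
    """Sum the goal probabilities of each player's shots to get an xG value per player."""
    totals = defaultdict(float)
    for player, probability in shot_records:
        totals[player] += probability
    return dict(totals)


xg = expected_goals_by_player(shots)
for player, value in sorted(xg.items()):
    print(player, value)
```

Comparing each total with the goals the player actually scored is the kind of use the article describes for assessing goalscorers.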

Ice hockey

In 2004, Alan Ryder shared a methodology for studying the quality of an ice hockey shot on goal. His discussion started with the sentence: "Not all shots on goal are created equal". [15] Ryder's model for the measurement of shot quality was:

  • Collect the data and analyze goal probabilities for each shooting circumstance
  • Build a model of goal probabilities that relies on the measured circumstance
  • For each shot, determine its goal probability
  • Expected Goals: EG = the sum of the goal probabilities for each shot
  • Neutralize the variation in shots on goal by calculating Normalized Expected Goals
  • Shot Quality Against
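
The first five steps above can be sketched as follows, under stated assumptions. The data are hypothetical, the "model" is simply the empirical goal rate per shooting circumstance, and the normalization shown (rescaling to a league-average shot count) is one plausible reading of "neutralize the variation in shots on goal" rather than Ryder's published formula. The final Shot Quality Against step is not attempted.

```python
from collections import defaultdict

# Hypothetical historical shots: (shooting circumstance, 1 if goal else 0).
history = [
    ("close", 1), ("close", 0), ("close", 0), ("close", 1),
    ("far", 0), ("far", 0), ("far", 0), ("far", 1),
    ("far", 0), ("far", 0), ("far", 0), ("far", 0),
]


def goal_rates(records):
    """Steps 1-2: empirical goal probability for each shooting circumstance."""
    shots = defaultdict(int)
    goals = defaultdict(int)
    for circumstance, scored in records:
        shots[circumstance] += 1
        goals[circumstance] += scored
    return {c: goals[c] / shots[c] for c in shots}


def expected_goals(team_shots, rates):
    """Steps 3-4: look up each shot's goal probability and sum them (EG)."""
    return sum(rates[c] for c in team_shots)


def normalized_expected_goals(eg, shots_taken, league_avg_shots):
    """Step 5 (assumed form): rescale EG to a league-average number of shots."""
    return eg * league_avg_shots / shots_taken


rates = goal_rates(history)
team_shots_against = ["close", "far", "far", "close", "far", "far"]
eg = expected_goals(team_shots_against, rates)
neg = normalized_expected_goals(eg, len(team_shots_against), league_avg_shots=8)
print(rates, eg, neg)
```

A real model would condition on more than one circumstance per shot, but the flow from measured rates to per-shot probabilities to a summed total is the same.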

Ryder concluded:

The model to get to expected goals given the shot quality factors is simply based on the data. There are no meaningful assumptions made. The analytic methods are the classics from statistics and actuarial science. The results are therefore very credible. [16]

In 2007, Ryder issued a "product recall notice" for his shot quality model. [2] He presented "a cautionary note on the calculation of shot quality" and pointed to "data quality problems with the measurement of the quality of a hockey team's shots taken and allowed". [2]

He reported:

I have been worried that there is a systemic bias in the data. Random errors don’t concern me. They even out over large volumes of data. But I do think that ... the scoring in certain rinks has a bias towards longer or shorter shots, the most dominant factor in a shot quality model. And I set out to investigate that possibility. [2]

The term 'expected goals' appeared in a paper about ice hockey performance presented by Brian Macdonald [3] at the MIT Sloan Sports Analytics Conference in 2012. Macdonald's method for calculating expected goals was reported in the paper:

We used data from the last four full NHL seasons. For each team, the season was split into two halves. Since midseason trades and injuries can have an impact on a team’s performance, we did not use statistics from the first half of the season to predict goals in the second half. Instead, we split the season into odd and even games, and used statistics from odd games to predict goals in even games. Data from 2007-08, 2008-09, and 2009-10 was used as the training data to estimate the parameters in the model, and data from the entire 2010-11 was set aside for validating the model. The model was also validated using 10-fold cross-validation. Mean squared error (MSE) of actual goals and predicted goals was our choice for measuring the performance of our models. [3]
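
The odd/even split and the MSE evaluation in the passage above can be illustrated with a toy example. The team names and goal counts are hypothetical, and the "model" here simply predicts each team's even-game goal total from its odd-game total, which stands in for the statistical models in Macdonald's paper. The separate hold-out season and the 10-fold cross-validation are not shown.

```python
# Hypothetical goals per game, in schedule order (game 1 first), for three teams.
goals_by_team = {
    "Team A": [2, 3, 1, 4, 3, 2],
    "Team B": [1, 1, 2, 0, 3, 2],
    "Team C": [4, 2, 2, 4, 1, 1],
}


def split_odd_even(values):
    """Games are numbered from 1, so index 0 is game 1 (odd)."""
    return values[0::2], values[1::2]


def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


predicted, actual = [], []
for team, goals in goals_by_team.items():
    odd_games, even_games = split_odd_even(goals)
    predicted.append(sum(odd_games))   # stand-in prediction from odd games
    actual.append(sum(even_games))     # goals actually scored in even games

mse = mean_squared_error(actual, predicted)
print(predicted, actual, mse)
```

Splitting by odd and even games, rather than by first and second half, means both subsets are exposed to roughly the same roster changes and injuries, which is the motivation given in the quoted passage.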

References

  1. G. Kravtsov, Football Weekly No. 19 (2705), 2012. "Digitization. Balance sheet" (Zenit–CSKA).
  2. Ryder, Alan (2007). "Product Recall Notice for 'Shot Quality'" (PDF). Retrieved 5 January 2018.
  3. Macdonald, Brian (March 2012). "An Expected Goals Model for Evaluating NHL Teams and Players" (PDF). Retrieved 3 January 2018.
  4. "xG stats for teams and players from the TOP European leagues". understat.com.
  5. Barnett, Vic; Hilditch, S (1993). "The Effect of an Artificial Pitch Surface on Home Team Performance in Football (Soccer)". Journal of the Royal Statistical Society. Series A (Statistics in Society). 156 (1): 39–50. doi:10.2307/2982859. JSTOR 2982859.
  6. Barnett, Vic; Hilditch, S (1993). "The Effect of an Artificial Pitch Surface on Home Team Performance in Football (Soccer)". Journal of the Royal Statistical Society. Series A (Statistics in Society). 156 (1): 47. doi:10.2307/2982859. JSTOR 2982859.
  7. Ensum, Jake; Pollard, Richard; Taylor, Samuel (2004). "Applications of logistic regression to shots at goal in association football: calculation of shot probabilities, quantification of factors and player/team". Journal of Sports Sciences. 22 (6): 504.
  8. Pollard, Richard; Ensum, Jake; Taylor, Samuel (2004). "Estimating the probability of a shot resulting in a goal: The effects of distance, angle and space". International Journal of Soccer and Science. 2 (1): 50–55.
  9. "An examination of expected goals and shot efficiency in soccer" (PDF). Universidad de Alicante.
  10. Hamilton, Howard (8 January 2009). "Moneyball and soccer". Retrieved 6 February 2018.
  11. IJtsma, Sander (13 July 2011). "A chance is a chance is a chance?". Retrieved 4 January 2018.
  12. Rudd, Sarah (24 September 2011). "A Framework for Tactical Analysis and Individual Offensive Production Assessment in Soccer Using Markov Chains" (PDF). Retrieved 7 February 2018.
  13. 2011 NESSIS - Talk by Sarah Rudd on YouTube
  14. Green, Sam (12 April 2012). "Assessing the performance of Premier League goalscorers". Retrieved 4 January 2018.
  15. Ryder, Alan (January 2004). "Shot quality" (PDF). p. 2. Retrieved 4 January 2018.
  16. Ryder, Alan (January 2004). "Shot quality" (PDF). p. 15. Retrieved 5 January 2018.

Further reading