Jurimetrics is the application of quantitative methods, especially probability and statistics, to law. [1] In the United States, the journal Jurimetrics is published by the American Bar Association and Arizona State University. [2] The Journal of Empirical Legal Studies is another publication that emphasizes the statistical analysis of law.
The term was coined in 1949 by Lee Loevinger in his article "Jurimetrics: The Next Step Forward". [1] [3] Showing the influence of Oliver Wendell Holmes Jr., Loevinger quoted [4] Holmes' celebrated phrase that:
"For the rational study of the law the blackletter man may be the man of the present, but the man of the future is the man of statistics and the master of economics." [5]
The first work on this topic is attributed to Nicolaus I Bernoulli in his doctoral dissertation De Usu Artis Conjectandi in Jure, written in 1709.
The difference between jurimetrics and law and economics is that jurimetrics investigates legal questions from a probabilistic/statistical point of view, while law and economics addresses legal questions using standard microeconomic analysis. Specifically, jurimetrics uncovers patterns in decision-making and uses them to identify potential biases in the judgements handed down. A synthesis of these fields is possible through the use of econometrics (statistics for economic analysis) and other quantitative methods to answer relevant legal questions. For example, the Columbia University scholar Edgardo Buscaglia published several peer-reviewed articles using a joint jurimetrics and law and economics approach. [6] [7]
In 2018, California's legislature passed Senate Bill 826, which requires all publicly held corporations based in the state to have a minimum number of women on their board of directors. [37] [38] Boards with five or fewer members must have at least two women, while boards with six or more members must have at least three women.
Using the binomial distribution, we may compute the probability of violating the rule laid out in Senate Bill 826 as a function of the number of board members. The probability mass function for the binomial distribution is:

$$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k},$$

where $p$ is the probability of success, $k$ is the number of successes in $n$ trials, and $\binom{n}{k}$ is the binomial coefficient. For this computation, $p$ is the probability that a person qualified for board service is female, $k$ is the number of female board members, and $n$ is the number of board seats. We will assume that $p = 0.5$.
Depending on the number of board members, the probability of a violation is given by the cumulative distribution function:

$$P(X \le m) = \sum_{k=0}^{m} \binom{n}{k} p^k (1-p)^{n-k},$$

where $m = 1$ for boards with five or fewer seats (fewer than two women) and $m = 2$ for boards with six or more seats (fewer than three women). With these formulas, we are able to compute the probability of violating Senate Bill 826 by chance:
Number of board seats ($n$) | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|
Probability of violation | 0.50 | 0.31 | 0.19 | 0.34 | 0.23 | 0.14 | 0.09 | 0.05 | 0.03 | 0.02 |
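These values can be checked with a short computation; the sketch below sums the binomial probability mass function up to the relevant quota (the helper name `prob_violation` is introduced here only for illustration).

```python
from math import comb

def prob_violation(n_seats: int, p_female: float) -> float:
    """Probability that a randomly composed board falls below the SB 826 quota,
    assuming each seat is filled by a woman independently with probability p_female."""
    # Quota as described above: at least two women on boards with five or fewer
    # seats, at least three women on boards with six or more seats.
    quota = 2 if n_seats <= 5 else 3
    # Sum the binomial PMF over outcomes with fewer women than the quota.
    return sum(comb(n_seats, k) * p_female**k * (1 - p_female)**(n_seats - k)
               for k in range(quota))

for n in range(3, 13):
    print(n, round(prob_violation(n, 0.5), 2))  # reproduces the table above
```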
As Ilya Somin points out, [37] a significant percentage of firms, without any history of sex discrimination, could be in violation of the law purely by chance.
In more male-dominated industries, such as technology, there could be an even greater imbalance. Suppose that, instead of parity, the probability that a person qualified for board service is female is 40%; this is likely a high estimate, given the predominance of males in the technology industry. Then the probability of violating Senate Bill 826 by chance may be recomputed as:
Number of board seats ($n$) | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
---|---|---|---|---|---|---|---|---|---|---|
Probability of violation | 0.65 | 0.48 | 0.34 | 0.54 | 0.42 | 0.32 | 0.23 | 0.17 | 0.12 | 0.08 |
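Assuming the `prob_violation` helper sketched earlier, this second table follows from the same computation with the lower assumed probability:

```python
for n in range(3, 13):
    print(n, round(prob_violation(n, 0.4), 2))  # e.g. n = 3 gives 0.65, n = 12 gives 0.08
```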
In recent years, there has been a growing interest in the use of screening tests to identify drug users on welfare, potential mass shooters, [39] and terrorists. [40] The efficacy of screening tests can be analyzed using Bayes' theorem.
Suppose that there is some binary screening procedure for an action $A$ that identifies a person as testing positive ($+$) or negative ($-$) for the action. Bayes' theorem tells us that the conditional probability of taking action $A$, given a positive test result, is:

$$P(A \mid +) = \frac{P(+ \mid A)\,P(A)}{P(+)}$$

For any screening test, we must be cognizant of its sensitivity and specificity. The screening test has sensitivity $P(+ \mid A)$ and specificity $P(- \mid \neg A)$. The sensitivity and specificity can be analyzed using concepts from the standard theory of statistical hypothesis testing: the sensitivity is the true positive rate ($1 - \beta$, the complement of the Type II error rate), and the specificity is the true negative rate ($1 - \alpha$, the complement of the Type I error rate).
Therefore, the form of Bayes' theorem that is pertinent to us is:

$$P(A \mid +) = \frac{P(+ \mid A)\,P(A)}{P(+ \mid A)\,P(A) + P(+ \mid \neg A)\,P(\neg A)} = \frac{\text{sensitivity} \cdot P(A)}{\text{sensitivity} \cdot P(A) + (1 - \text{specificity})\,(1 - P(A))}$$

Suppose that we have developed a test with sensitivity and specificity of 99%, which is likely to be higher than most real-world tests. We can examine two scenarios to see how well this hypothetical test works: screening welfare recipients for drug use, with a base rate of 1.5%, and screening for potential mass shooters, with a base rate of 0.01%.
With these base rates and the hypothetical values of sensitivity and specificity, we may calculate the posterior probability that a positive result indicates the individual will actually engage in each of the actions:
Scenario | Drug Use | Mass Shooting |
---|---|---|
Posterior probability $P(A \mid +)$ | 0.6012 | 0.0098 |
Even with very high sensitivity and specificity, the screening tests only return posterior probabilities of 60.1% and 0.98%, respectively, for each action. Under more realistic circumstances, screening would likely prove even less useful than under these hypothetical conditions. The problem with any screening procedure for rare events is that it is very likely to be too imprecise, flagging too many people as being at risk of engaging in the undesirable action.
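A minimal sketch of this calculation, using the hypothetical 99% sensitivity and specificity and the base rates assumed above:

```python
def posterior(base_rate: float, sensitivity: float, specificity: float) -> float:
    """P(action | positive test) computed via Bayes' theorem."""
    true_positive = sensitivity * base_rate
    false_positive = (1 - specificity) * (1 - base_rate)
    return true_positive / (true_positive + false_positive)

print(round(posterior(0.015, 0.99, 0.99), 4))   # drug use: 0.6012
print(round(posterior(0.0001, 0.99, 0.99), 4))  # mass shooting: 0.0098
```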
Jurimetrics employs a range of statistical methods to analyze judicial behavior, uncovering patterns in decision-making and using them to identify potential biases in judgements. For instance, statistical analysis can forecast the outcomes of cases, providing insights into expected resolutions based on historical data. Jurimetrics is also used to evaluate litigation trends, optimize legal strategies, and improve the efficiency of legal proceedings. [42]
One example of an application of jurimetrics is resource allocation within court systems, where data analytics are used to identify potential difficulties and suggest improvements. Another example is the analysis of disparities in sentencing, which allows policymakers to address inequities in legal practice. These applications illustrate the role of jurimetrics in the legal system as a bridge between quantitative analysis and equitable judicial processes. [42]
Bayes' theorem states that, for events $A$ and $B$, the conditional probability of $A$ occurring, given that $B$ has occurred, is:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

Using the law of total probability, we may expand the denominator as:

$$P(B) = P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A)$$

Then Bayes' theorem may be rewritten as:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \neg A)\,P(\neg A)}$$

This may be simplified further by defining the prior odds of event $A$ occurring and the likelihood ratio as:

$$O(A) = \frac{P(A)}{P(\neg A)}, \qquad L = \frac{P(B \mid A)}{P(B \mid \neg A)}$$

Then the compact form of Bayes' theorem is:

$$P(A \mid B) = \frac{O(A)\,L}{1 + O(A)\,L}$$

Different values of the posterior probability, based on the prior odds and likelihood ratio, are computed in the following table:
Prior Odds \ Likelihood Ratio | 1 | 2 | 3 | 4 | 5 | 10 | 15 | 20 | 25 | 50 |
---|---|---|---|---|---|---|---|---|---|---|
0.01 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.09 | 0.13 | 0.17 | 0.20 | 0.33 |
0.02 | 0.02 | 0.04 | 0.06 | 0.07 | 0.09 | 0.17 | 0.23 | 0.29 | 0.33 | 0.50 |
0.03 | 0.03 | 0.06 | 0.08 | 0.11 | 0.13 | 0.23 | 0.31 | 0.38 | 0.43 | 0.60 |
0.04 | 0.04 | 0.07 | 0.11 | 0.14 | 0.17 | 0.29 | 0.38 | 0.44 | 0.50 | 0.67 |
0.05 | 0.05 | 0.09 | 0.13 | 0.17 | 0.20 | 0.33 | 0.43 | 0.50 | 0.56 | 0.71 |
0.10 | 0.09 | 0.17 | 0.23 | 0.29 | 0.33 | 0.50 | 0.60 | 0.67 | 0.71 | 0.83 |
0.15 | 0.13 | 0.23 | 0.31 | 0.38 | 0.43 | 0.60 | 0.69 | 0.75 | 0.79 | 0.88 |
0.20 | 0.17 | 0.29 | 0.38 | 0.44 | 0.50 | 0.67 | 0.75 | 0.80 | 0.83 | 0.91 |
0.25 | 0.20 | 0.33 | 0.43 | 0.50 | 0.56 | 0.71 | 0.79 | 0.83 | 0.86 | 0.93 |
0.30 | 0.23 | 0.38 | 0.47 | 0.55 | 0.60 | 0.75 | 0.82 | 0.86 | 0.88 | 0.94 |
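The entries in the table follow directly from the compact form of Bayes' theorem; a minimal sketch of the calculation:

```python
def posterior_from_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Posterior probability from the compact (odds) form of Bayes' theorem."""
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

for prior_odds in (0.01, 0.05, 0.30):
    print(prior_odds,
          [round(posterior_from_odds(prior_odds, lr), 2)
           for lr in (1, 2, 3, 4, 5, 10, 15, 20, 25, 50)])
# e.g. prior odds 0.05 -> 0.05, 0.09, 0.13, 0.17, 0.20, 0.33, 0.43, 0.50, 0.56, 0.71
```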
If we take $A$ to be some criminal behavior and $B$ a criminal complaint or accusation, Bayes' theorem allows us to determine the conditional probability of a crime being committed. More sophisticated analyses of evidence can be undertaken with the use of Bayesian networks.
As in many other fields, jurimetrics has changed dynamically with technological advancements. The integration of artificial intelligence (AI) into legal processes has been an emerging trend. Machine learning algorithms have been used frequently to analyze legal texts, predict case outcomes, and provide data-driven insights to legal professionals.
Technological advancements such as AI have been used to create legal analytics platforms. These platforms can review large amounts of case law and identify patterns that assist in crafting legal arguments. Such innovations not only improve decision-making processes by reducing the likelihood of human error, but also increase the efficiency of legal research. [43]
For example, recent studies highlight the effectiveness of machine learning in analyzing complex datasets, such as those found in healthcare or legal domains, with high accuracy. One application discussed by Christian Garbin, Nicholas Marques, and Oge Marques (2023) involves the use of machine-learning models to identify specific patterns in datasets characterized by class imbalances. The article discusses datasets related to opioid use disorder (OUD), and how judgements passed in legal settings can depend on such heavily imbalanced datasets. [44]
Despite many advancements, the integration of AI into jurimetrics presents challenges. Garbin, Marques, and Marques emphasize that many studies using machine learning fail to transparently document essential steps, such as data preprocessing, hyperparameter tuning, or the criteria used for splitting training and test sets. [44]
Garbin, Marques, and Marques recommend prioritizing interpretable models unless the performance gap justifies the use of less transparent algorithms. Since legal decisions are high-stakes, interpretable models (such as logistic regression or decision trees) are often preferred over more complex "black-box" models. Although "black-box" models often have higher predictive accuracy, their lack of interpretability is a central practical and ethical concern. [44]
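As a hedged illustration of this preference (not code from the cited study), the sketch below fits a simple logistic regression whose coefficients can be read directly; the feature names and data are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical case features: [number of prior offenses, months since last offense].
X = np.array([[0, 30], [1, 14], [3, 4], [5, 2], [2, 10], [4, 3]])
# Hypothetical outcomes: 1 = re-offended within two years.
y = np.array([0, 0, 1, 1, 0, 1])

model = LogisticRegression(max_iter=1000).fit(X, y)

# Interpretability: each coefficient shows how a feature shifts the log-odds
# of the predicted outcome, so the model's reasoning can be inspected and challenged.
print(model.coef_, model.intercept_)
print(model.predict_proba([[2, 6]]))  # predicted probabilities for a new hypothetical case
```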
The term "jurimetrics" was created in 1949 by Lee Loevinger. [45] It was defined as the use of quantitative methods to the study of law. Initially, jurimetrics was specifically focused on the theoretical exploration of statistical techniques on legal systems. [43]
Over time, the field evolved. In the mid-20th century, jurimetrics began to gain traction as researchers explored its potential for improving legal analysis, and early foundational studies created a roadmap for integrating the practice into the legal field. By the late 20th century, jurimetrics had expanded to include applications such as evaluating the reliability of forensic evidence and modeling litigation outcomes.
Today, jurimetrics is recognized as a tool of the modern legal system, bridging the gaps between economics, data science, and the law.
In 2021, Abigail Z. Jacobs and Hanna Wallach released a study on computational systems, observing that they often involve unobservable theoretical constructs, such as socioeconomic status, teacher effectiveness, and risk of recidivism. [46] They note that "computational systems have long been touted as having the potential to counter societal biases and structural inequalities, yet recent work has demonstrated that they often end up encoding and exacerbating them instead". [46]
An example of the ethical concerns in jurimetrics comes from risk assessment models used in the U.S. justice system, particularly the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) tool. COMPAS was developed by Northpointe (now Equivant) and built to evaluate a defendant's likelihood of recidivism by analyzing factors derived from official records and interviews. The factors are grouped into four dimensions: prior criminal history, associations with criminals, drug involvement, and indicators of juvenile delinquency.
The risk assessment model then uses these factors in a regression model to generate a recidivism risk score, scaled from one to ten, with ten indicating the highest risk. In the model, recidivism is defined as a new misdemeanor or felony arrest within two years. However, the specific mathematical methodology that COMPAS uses remains private, which has raised concerns regarding transparency. Subsequent investigations, such as that by Angwin et al., [47] have critiqued the model for potential biases and their ethical implications. [48]
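Since the actual COMPAS methodology is proprietary and not public, it cannot be reproduced here; the sketch below is only a generic, hypothetical illustration of how a regression-based score might be mapped onto a one-to-ten scale, with all factor values and weights invented.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Invented factor scores (not COMPAS's actual inputs or weights): prior criminal
# history, criminal associations, drug involvement, juvenile delinquency indicators.
X = np.array([[1, 0, 2, 0],
              [4, 3, 5, 2],
              [0, 1, 1, 0],
              [5, 4, 4, 3],
              [2, 2, 3, 1]])
# Hypothetical outcomes: 1 = new misdemeanor or felony arrest within two years.
y = np.array([0, 1, 0, 1, 0])

reg = LinearRegression().fit(X, y)

def risk_score(factors) -> int:
    """Map the regression output onto a one-to-ten scale, ten being highest risk."""
    raw = reg.predict(np.asarray(factors, dtype=float).reshape(1, -1))[0]
    return int(np.clip(round(raw * 9) + 1, 1, 10))

print(risk_score([3, 2, 4, 1]))
```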
Statistical inference is the process of using data analysis to infer properties of an underlying probability distribution. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population.
Bayes' theorem gives a mathematical rule for inverting conditional probabilities, allowing us to find the probability of a cause given its effect. For example, if the risk of developing health problems is known to increase with age, Bayes' theorem allows the risk to an individual of a known age to be assessed more accurately by conditioning it relative to their age, rather than assuming that the individual is typical of the population as a whole. Based on Bayes' law, both the prevalence of a disease in a given population and the error rate of an infectious disease test must be taken into account to evaluate the meaning of a positive test result correctly and avoid the base-rate fallacy.
Bayesian inference is a method of statistical inference in which Bayes' theorem is used to calculate a probability of a hypothesis, given prior evidence, and update it as more information becomes available. Fundamentally, Bayesian inference uses a prior distribution to estimate posterior probabilities. Bayesian inference is an important technique in statistics, and especially in mathematical statistics. Bayesian updating is particularly important in the dynamic analysis of a sequence of data. Bayesian inference has found application in a wide range of activities, including science, engineering, philosophy, medicine, sport, and law. In the philosophy of decision theory, Bayesian inference is closely related to subjective probability, often called "Bayesian probability".
In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statistical model, the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. The logic of maximum likelihood is both intuitive and flexible, and as such the method has become a dominant means of statistical inference.
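A minimal numerical sketch: for normally distributed data with known variance, maximizing the likelihood over the mean recovers the sample mean.

```python
import numpy as np
from scipy.optimize import minimize_scalar

data = np.array([2.1, 1.8, 2.5, 2.2, 1.9])  # observed sample

def neg_log_likelihood(mu: float) -> float:
    """Negative log-likelihood of a Normal(mu, 1) model for the data (up to a constant)."""
    return 0.5 * np.sum((data - mu) ** 2)

result = minimize_scalar(neg_log_likelihood)
print(result.x, data.mean())  # the maximum likelihood estimate coincides with the sample mean
```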
In probability theory, de Finetti's theorem states that exchangeable observations are conditionally independent relative to some latent variable. An epistemic probability distribution could then be assigned to this variable. It is named in honor of Bruno de Finetti.
In mathematics, a random walk, sometimes known as a drunkard's walk, is a stochastic process that describes a path that consists of a succession of random steps on some mathematical space.
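A minimal sketch of a simple one-dimensional random walk, simulated by cumulatively summing random ±1 steps:

```python
import numpy as np

rng = np.random.default_rng(0)
steps = rng.choice([-1, 1], size=1000)  # each step is +1 or -1 with equal probability
positions = np.cumsum(steps)            # position on the integer line after each step
print(positions[:10], positions[-1])
```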
In probability theory and statistics, a Gaussian process is a stochastic process (a collection of random variables indexed by time or space), such that every finite collection of those random variables has a multivariate normal distribution. The distribution of a Gaussian process is the joint distribution of all those random variables, and as such, it is a distribution over functions with a continuous domain, e.g. time or space.
Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability, where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation, which views probability as the limit of the relative frequency of an event after many trials. More concretely, analysis in Bayesian methods codifies prior knowledge in the form of a prior distribution.
A prior probability distribution of an uncertain quantity, often simply called the prior, is its assumed probability distribution before some evidence is taken into account. For example, the prior could be the probability distribution representing the relative proportions of voters who will vote for a particular politician in a future election. The unknown quantity may be a parameter of the model or a latent variable rather than an observable variable.
The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another, first proposed in 1969. Ordinarily, regressions reflect "mere" correlations, but Clive Granger argued that causality in economics could be tested for by measuring the ability to predict the future values of a time series using prior values of another time series. Since the question of "true causality" is deeply philosophical, and because of the post hoc ergo propter hoc fallacy of assuming that one thing preceding another can be used as a proof of causation, econometricians assert that the Granger test finds only "predictive causality". Using the term "causality" alone is a misnomer, as Granger-causality is better described as "precedence", or, as Granger himself later claimed in 1977, "temporally related". Rather than testing whether X causes Y, Granger causality tests whether X forecasts Y.
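A minimal sketch using the grangercausalitytests function from statsmodels, on simulated series constructed so that lagged values of x help forecast y (the data are illustrative only):

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
x = rng.normal(size=200)
# y depends on lagged x, so x should help forecast ("Granger-cause") y.
y = np.concatenate(([0.0], 0.8 * x[:-1])) + 0.1 * rng.normal(size=200)

# The test asks whether the second column helps forecast the first.
data = np.column_stack([y, x])
results = grangercausalitytests(data, maxlag=2)
```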
In game theory, a Bayesian game is a strategic decision-making model which assumes players have incomplete information. Players may hold private information relevant to the game, meaning that the payoffs are not common knowledge. Bayesian games model the outcome of player interactions using aspects of Bayesian probability. They are notable because they allowed, for the first time in game theory, for the specification of the solutions to games with incomplete information.
Bootstrapping is a procedure for estimating the distribution of an estimator by resampling one's data or a model estimated from the data. Bootstrapping assigns measures of accuracy to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using random sampling methods.
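A minimal sketch of the nonparametric bootstrap, estimating the standard error of a sample mean by resampling with replacement:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=50)  # observed sample

# Resample the data with replacement many times and recompute the statistic.
boot_means = np.array([rng.choice(data, size=data.size, replace=True).mean()
                       for _ in range(5000)])

print(data.mean(), boot_means.std())  # point estimate and its bootstrap standard error
```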
Bayesian inference of phylogeny combines the information in the prior and in the data likelihood to create the so-called posterior probability of trees, which is the probability that the tree is correct given the data, the prior and the likelihood model. Bayesian inference was introduced into molecular phylogenetics in the 1990s by three independent groups: Bruce Rannala and Ziheng Yang in Berkeley, Bob Mau in Madison, and Shuying Li at the University of Iowa, the last two being PhD students at the time. The approach has become very popular since the release of the MrBayes software in 2001, and is now one of the most popular methods in molecular phylogenetics.
In probability theory, Dirichlet processes are a family of stochastic processes whose realizations are probability distributions. In other words, a Dirichlet process is a probability distribution whose range is itself a set of probability distributions. It is often used in Bayesian inference to describe the prior knowledge about the distribution of random variables—how likely it is that the random variables are distributed according to one or another particular distribution.
Polynomial chaos (PC), also called polynomial chaos expansion (PCE) and Wiener chaos expansion, is a method for representing a random variable in terms of a polynomial function of other random variables. The polynomials are chosen to be orthogonal with respect to the joint probability distribution of these random variables. Note that despite its name, PCE has no immediate connections to chaos theory. The word "chaos" here should be understood as "random".
In economics and game theory, an all-pay auction is an auction in which every bidder must pay regardless of whether they win the prize, which is awarded to the highest bidder as in a conventional auction. As shown by Riley and Samuelson (1981), equilibrium bidding in an all pay auction with private information is revenue equivalent to bidding in a sealed high bid or open ascending price auction.
Bayesian econometrics is a branch of econometrics which applies Bayesian principles to economic modelling. Bayesianism is based on a degree-of-belief interpretation of probability, as opposed to a relative-frequency interpretation.
In Bayesian inference, the Bernstein–von Mises theorem provides the basis for using Bayesian credible sets for confidence statements in parametric models. It states that under some conditions, a posterior distribution converges in total variation distance to a multivariate normal distribution centered at the maximum likelihood estimator with covariance matrix given by $n^{-1} \mathcal{I}(\theta_0)^{-1}$, where $\theta_0$ is the true population parameter and $\mathcal{I}(\theta_0)$ is the Fisher information matrix at the true population parameter value.
In mathematics — specifically, in the fields of probability theory and inverse problems — Besov measures and associated Besov-distributed random variables are generalisations of the notions of Gaussian measures and random variables, Laplace distributions, and other classical distributions. They are particularly useful in the study of inverse problems on function spaces for which a Gaussian Bayesian prior is an inappropriate model. The construction of a Besov measure is similar to the construction of a Besov space, hence the nomenclature.
Stochastic transitivity models are stochastic versions of the transitivity property of binary relations studied in mathematics. Several models of stochastic transitivity exist and have been used to describe the probabilities involved in experiments of paired comparisons, specifically in scenarios where transitivity is expected but empirical observations of the binary relation are probabilistic. For example, players' skills in a sport might be expected to be transitive, i.e. "if player A is better than B and B is better than C, then player A must be better than C"; however, in any given match, a weaker player might still end up winning with a positive probability. Tightly matched players might have a higher chance of observing this inversion, while players with large differences in skill might only rarely see such inversions. Stochastic transitivity models formalize such relations between the probabilities and the underlying transitive relation.