Jurimetrics is the application of quantitative methods, especially probability and statistics, to law.^{ [1] } In the United States, the journal Jurimetrics is published by the American Bar Association and Arizona State University.^{ [2] } The Journal of Empirical Legal Studies is another publication that emphasizes the statistical analysis of law.
The term was coined in 1949 by Lee Loevinger in his article "Jurimetrics: The Next Step Forward".^{ [1] }^{ [3] } Showing the influence of Oliver Wendell Holmes Jr., Loevinger quoted^{ [4] } Holmes' celebrated phrase:
“For the rational study of the law the blackletter man may be the man of the present, but the man of the future is the man of statistics and the master of economics.”^{ [5] }
The first work on this topic is attributed to Nicolaus I Bernoulli in his doctoral dissertation De Usu Artis Conjectandi in Jure, written in 1709.
In 2018, California's legislature passed Senate Bill 826, which requires all publicly held corporations based in the state to have a minimum number of women on their board of directors.^{ [34] }^{ [35] } Boards with five or fewer members must have at least two women, while boards with six or more members must have at least three women.
Using the binomial distribution, we may compute the probability of violating the rule laid out in Senate Bill 826 as a function of the number of board members. The probability mass function for the binomial distribution is:

$$\Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$$
where $\Pr(X = k)$ is the probability of getting $k$ successes in $n$ trials, and $\binom{n}{k}$ is the binomial coefficient. For this computation, $p$ is the probability that a person qualified for board service is female, $k$ is the number of female board members, and $n$ is the number of board seats. We will assume that $p = 0.5$. Depending on the number of board members, we are trying to compute the cumulative distribution function:

$$F(1; n, p) = \Pr(X \le 1) = \sum_{k=0}^{1} \binom{n}{k} p^k (1-p)^{n-k} \quad (n \le 5)$$

$$F(2; n, p) = \Pr(X \le 2) = \sum_{k=0}^{2} \binom{n}{k} p^k (1-p)^{n-k} \quad (n \ge 6)$$
With these formulas, we are able to compute the probability of violating Senate Bill 826 by chance:
| Number of board seats | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|
| Probability of violation | 0.50 | 0.31 | 0.19 | 0.34 | 0.23 | 0.14 | 0.09 | 0.05 | 0.03 | 0.02 |
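As a sketch of how these figures can be reproduced, the following Python snippet (function name is illustrative) sums the binomial probability mass function up to the violation threshold:

```python
from math import comb

def violation_prob(n_seats, p=0.5):
    """Probability that a randomly composed board violates SB 826.

    Boards with five or fewer seats need at least 2 women
    (violation when X <= 1); boards with six or more seats need
    at least 3 women (violation when X <= 2).
    """
    k_max = 1 if n_seats <= 5 else 2
    return sum(
        comb(n_seats, k) * p**k * (1 - p) ** (n_seats - k)
        for k in range(k_max + 1)
    )

for n in range(3, 13):
    print(n, round(violation_prob(n), 2))
```

Calling `violation_prob(n, p=0.4)` reproduces the second table below, for the scenario in which 40% of qualified candidates are female.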
As Ilya Somin points out,^{ [34] } a significant percentage of firms, without any history of sex discrimination, could be in violation of the law.
In more male-dominated industries, such as technology, the imbalance could be even greater. Suppose that, instead of parity, the probability that a person qualified for board service is female is 40%; this is likely a high estimate, given the predominance of men in the technology industry. The probability of violating Senate Bill 826 by chance may then be recomputed as:
| Number of board seats | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|
| Probability of violation | 0.65 | 0.48 | 0.34 | 0.54 | 0.42 | 0.32 | 0.23 | 0.17 | 0.12 | 0.08 |
Bayes' theorem states that, for events $A$ and $B$, the conditional probability of $A$ occurring, given that $B$ has occurred, is:

$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\Pr(A)}{\Pr(B)}$$

Using the law of total probability, we may expand the denominator as:

$$\Pr(B) = \Pr(B \mid A)\Pr(A) + \Pr(B \mid \neg A)\Pr(\neg A)$$

Then Bayes' theorem may be rewritten as:

$$\Pr(A \mid B) = \frac{\Pr(B \mid A)\Pr(A)}{\Pr(B \mid A)\Pr(A) + \Pr(B \mid \neg A)\Pr(\neg A)}$$

This may be simplified further by defining the prior odds $a$ of event $A$ occurring and the likelihood ratio $\lambda$ as:

$$a = \frac{\Pr(A)}{\Pr(\neg A)}, \qquad \lambda = \frac{\Pr(B \mid A)}{\Pr(B \mid \neg A)}$$

Then the compact form of Bayes' theorem is:

$$\Pr(A \mid B) = \frac{\lambda a}{1 + \lambda a}$$
Different values of the posterior probability, based on the prior odds and likelihood ratio, are computed in the following table:
| Prior odds $a$ | $\lambda=1$ | $\lambda=2$ | $\lambda=3$ | $\lambda=4$ | $\lambda=5$ | $\lambda=10$ | $\lambda=15$ | $\lambda=20$ | $\lambda=25$ | $\lambda=50$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.01 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.09 | 0.13 | 0.17 | 0.20 | 0.33 |
| 0.02 | 0.02 | 0.04 | 0.06 | 0.07 | 0.09 | 0.17 | 0.23 | 0.29 | 0.33 | 0.50 |
| 0.03 | 0.03 | 0.06 | 0.08 | 0.11 | 0.13 | 0.23 | 0.31 | 0.38 | 0.43 | 0.60 |
| 0.04 | 0.04 | 0.07 | 0.11 | 0.14 | 0.17 | 0.29 | 0.38 | 0.44 | 0.50 | 0.67 |
| 0.05 | 0.05 | 0.09 | 0.13 | 0.17 | 0.20 | 0.33 | 0.43 | 0.50 | 0.56 | 0.71 |
| 0.10 | 0.09 | 0.17 | 0.23 | 0.29 | 0.33 | 0.50 | 0.60 | 0.67 | 0.71 | 0.83 |
| 0.15 | 0.13 | 0.23 | 0.31 | 0.38 | 0.43 | 0.60 | 0.69 | 0.75 | 0.79 | 0.88 |
| 0.20 | 0.17 | 0.29 | 0.38 | 0.44 | 0.50 | 0.67 | 0.75 | 0.80 | 0.83 | 0.91 |
| 0.25 | 0.20 | 0.33 | 0.43 | 0.50 | 0.56 | 0.71 | 0.79 | 0.83 | 0.86 | 0.93 |
| 0.30 | 0.23 | 0.38 | 0.47 | 0.55 | 0.60 | 0.75 | 0.82 | 0.86 | 0.88 | 0.94 |
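The entries in this table can be reproduced with a one-line function. This is a minimal sketch, assuming the compact odds form of Bayes' theorem described above:

```python
def posterior(prior_odds, likelihood_ratio):
    # Compact form of Bayes' theorem:
    # Pr(A|B) = (lr * odds) / (1 + lr * odds)
    x = likelihood_ratio * prior_odds
    return x / (1 + x)

# Reproduce one cell of the table: prior odds 0.10, likelihood ratio 10
print(round(posterior(0.10, 10), 2))  # 0.5
```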
If we take $A$ to be some criminal behavior and $B$ a criminal complaint or accusation, Bayes' theorem allows us to determine the conditional probability of a crime being committed. More sophisticated analyses of evidence can be undertaken with the use of Bayesian networks.
In recent years, there has been a growing interest in the use of screening tests to identify drug users on welfare, potential mass shooters,^{ [36] } and terrorists.^{ [37] } The efficacy of screening tests can be analyzed using Bayes' theorem.
Suppose that there is some binary screening procedure for an action $A$ that identifies a person as testing positive ($+$) or negative ($-$) for the action. Bayes' theorem tells us that the conditional probability of taking action $A$, given a positive test result, is:

$$\Pr(A \mid +) = \frac{\Pr(+ \mid A)\Pr(A)}{\Pr(+ \mid A)\Pr(A) + \Pr(+ \mid \neg A)\Pr(\neg A)}$$

For any screening test, we must be cognizant of its sensitivity and specificity. The screening test has sensitivity $\Pr(+ \mid A)$ and specificity $\Pr(- \mid \neg A)$. The sensitivity and specificity can be analyzed using concepts from the standard theory of statistical hypothesis testing:

$$\text{sensitivity} = \Pr(+ \mid A) = 1 - \beta, \qquad \text{specificity} = \Pr(- \mid \neg A) = 1 - \alpha$$

where $\alpha$ is the Type I error (false positive) rate and $\beta$ is the Type II error (false negative) rate. Therefore, the form of Bayes' theorem that is pertinent to us is:

$$\Pr(A \mid +) = \frac{(1 - \beta)\Pr(A)}{(1 - \beta)\Pr(A) + \alpha \Pr(\neg A)}$$
Suppose that we have developed a test with sensitivity and specificity of 99%, which is likely to be higher than most real-world tests. We can examine how well this hypothetical test works in two scenarios: screening welfare recipients for drug use, with a base rate $\Pr(A)$ of 1.5%, and screening for prospective mass shooters, with a base rate of 0.01%.
With these base rates and the hypothetical values of sensitivity and specificity, we may calculate the posterior probability that a positive result indicates the individual will actually engage in each of the actions:
| | Drug use | Mass shooting |
|---|---|---|
| $\Pr(A \mid +)$ | 0.6012 | 0.0098 |
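These posterior probabilities follow mechanically from Bayes' theorem. The sketch below assumes 99% sensitivity and specificity and the base rates of 1.5% (drug use) and 0.01% (mass shooting) that produce the table's values:

```python
def screening_posterior(base_rate, sensitivity=0.99, specificity=0.99):
    # Pr(A | +): true positives divided by all positives (Bayes' theorem)
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

print(round(screening_posterior(0.015), 4))   # drug use scenario
print(round(screening_posterior(0.0001), 4))  # mass shooting scenario
```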
Even with very high sensitivity and specificity, the screening tests return posterior probabilities of only 60.1% and 0.98%, respectively, for each action. Under more realistic conditions, screening would likely prove even less useful. The problem with any screening procedure for rare events is that it is almost certain to be too imprecise, flagging far too many people as being at risk of engaging in the undesirable action.
The difference between jurimetrics and law and economics is that jurimetrics investigates legal questions from a probabilistic/statistical point of view, while law and economics addresses legal questions using standard microeconomic analysis. A synthesis of these fields is possible through the use of econometrics (statistics for economic analysis) and other quantitative methods to answer relevant legal matters. As an example, the Columbia University scholar Edgardo Buscaglia published several peer-reviewed articles using a joint jurimetrics and law and economics approach.^{ [39] }^{ [40] }
In probability theory and statistics, Bayes' theorem, named after Thomas Bayes, describes the probability of an event, based on prior knowledge of conditions that might be related to the event. For example, if the risk of developing health problems is known to increase with age, Bayes' theorem allows the risk to an individual of a known age to be assessed more accurately by conditioning it relative to their age, rather than simply assuming that the individual is typical of the population as a whole.
Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Fundamentally, Bayesian inference uses prior knowledge, in the form of a prior distribution, to estimate posterior probabilities. Bayesian inference is an important technique in statistics, and especially in mathematical statistics. Bayesian updating is particularly important in the dynamic analysis of a sequence of data. Bayesian inference has found application in a wide range of activities, including science, engineering, philosophy, medicine, sport, and law. In the philosophy of decision theory, Bayesian inference is closely related to subjective probability, often called "Bayesian probability".
In probability theory and statistics, the multivariate normal distribution, multivariate Gaussian distribution, or joint normal distribution is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions. One definition is that a random vector is said to be k-variate normally distributed if every linear combination of its k components has a univariate normal distribution. Its importance derives mainly from the multivariate central limit theorem. The multivariate normal distribution is often used to describe, at least approximately, any set of (possibly) correlated real-valued random variables each of which clusters around a mean value.
In probability and statistics, Student's t distribution is a continuous probability distribution that generalizes the standard normal distribution. Like the latter, it is symmetric around zero and bell-shaped.
In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statistical model, the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. The logic of maximum likelihood is both intuitive and flexible, and as such the method has become a dominant means of statistical inference.
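As a toy illustration of the definition (names and data are hypothetical), the following Python sketch recovers the MLE of a Bernoulli success probability by maximizing the log-likelihood over a grid; for this model the analytic maximizer is the sample mean:

```python
import math

def log_likelihood(p, data):
    # Log-likelihood of i.i.d. Bernoulli(p) observations
    k, n = sum(data), len(data)
    return k * math.log(p) + (n - k) * math.log(1 - p)

data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # 7 successes in 10 trials
grid = [i / 1000 for i in range(1, 1000)]  # avoid log(0) at p = 0, 1
p_hat = max(grid, key=lambda p: log_likelihood(p, data))
print(p_hat)  # coincides with the sample mean, 0.7
```

The grid search is only for demonstration; in practice the maximum is found analytically or with a numerical optimizer.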
In probability theory and statistics, a Gaussian process is a stochastic process (a collection of random variables indexed by time or space) such that every finite collection of those random variables has a multivariate normal distribution. The distribution of a Gaussian process is the joint distribution of all those random variables, and as such, it is a distribution over functions with a continuous domain, e.g. time or space.
Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a degree of belief in an event. The degree of belief may be based on prior knowledge about the event, such as the results of previous experiments, or on personal beliefs about the event. This differs from a number of other interpretations of probability, such as the frequentist interpretation that views probability as the limit of the relative frequency of an event after many trials. More concretely, analysis in Bayesian methods codifies prior knowledge in the form of a prior distribution.
In probability theory and statistics, the Laplace distribution is a continuous probability distribution named after Pierre-Simon Laplace. It is also sometimes called the double exponential distribution, because it can be thought of as two exponential distributions spliced together along the abscissa, although the term is also sometimes used to refer to the Gumbel distribution. The difference between two independent identically distributed exponential random variables is governed by a Laplace distribution, as is a Brownian motion evaluated at an exponentially distributed random time. Increments of Laplace motion or a variance gamma process evaluated over the time scale also have a Laplace distribution.
The positive and negative predictive values are the proportions of positive and negative results in statistics and diagnostic tests that are true positive and true negative results, respectively. The PPV and NPV describe the performance of a diagnostic test or other statistical measure. A high result can be interpreted as indicating the accuracy of such a statistic. The PPV and NPV are not intrinsic to the test; they depend also on the prevalence. Both PPV and NPV can be derived using Bayes' theorem.
The Granger causality test is a statistical hypothesis test for determining whether one time series is useful in forecasting another, first proposed in 1969. Ordinarily, regressions reflect "mere" correlations, but Clive Granger argued that causality in economics could be tested for by measuring the ability to predict the future values of a time series using prior values of another time series. Since the question of "true causality" is deeply philosophical, and because of the post hoc ergo propter hoc fallacy of assuming that one thing preceding another can be used as a proof of causation, econometricians assert that the Granger test finds only "predictive causality". Using the term "causality" alone is a misnomer, as Granger-causality is better described as "precedence", or, as Granger himself later claimed in 1977, "temporally related". Rather than testing whether X causes Y, the Granger causality test determines whether X forecasts Y.
Stochastic dominance is a partial order between random variables. It is a form of stochastic ordering. The concept arises in decision theory and decision analysis in situations where one gamble can be ranked as superior to another gamble for a broad class of decision-makers. It is based on shared preferences regarding sets of possible outcomes and their associated probabilities. Only limited knowledge of preferences is required for determining dominance. Risk aversion is a factor only in second order stochastic dominance.
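A minimal sketch (illustrative names, finite discrete gambles) of checking first-order stochastic dominance by comparing cumulative distribution functions pointwise:

```python
def first_order_dominates(pmf_x, pmf_y, support):
    # X first-order stochastically dominates Y iff
    # F_X(t) <= F_Y(t) for every t in the support
    cdf_x = cdf_y = 0.0
    for t in sorted(support):
        cdf_x += pmf_x.get(t, 0.0)
        cdf_y += pmf_y.get(t, 0.0)
        if cdf_x > cdf_y + 1e-12:  # tolerance for float rounding
            return False
    return True

third = 1 / 3
gamble_y = {1: third, 2: third, 3: third}  # uniform on {1, 2, 3}
gamble_x = {2: third, 3: third, 4: third}  # uniform on {2, 3, 4}
print(first_order_dominates(gamble_x, gamble_y, {1, 2, 3, 4}))  # True
```

Here every decision-maker who prefers more to less would rank the shifted gamble above the original, which is exactly what first-order dominance captures.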
In probability theory, Dirichlet processes are a family of stochastic processes whose realizations are probability distributions. In other words, a Dirichlet process is a probability distribution whose range is itself a set of probability distributions. It is often used in Bayesian inference to describe the prior knowledge about the distribution of random variables—how likely it is that the random variables are distributed according to one or another particular distribution.
In economics and game theory, an all-pay auction is an auction in which every bidder must pay regardless of whether they win the prize, which is awarded to the highest bidder as in a conventional auction. As shown by Riley and Samuelson (1981), equilibrium bidding in an all-pay auction with private information is revenue equivalent to bidding in a sealed high bid or open ascending price auction.
The Shapley–Folkman lemma is a result in convex geometry that describes the Minkowski addition of sets in a vector space. It is named after mathematicians Lloyd Shapley and Jon Folkman, but was first published by the economist Ross M. Starr.
Probability bounds analysis (PBA) is a collection of methods of uncertainty propagation for making qualitative and quantitative calculations in the face of uncertainties of various kinds. It is used to project partial information about random variables and other quantities through mathematical expressions. For instance, it computes sure bounds on the distribution of a sum, product, or more complex function, given only sure bounds on the distributions of the inputs. Such bounds are called probability boxes, and constrain cumulative probability distributions.
Bayesian hierarchical modelling is a statistical model written in multiple levels that estimates the parameters of the posterior distribution using the Bayesian method. The sub-models combine to form the hierarchical model, and Bayes' theorem is used to integrate them with the observed data and account for all the uncertainty that is present. The result of this integration is the posterior distribution, also known as the updated probability estimate, as additional evidence on the prior distribution is acquired.
In mathematics — specifically, in the fields of probability theory and inverse problems — Besov measures and associated Besov-distributed random variables are generalisations of the notions of Gaussian measures and random variables, Laplace distributions, and other classical distributions. They are particularly useful in the study of inverse problems on function spaces for which a Gaussian Bayesian prior is an inappropriate model. The construction of a Besov measure is similar to the construction of a Besov space, hence the nomenclature.
In mathematics and theoretical computer science, analysis of Boolean functions is the study of real-valued functions on $\{0,1\}^n$ or $\{-1,1\}^n$ from a spectral perspective. The functions studied are often, but not always, Boolean-valued, making them Boolean functions. The area has found many applications in combinatorics, social choice theory, random graphs, and theoretical computer science, especially in hardness of approximation, property testing, and PAC learning.
Stochastic transitivity models are stochastic versions of the transitivity property of binary relations studied in mathematics. Several models of stochastic transitivity exist and have been used to describe the probabilities involved in experiments of paired comparisons, specifically in scenarios where transitivity is expected but empirical observations of the binary relation are probabilistic. For example, players' skills in a sport might be expected to be transitive, i.e. "if player A is better than B and B is better than C, then player A must be better than C"; however, in any given match, a weaker player might still win with positive probability. Tightly matched players might have a higher chance of exhibiting such an inversion, while players with large differences in skill might see these inversions only seldom. Stochastic transitivity models formalize such relations between the probabilities and the underlying transitive relation.